What effect does the “-d” option have with diff?

The diff implementation on OpenBSD has a non-standard -d option with the following documentation:

    -d      Try very hard to produce a diff as small as possible. This may
            consume a lot of processing power and memory when processing
            large files with many changes.

The GNU diff implementation has the same option with the shorter documentation:

    -d, --minimal
            try hard to find a smaller set of changes

From time to time I've used this option just to see if it generates output that is in any shape or form different from the same diff command without the option, but I've never seen any difference (no pun intended).

Could someone provide or point to an example where this option actually produces a different result from the same command without -d? Alternatively, could someone explain the circumstances required for this option to kick in? I'm also uncertain whether "minimal" means "fewer lines of output" or "fewer hunks".

An uneducated guess is that it has to do with very large hunks.

  • unix.stackexchange.com/questions/472528 piqued your curiosity did it? (-:
    – JdeBP
    Oct 1 at 9:54

  • @JdeBP Yes indeed. It reminded me about this flag and the fact that I simply don't know what it does since I've never seen it do anything.
    – Kusalananda
    Oct 1 at 9:56

  • info diff performance explains it IIRC
    – Stéphane Chazelas
    Oct 1 at 10:28

  • Clearly related. Sadly no example of myers --> minimal results.
    – Isaac
    Oct 2 at 6:36

  • I would really like to get an example that would create different output with gdiff -d in order to check whether the additions to OpenBSD are useful. From my tests, I could not get any differences but it is obvious that the OpenBSD code slows down the performance which looks like a significant impact, since the diff Algorithm from Douglas McIlroy is faster than gdiff as long as you use normal file sizes.
    – schily
    Oct 2 at 13:21

Tagged: diff

asked Oct 1 at 9:23 by Kusalananda (edited Oct 2 at 5:05)

1 Answer
accepted










In GNU diff, also used on FreeBSD, the algorithm employed is credited to Eugene W. Myers. By default it runs with a heuristic by Paul Eggert "to limit the cost to O(N**1.5 log N) at the price of producing suboptimal output for large inputs with many differences". The --minimal flag turns that heuristic off. More specifically, it stops diff from applying several heuristics that settle for merely close-to-optimal solutions and that throw out "confusing" lines as extra differences.
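
On the question of what "minimal" counts: in the Myers paper listed under Further reading, the quantity being minimised is the length of the edit script, that is, deleted plus inserted lines, rather than the number of hunks. The following Python sketch computes that minimum for two small inputs through a plain longest-common-subsequence table. It is only an illustration: the function names are made up here, the quadratic table is not how either diff works internally, and it is meant for toy inputs.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences of lines."""
    # prev[j] holds the LCS length of the items of a seen so far and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def minimal_edit_size(a, b):
    """Fewest deleted plus inserted lines needed to turn a into b."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


if __name__ == "__main__":
    # The two strings used as the running example in the Myers paper,
    # treated as one "line" per character.
    print(minimal_edit_size(list("ABCABBA"), list("CBABAC")))  # 5
```

Counting the `<` and `>` lines in the output of diff with and without -d, and comparing both counts to this number, is one way to see whether either run fell short of the minimum on a small input.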



In OpenBSD diff, which uses the older Unix diff algorithm from the 1970s, the algorithm employed is credited to Harold Stone. There the -d flag lets the search run (effectively) unbounded: the limit becomes the maximum value of an unsigned integer instead of the square root of the size of the range of lines being compared, or 256 if that is greater.
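
The OpenBSD behaviour described above can be sketched as a single selection of the search cap. This is a Python paraphrase under assumptions, not the code from diffreg.c: the names `search_bound` and `dflag` are illustrative, and `UINT_MAX` is assumed to be that of a 32-bit unsigned int.

```python
import math

# Assumption: unsigned int is 32 bits wide, as on most current platforms.
UINT_MAX = 2**32 - 1


def search_bound(n, dflag):
    """Cap on the search for a range of n lines (illustrative sketch)."""
    if dflag:
        # With -d the cap is so large that it is effectively no cap at all.
        return UINT_MAX
    # Without -d: the integer square root of the range size, but never below 256.
    return max(256, math.isqrt(n))


if __name__ == "__main__":
    for n in (100, 66_048, 66_049, 1_000_000):
        print(n, search_bound(n, False), search_bound(n, True))
```

If this reading is right, the default cap stays at 256 for any range shorter than 66,049 lines (257 squared), so -d can only matter when a search would have run past that cap, which may be why small test files show no difference.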



Further reading



  • Eugene W. Myers (November 1986). "An O(ND) difference algorithm and its variations". Algorithmica. Volume 1. Issue 1–4. pp. 251–266. DOI 10.1007/BF01840446.

  • J. W. Hunt and M. D. McIlroy (June 1976). "An Algorithm for Differential File Comparison". Report 41. Computing Science. Bell Laboratories.

  • Richard Hartman (1988-01-13). Unix diff(1) algorithm.
    23225@cca.CCA.COM. comp.unix.questions.

  • https://github.com/openbsd/src/blob/d1e24f318523607c98dc6fbe5a06a5d9e5c87293/usr.bin/diff/diffreg.c#L93

  • https://github.com/freebsd/freebsd/blob/40ec4fdc9a74bfdb83f13672acdb88af5c91ab46/contrib/diff/src/analyze.c#L23

  • Comprehensive review of diff algorithms, their history and implementations





  • When I created a better diff from the UNIX sources, I checked that OpenBSD enhancement and could not find any better results. Note that the original stone() function uses: ` } while ((y = b[++j]) > 0);` and BTW: for normal file sizes, my enhanced UNIX diff is faster than GNU diff.
    – schily
    Oct 1 at 10:50

  • Thanks for the references!
    – Kusalananda
    Oct 1 at 14:55

answered Oct 1 at 10:41 by JdeBP (edited Oct 2 at 7:38)