Currently, if two consecutive paragraphs are edited, their contents are not properly diffed:
The current algorithm performs a pairwise diff whenever a deletion follows an addition or vice versa. This happens here:
However, in the example above we have two additions followed by two deletions. This leads to:
- an added paragraph:
this is one great paragraph
- a pairwise comparison:
this is one paragraph vs.
this is another
- a deleted paragraph:
here is yet another
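
To make the outcome above concrete, here is a minimal sketch of the pairing rule as described in the text. It is not the actual ONPDiff code: the `pair_adjacent` function, the `(op, text)` tuple representation, and the assignment of the example paragraphs to additions and deletions are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (op, text), where op is "add" or "delete".
Edit = Tuple[str, str]

CHANGE_OPS = ("add", "delete")


def pair_adjacent(edits: List[Edit]) -> List[Tuple[Edit, ...]]:
    """Pair an edit with its successor only when one is an addition
    and the other a deletion; otherwise emit it on its own."""
    groups: List[Tuple[Edit, ...]] = []
    i = 0
    while i < len(edits):
        current = edits[i]
        following = edits[i + 1] if i + 1 < len(edits) else None
        if (
            following is not None
            and current[0] in CHANGE_OPS
            and following[0] in CHANGE_OPS
            and current[0] != following[0]
        ):
            groups.append((current, following))  # pairwise diff candidate
            i += 2
        else:
            groups.append((current,))  # plain addition or deletion
            i += 1
    return groups


# Assumed mapping of the example paragraphs onto Add1 Add2 Delete1 Delete2.
edits = [
    ("add", "this is one great paragraph"),
    ("add", "this is another"),
    ("delete", "this is one paragraph"),
    ("delete", "here is yet another"),
]
groups = pair_adjacent(edits)
```

With two additions followed by two deletions, only the middle two edits are adjacent with differing operations, so the first addition and the last deletion are left unpaired.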
The ideal approach would be to perform an alignment step before diffing, but that is not trivial to implement (see the Gale-Church algorithm). A good compromise is to use one of the alternative edit sequences in place of the one returned by ONPDiff.
In other words, ONPDiff.paragraph_diff transforms the edit sequence
Add1 Add2 Delete1 Delete2 into
Add1 Delete1 Add2 Delete2.
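
The reordering can be sketched roughly as follows. This is an illustration under assumptions, not the body of ONPDiff.paragraph_diff: the `interleave_runs` name, the `(op, text)` tuples, and the "equal" op for unchanged paragraphs are hypothetical. Within each uninterrupted run of additions and deletions, the sketch alternates them while preserving the relative order of the additions and of the deletions.

```python
from itertools import groupby, zip_longest
from typing import List, Tuple

# Hypothetical representation: (op, text), op in {"add", "delete", "equal"}.
Edit = Tuple[str, str]


def interleave_runs(edits: List[Edit]) -> List[Edit]:
    """Rewrite each run of consecutive add/delete edits so that
    additions and deletions alternate: Add1 Delete1 Add2 Delete2 ..."""
    result: List[Edit] = []
    is_change = lambda e: e[0] in ("add", "delete")
    for changed, run_iter in groupby(edits, key=is_change):
        run = list(run_iter)
        if not changed:
            result.extend(run)  # unchanged paragraphs pass through as-is
            continue
        adds = [e for e in run if e[0] == "add"]
        deletes = [e for e in run if e[0] == "delete"]
        # zip_longest pads the shorter list with None, so surplus
        # additions or deletions simply trail at the end of the run.
        for add, delete in zip_longest(adds, deletes):
            if add is not None:
                result.append(add)
            if delete is not None:
                result.append(delete)
    return result


reordered = interleave_runs([
    ("add", "Add1"),
    ("add", "Add2"),
    ("delete", "Delete1"),
    ("delete", "Delete2"),
])
```

Because every addition is now directly followed by a deletion, a pairwise diff that triggers on adjacent add/delete edits compares Add1 with Delete1 and Add2 with Delete2.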
Current behavior using
Proposed behavior using