As part of a text-to-html project I'm on, I need to display changes
in the text from one version to the next in bold. Happily, I'm
storing the text in CVS, so getting a diff is quite simple.
My problem is that the client is very specific: the bolded sections need to be exactly the word, phrase, sentence, or paragraph that is different, and nothing more.
Omissions are unmarked (don't ask why).
The CVS diff identifies changed lines between drafts, but I need to pull the changed words out of them. (Note that 'lines' in this case
are actually paragraphs.)
My idea so far was to use Algorithm::Diff, which does element-by-element comparisons of two lists.
I can split the lines into lists of words, and run that through it. My trouble now is figuring out how to translate that into bolding.
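To make the splitting step concrete, here is a minimal sketch of what I mean, assuming whitespace is a good enough word boundary (so punctuation stays glued to its word). The helper name word_hunks and the sample paragraphs are made up for illustration; diff() is the function exported by Algorithm::Diff.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(diff);

# Hypothetical helper: split two paragraphs into words and diff the lists.
# Returns a list of hunks; each hunk is a list of [ sign, position, word ].
sub word_hunks {
    my ($old, $new) = @_;
    my @old_words = split ' ', $old;   # split on runs of whitespace
    my @new_words = split ' ', $new;
    my $hunks = diff(\@old_words, \@new_words);
    return @$hunks;
}

# Made-up sample paragraphs standing in for two CVS revisions.
for my $hunk (word_hunks('The quick brown fox jumps.',
                         'The quick red fox leaps away.')) {
    for my $change (@$hunk) {
        my ($sign, $pos, $word) = @$change;
        print "$sign $pos $word\n";
    }
}
```

Since the split is on whitespace only, 'fox' and 'fox,' count as different words, which is part of the punctuation worry.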
This is not aided by the fact that at some point I have to run the line through HTML::Entities::encode_entities(), which will move stuff around, and break any bolding put in by
Algorithm::Diff will give me output like:
[ [ '-', 0, 'a' ] ],
[ [ '+', 2, 'd' ] ],
[ [ '-', 4, 'h' ],
  [ '+', 4, 'f' ] ],
[ [ '+', 6, 'k' ] ],
[ [ '-', 8, 'n' ],
  [ '-', 9, 'p' ],
  [ '+', 9, 'r' ],
  [ '+', 10, 's' ],
  [ '+', 11, 't' ] ]
Except that in my case, the letters will be words. Can anyone think of a relatively
elegant way to mark changed sections in <B></B> tags, while still working with
encode_entities and not getting confused by punctuation?
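One direction that might work, offered only as a rough sketch: encode each word on its own before any tags go in, so encode_entities never sees the <B> markup, and wrap consecutive changed words in a single pair of tags. This uses sdiff() from Algorithm::Diff rather than diff(); the helper name bold_changes and the sample text are hypothetical, and it assumes that collapsing whitespace to single spaces is acceptable.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(sdiff);
use HTML::Entities qw(encode_entities);

# Hypothetical helper: return the new paragraph as HTML, with words that
# were added or changed relative to the old paragraph wrapped in <B></B>.
sub bold_changes {
    my ($old, $new) = @_;
    my @old_words = split ' ', $old;
    my @new_words = split ' ', $new;

    my @out;   # finished pieces of HTML
    my @run;   # consecutive changed words waiting to be wrapped

    my $flush = sub {
        push @out, '<B>' . join(' ', @run) . '</B>' if @run;
        @run = ();
    };

    # sdiff() yields [ flag, old_word, new_word ] rows, where flag is
    # 'u' (unchanged), 'c' (changed), '+' (added) or '-' (removed).
    for my $row (sdiff(\@old_words, \@new_words)) {
        my ($flag, undef, $word) = @$row;
        next if $flag eq '-';                # omissions stay unmarked
        my $html = encode_entities($word);   # encode before adding tags
        if ($flag eq 'u') {
            $flush->();
            push @out, $html;
        }
        else {
            push @run, $html;
        }
    }
    $flush->();
    return join ' ', @out;
}

print bold_changes('Fish & chips cost 5 pounds.',
                   'Fish & chips cost < 6 pounds.'), "\n";
# prints: Fish &amp; chips cost <B>&lt; 6</B> pounds.
```

This does nothing clever about punctuation: a word that only gains or loses a trailing comma is still bolded whole, so a finer tokenizer would be needed if that matters.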