Setting maximum hunk size in Algorithm::Diff

perlancar has asked for the wisdom of the Perl Monks concerning the following question:

Is there a way in the current Algorithm::Diff to limit the hunk size? If there isn't, would it be a good idea to add this ability to the module? One example where I want this: I have ~20 lines of text in file1, and the same 20 lines in file2 but each with some characters modified. Normally, with the 'diff -u' command and Algorithm::Diff, I just get two hunks: 20 lines removed from file1, followed by 20 lines added from file2. What I want is a unified diff view but on a line-by-line basis, plus word-/character-based color highlighting so I can see, for each line, which characters were modified.

Replies are listed 'Best First'.
Re: Setting maximum hunk size in Algorithm::Diff by tybalt89 (Monsignor) on Jan 13, 2018 at 12:14 UTC

Just diff by character. Something like this?

#!/usr/bin/perl

# http://perlmonks.org/?node_id=1207184

use strict;
use warnings;
use Algorithm::Diff qw(traverse_sequences);
use Term::ANSIColor;

my $file1 = <<'END';
#!/usr/bin/perl

use Algorithm::Diff qw(traverse_sequences);
use Term::ANSIColor;
use strict;
use warnings;

# line in both files

# line only in file1

my @from = split //, shift // 'this is the left string';
my @to = split //, shift // 'this is the right string';

traverse_sequences( \@from, \@to,
  {
  MATCH     => sub {print $from[shift()]},
  DISCARD_A => sub {print color('red'), $from[shift()], color 'reset'},
  DISCARD_B => sub {print color('green'), $to[pop()], color 'reset'},
  } );
print "\n";
END

my $file2 = <<'END';
#!/usr/bin/perl

use Algorithm::Diff qw(traverse_sequences);
use Term::ANSIColor;
use warnings;
use strict;

# line in both files

my @from = split //, shift // 'this is the source string';
my @to = split //, shift // 'this is the target string';

# line only in file2

traverse_sequences( \@from, \@to,
  {
  MATCH     => sub {print $from[shift()]},
  DISCARD_A => sub {print color('darkred'), $from[shift()], color 'reset'},
  DISCARD_B => sub {print color('cyan'), $to[pop()], color 'reset'},
  } );
print "\n";
END

my @from = split //, $file1;
my @to = split //, $file2;

traverse_sequences( \@from, \@to,
  {
  MATCH     => sub {print $from[shift()]},
  DISCARD_A => sub {print color('red'), $from[shift()], color 'reset'},
  DISCARD_B => sub {print color('green'), $to[pop()], color 'reset'},
  } );
print "\n";

Output not shown because it contains ANSI color sequences. Run in a terminal (like linux console or xterm) that can handle them.
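
For the line-by-line view the question asks about, one possible variation is to pair the lines first and only then diff each changed pair by character. This is a rough sketch, not a feature of the module for limiting hunk size: it assumes Algorithm::Diff's sdiff is acceptable for pairing lines and that changed lines correspond one-to-one, as in the 20-line example. The names highlight_pair and line_by_line_diff are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Algorithm::Diff qw(sdiff traverse_sequences);
use Term::ANSIColor qw(colored);

# Character-level diff of one pair of lines.
# Returns the old line with removed characters in red and
# the new line with added characters in green.
sub highlight_pair {
    my ($old, $new) = @_;
    my @from = split //, $old;
    my @to   = split //, $new;
    my ($old_out, $new_out) = ('', '');
    traverse_sequences( \@from, \@to, {
        # Callbacks receive (index into @from, index into @to).
        MATCH     => sub { $old_out .= $from[$_[0]]; $new_out .= $to[$_[1]] },
        DISCARD_A => sub { $old_out .= colored($from[$_[0]], 'red') },
        DISCARD_B => sub { $new_out .= colored($to[$_[1]], 'green') },
    } );
    return ($old_out, $new_out);
}

# Line-level pass. sdiff returns rows of [flag, old_line, new_line],
# where flag is 'u' (unchanged), '-' (removed), '+' (added) or 'c' (changed).
sub line_by_line_diff {
    my ($old_lines, $new_lines) = @_;
    my @out;
    for my $row ( sdiff($old_lines, $new_lines) ) {
        my ($flag, $old, $new) = @$row;
        if    ($flag eq 'u') { push @out, " $old" }
        elsif ($flag eq '-') { push @out, "-$old" }
        elsif ($flag eq '+') { push @out, "+$new" }
        else {
            # Changed pair: emit old and new back to back, highlighted.
            my ($o, $n) = highlight_pair($old, $new);
            push @out, "-$o", "+$n";
        }
    }
    return @out;
}

my @old = ('use strict;', 'my $x = 1;', 'print $x;');
my @new = ('use strict;', 'my $y = 1;', 'print $y;');
print "$_\n" for line_by_line_diff(\@old, \@new);
```

Each changed line then shows up as its own -/+ pair instead of one block of removals followed by one block of additions. If lines are inserted or deleted inside a changed block, the pairing may not match the lines you would pair by eye.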
