http://www.perlmonks.org?node_id=1016585

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!
I am trying to compare two text files, each with a bunch of links, one per line. I would like to know what is in the first file that is not in the second file, and print the difference between the two files. I am using this code, but the results I'm getting don't make sense; maybe someone has a better way of doing this. Here is the code I am using:
a_file.txt sample:
http://www.ok.com/test_www.pdf
http://www.ok.com/test_1234.html
http://www.ok.com/test_part.pdf
http://www.ok.com/test_345.pdf
http://www.ok.com/test_error.html

b_file.txt sample:
http://www.ok.com/test_www.pdf
http://www.ok.com/test_part.pdf
http://www.ok.com/test_345.pdf

#!/usr/bin/perl
use strict;
use warnings;
use Text::Diff;

my $diffs = diff 'a_file.txt' => 'b_file.txt';
print $diffs;

Replies are listed 'Best First'.
Re: Compare two txt files and print the difference between the two files.
by Kenosis (Priest) on Feb 01, 2013 at 18:25 UTC

    If I'm understanding you correctly, you're interested in finding all items in the first file that are not in the second file. If this is the case, List::Compare can assist you with this:

    use strict;
    use warnings;
    use File::Slurp qw/read_file/;
    use List::Compare;

    chomp( my @a_file = read_file 'a_file.txt' );
    chomp( my @b_file = read_file 'b_file.txt' );

    my @a_file_only = List::Compare->new( \@a_file, \@b_file )->get_Lonly;

    print "$_\n" for @a_file_only;

    Output on your data sets:

    http://www.ok.com/test_1234.html
    http://www.ok.com/test_error.html

    Hope this helps!

Re: Compare two txt files and print the difference between the two files.
by trizen (Hermit) on Feb 01, 2013 at 17:06 UTC
    use strict;
    use warnings;

    my $first_file  = shift || 'a.txt';
    my $second_file = shift || 'b.txt';

    open my $a_fh, '<', $first_file  or die "$first_file: $!";
    open my $b_fh, '<', $second_file or die "$second_file: $!";

    my %second_file;
    @second_file{map { unpack 'A*', $_ } <$b_fh>} = ();

    while (<$a_fh>) {
        print unless exists $second_file{unpack 'A*', $_};
    }
      Great script! Thanks
Re: Compare two txt files and print the difference between the two files.
by Lotus1 (Vicar) on Feb 01, 2013 at 18:41 UTC

    The code you presented in the OP worked fine for me. I assume the trouble you had understanding the output was due to differences in whitespace at the ends of the lines in your sample text files. The whitespace made some of the lines appear to be different. Once I removed the extra whitespace, I got the following output, which makes sense to me:

    --- a_file.txt Fri Feb 1 13:30:39 2013
    +++ b_file.txt Fri Feb 1 13:30:55 2013
    @@ -1,6 +1,4 @@
    -a_file.txt sample:
    +b_file.txt sample:
     http://www.ok.com/test_www.pdf
    -http://www.ok.com/test_1234.html
     http://www.ok.com/test_part.pdf
     http://www.ok.com/test_345.pdf
    -http://www.ok.com/test_error.html
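
If trailing whitespace is the culprit, one way to sidestep it is to normalize each line before comparing. The following is only a minimal sketch of that idea, not code from the thread: `lines_only_in_first` is a hypothetical helper name, and the trailing spaces, tab, and carriage return in the sample data are invented here to mimic the problem.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the lines of @$first
# that do not appear in @$second, ignoring trailing whitespace.
sub lines_only_in_first {
    my ($first, $second) = @_;

    # Build a lookup of the second list with trailing whitespace removed.
    my %seen;
    for my $line (@$second) {
        (my $key = $line) =~ s/\s+\z//;
        $seen{$key} = 1;
    }

    # Keep every non-blank line of the first list that is not in the lookup.
    my @only;
    for my $line (@$first) {
        (my $key = $line) =~ s/\s+\z//;
        next if $key eq '';
        push @only, $key unless $seen{$key};
    }
    return @only;
}

# Links from the original post; the stray trailing whitespace is an
# assumption added for illustration.
my @a_lines = (
    "http://www.ok.com/test_www.pdf \n",
    "http://www.ok.com/test_1234.html\n",
    "http://www.ok.com/test_part.pdf\t\n",
    "http://www.ok.com/test_345.pdf\n",
    "http://www.ok.com/test_error.html  \n",
);
my @b_lines = (
    "http://www.ok.com/test_www.pdf\n",
    "http://www.ok.com/test_part.pdf\n",
    "http://www.ok.com/test_345.pdf \r\n",
);

print "$_\n" for lines_only_in_first(\@a_lines, \@b_lines);
```

Stripping with `s/\s+\z//` rather than `chomp` also removes spaces, tabs, and a Windows-style carriage return, so lines that differ only in such trailing characters compare as equal.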
Re: Compare two txt files and print the difference between the two files.
by 7stud (Deacon) on Feb 02, 2013 at 02:33 UTC
    use strict;
    use warnings;
    use 5.012;

    my $fname = 'a.txt';
    open my $afile, "<", $fname
        or die "Couldn't open $fname: $!";

    my %a_links;

    while (my $link = <$afile>) {
        chomp $link;
        $a_links{$link} = undef;
    }

    close $afile;

    $fname = 'b.txt';
    open my $bfile, "<", $fname
        or die "Couldn't open $fname: $!";

    while (my $link = <$bfile>) {
        chomp $link;
        next if exists $a_links{$link};
        say $link;
    }

    close $bfile;

    --output:--
    http://www.ok.com/test_www.pdf
    http://www.ok.com/test_345.pdf