Re: ignore duplicates and show unique values between 2 text files

in reply to ignore duplicates and show unique values between 2 text files

Your issue appears to be that "'121'\n" and "'121'" are different strings. If you'd like to be newline insensitive (which would also address the extra newlines in your output), use chomp:

use strict;
use warnings;
my $f2 = 'cat_mapping_in_A.txt';
my $f1 = 'cat_mapping_in_B.txt';
my $outfile = '1.txt';
my %results = ();
open FILE1, "$f1" or die "Could not open file: $! \n";
while(my $line = <FILE1>){
   chomp $line;
   $results{$line}=1;
}
close(FILE1);
open FILE2, "$f2" or die "Could not open file: $! \n";
while(my $line =<FILE2>) {
   chomp $line;
   $results{$line}++;
}
close(FILE2);
open (OUTFILE, ">$outfile") or die "Cannot open $outfile for writing \n";
foreach my $line (keys %results) {
   print OUTFILE "$line\n" if $results{$line} == 1;
}
close OUTFILE;
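
To make the effect of chomp easier to see in isolation, here is a minimal sketch that works on in-memory lists instead of files. The helper names `chomped` and `unique_between` and the sample values are made up for illustration and are not part of the script above. It also tracks presence per file rather than a running count, which is a deliberate variation: a line repeated inside only one file is still reported as unique to that file.

```perl
use strict;
use warnings;

# Hypothetical helper: return newline-stripped copies of the given lines.
sub chomped {
    my @copy = @_;
    chomp @copy;
    return @copy;
}

# Hypothetical helper: return the lines found in exactly one of the two lists.
sub unique_between {
    my ($lines_a, $lines_b) = @_;
    my %in_a = map { $_ => 1 } chomped(@$lines_a);
    my %in_b = map { $_ => 1 } chomped(@$lines_b);
    my @only_a = grep { !$in_b{$_} } keys %in_a;
    my @only_b = grep { !$in_a{$_} } keys %in_b;
    return sort (@only_a, @only_b);
}

# The last line of a file often has no trailing newline, as in @file_b here.
my @file_a = ("'120'\n", "'121'\n", "'122'\n");
my @file_b = ("'121'\n", "'122'");

print "$_\n" for unique_between(\@file_a, \@file_b);   # prints '120'
```

Without the chomp step, `"'122'\n"` and `"'122'"` would be treated as different keys and both would show up in the output.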

#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.


Re^2: ignore duplicates and show unique values between 2 text files by perlnoobster (Sexton) on Apr 29, 2013 at 15:42 UTC
Hi kennethk, I am unsure how to "reply to all", but can the script be modified to take account of two columns, i.e.:

FILE 1
   261293 'snow > equipment'
   261293 'snow > equipment > boots'
   261293 'snow > equipment > facemasks'
   261293 'snow > equipment > goggles'
   261293 'snow > equipment > helmets'
   261293 'surf > accessories > books'

FILE 2
   261293 'snow > equipment'
   261293 'snow > equipment > boots'
   261293 'snow > equipment > facemasks'
   261293 'snow > equipment > goggles'
   261293 'surf > accessories > books'

OUTPUT
   261293 'snow > equipment > helmets'

The two columns are separated by a tab. Is this possible? Thank you
Re^3: ignore duplicates and show unique values between 2 text files by kennethk (Abbot) on Apr 29, 2013 at 15:59 UTC
This is Perl; just about everything is "possible". However, I fail to see why the two-column example is functionally different from a full-line comparison. `"261293\t'snow > equipment > goggles'"` will equal `"261293\t'snow > equipment > goggles'"` just as much as the two substrings would. Are you dealing with a case where the numbers change and you need to be insensitive to that?

Breaking the two columns apart can easily be achieved with code like `my @terms = split /\t/, $line;`. See split.
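
If the numbers do differ between the files, one option is to key the comparison on the second column only. The following is a rough sketch under that assumption; the helper name `index_by_category` and the `999999` value are invented for illustration and do not come from the posted data.

```perl
use strict;
use warnings;

# Hypothetical helper: map the category text (second column) to its full line.
sub index_by_category {
    my %line_for;
    for my $line (@_) {
        my $copy = $line;
        chomp $copy;
        my ($number, $category) = split /\t/, $copy, 2;
        next unless defined $category;    # skip lines that contain no tab
        $line_for{$category} = $copy;
    }
    return \%line_for;
}

my $file1 = index_by_category(
    "261293\t'snow > equipment > goggles'\n",
    "261293\t'snow > equipment > helmets'\n",
);
my $file2 = index_by_category(
    "999999\t'snow > equipment > goggles'\n",    # made-up number, same category
);

# Report lines of file 1 whose category never appears in file 2.
for my $category (sort keys %$file1) {
    print "$file1->{$category}\n" unless exists $file2->{$category};
}
```

Here only the helmets line is printed, because the goggles category is present in both lists even though the numbers differ. If the numbers never change, comparing whole lines as in the original script is simpler.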
Re^4: ignore duplicates and show unique values between 2 text files by Laurent_R (Canon) on Apr 29, 2013 at 17:35 UTC
> However, I fail to see why the two column example is functionally different than a full line comparison. `"261293\t'snow > equipment > goggles'"` will equal `"261293\t'snow > equipment > goggles'"` just as much as the two substrings would. Are you dealing with a case where the numbers change and you need to be insensitive to that?

Yes, I totally agree, the same code should work just as well.
Re^3: ignore duplicates and show unique values between 2 text files by LanX (Saint) on Apr 29, 2013 at 15:58 UTC
> is this possible?

Yes, but we won't post the whole code! Apply

   my ($number,$article) = split /\s+/, $line, 2

for each input line and decide which part should be unique. Learn to do it yourself with split.

Cheers Rolf
( addicted to the Perl Programming Language)

UPDATE: added missing third parameter for split
Re^4: ignore duplicates and show unique values between 2 text files by kennethk (Abbot) on Apr 29, 2013 at 16:07 UTC
I think your posted code will not follow the posted spec. The posted lines contain additional whitespace, so `my ($number,$article) = split /\s+/, $line` will yield

   $number = 261293
   $article = 'snow

as opposed to `split /\t/, $line`, which would yield

   $number = 261293
   $article = 'snow > equipment > helmets'

Update: Parent code updated
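
The difference between the three split variants discussed in this subthread can be checked with a short sketch. The variable names are chosen for illustration only; the sample line is the helmets row from the question with a tab between the columns.

```perl
use strict;
use warnings;

my $line = "261293\t'snow > equipment > helmets'";

# Whitespace split without a limit: the category is cut at its first space.
my ($num_ws, $article_ws) = split /\s+/, $line;

# Whitespace split with a limit of 2: everything after the first run of
# whitespace stays together in the second field.
my ($num_lim, $article_lim) = split /\s+/, $line, 2;

# Tab split: the spaces inside the category are left untouched.
my ($num_tab, $article_tab) = split /\t/, $line;

print "no limit: [$article_ws]\n";     # ['snow]
print "limit 2:  [$article_lim]\n";    # ['snow > equipment > helmets']
print "tab:      [$article_tab]\n";    # ['snow > equipment > helmets']
```

For this input the limited whitespace split and the tab split give the same result; they would diverge if the first column itself could contain spaces.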
Re^2: ignore duplicates and show unique values between 2 text files by perlnoobster (Sexton) on Apr 29, 2013 at 15:18 UTC
Thank you kennethk, it works perfectly.

In Section Seekers of Perl Wisdom
