PerlMonks  

Re: ignore duplicates and show unique values between 2 text files

by kennethk (Monsignor)
on Apr 29, 2013 at 15:05 UTC (#1031229)


in reply to ignore duplicates and show unique values between 2 text files

Your issue appears to be that "'121'\n" and "'121'" are different strings. If you'd like to be newline insensitive (which would also address the extra newlines in your output), use chomp:

use strict;
use warnings;

my $f2 = 'cat_mapping_in_A.txt';
my $f1 = 'cat_mapping_in_B.txt';
my $outfile = '1.txt';
my %results = ();

open FILE1, "$f1" or die "Could not open file: $! \n";
while (my $line = <FILE1>) {
    chomp $line;
    $results{$line} = 1;
}
close(FILE1);

open FILE2, "$f2" or die "Could not open file: $! \n";
while (my $line = <FILE2>) {
    chomp $line;
    $results{$line}++;
}
close(FILE2);

open (OUTFILE, ">$outfile") or die "Cannot open $outfile for writing \n";
foreach my $line (keys %results) {
    print OUTFILE "$line\n" if $results{$line} == 1;
}
close OUTFILE;
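To see the newline issue in isolation, here is a minimal sketch. The `normalize` helper is a made-up name for illustration only; the script above simply calls chomp inline.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the line without its trailing newline.
sub normalize {
    my ($line) = @_;
    chomp $line;    # removes a trailing newline if there is one, else does nothing
    return $line;
}

my $from_middle_of_file = "'121'\n";   # a line read with its newline still attached
my $from_end_of_file    = "'121'";     # e.g. a last line with no final newline

# Compared as-is, the two strings differ by one character.
print "raw:     ", ($from_middle_of_file eq $from_end_of_file ? "same" : "different"), "\n";

# After chomp they compare equal, so they land on the same hash key.
print "chomped: ",
    (normalize($from_middle_of_file) eq normalize($from_end_of_file) ? "same" : "different"),
    "\n";
```

Run as written, the first comparison reports "different" and the second reports "same".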

#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.


Re^2: ignore duplicates and show unique values between 2 text files
by perlnoobster (Sexton) on Apr 29, 2013 at 15:18 UTC
    Thank you kennethk, it works perfectly.
Re^2: ignore duplicates and show unique values between 2 text files
by perlnoobster (Sexton) on Apr 29, 2013 at 15:42 UTC
    Hi kennethk, I am unsure how to "reply to all", but can the script be modified to take account of two columns? For example:

    FILE 1

    261293    'snow > equipment'
    261293    'snow > equipment > boots'
    261293    'snow > equipment > facemasks'
    261293    'snow > equipment > goggles'
    261293    'snow > equipment > helmets'
    261293    'surf > accessories > books'

    FILE 2

    261293    'snow > equipment'
    261293    'snow > equipment > boots'
    261293    'snow > equipment > facemasks'
    261293    'snow > equipment > goggles'
    261293    'surf > accessories > books'

    OUTPUT

    261293    'snow > equipment > helmets'

    The two columns are separated by a tab. Is this possible?

    Thank you
      > is this possible?

      Yes, but we won't post the whole code!

      Apply

      my ($number,$article) = split /\s+/, $line, 2

      for each input line and decide which part should be unique.

      Learn to do it yourself with split.

      Cheers Rolf

      ( addicted to the Perl Programming Language)

      UPDATE

      added missing third parameter for split

        I think your posted code will not follow the posted spec. The posted lines contain additional whitespace, so my ($number,$article) = split /\s+/, $line will yield
        $number = 261293
        $article = 'snow
        as opposed to split /\t/, $line, which would yield
        $number = 261293
        $article = 'snow > equipment > helmets'
        Update: Parent code updated
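The three split variants discussed in this sub-thread can be compared side by side. This is only a sketch: the sample line is taken from the posted data, and the variable names are chosen for illustration.

```perl
use strict;
use warnings;

# One record from the posted data: number, a tab, then the quoted category.
my $line = "261293\t'snow > equipment > helmets'";

# No limit: every run of whitespace separates fields, so the spaces
# inside the category split it apart and only its first word is kept.
my ($number_nolimit, $article_nolimit) = split /\s+/, $line;

# Limit of 2: split at the first run of whitespace only; the rest of
# the line stays together as the second field.
my ($number_limit, $article_limit) = split /\s+/, $line, 2;

# Tab as the separator: spaces inside the category are not separators at all.
my ($number_tab, $article_tab) = split /\t/, $line;

print "no limit: $article_nolimit\n";
print "limit 2:  $article_limit\n";
print "tab:      $article_tab\n";
```

The limit-2 form and the tab form agree on this data. They would differ if the first column could itself contain spaces, in which case splitting on the tab is the safer choice.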


      This is Perl; just about everything is "possible". However, I fail to see why the two column example is functionally different than a full line comparison. "261293\t'snow > equipment > goggles'" will equal "261293\t'snow > equipment > goggles'" just as much as the two substrings would. Are you dealing with a case where the numbers change and you need to be insensitive to that?

      Breaking the two columns apart can easily be achieved with code like my @terms = split /\t/, $line;. See split.
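If the numbers can differ between the files and only the category should decide uniqueness, one way to do it is to key the hash on the second column. The sketch below assumes exactly that, which the thread does not confirm. The helper name `unique_categories` and the sample number 999999 are invented for illustration. It tracks which file each category came from instead of counting, so a category repeated within one file is still reported.

```perl
use strict;
use warnings;

# Hypothetical helper: given two array refs of tab-separated lines, return
# the lines whose category (second column) appears in only one of the two.
sub unique_categories {
    my ($lines_a, $lines_b) = @_;
    my %seen;      # category => bitmask: 1 = seen in first list, 2 = seen in second
    my %number;    # category => first number seen with it, used only for output

    my $file_bit = 1;
    for my $lines ($lines_a, $lines_b) {
        for my $line (@$lines) {
            chomp(my $copy = $line);                 # work on a newline-free copy
            my ($number, $category) = split /\t/, $copy, 2;
            next unless defined $category;           # skip lines without a tab
            $seen{$category} = ($seen{$category} // 0) | $file_bit;
            $number{$category} //= $number;
        }
        $file_bit <<= 1;                             # 1 for the first list, 2 for the second
    }

    # A bitmask of 3 means the category was seen in both lists; drop those.
    return map  { "$number{$_}\t$_" }
           grep { $seen{$_} != 3 }
           sort keys %seen;
}

my @file1 = ("261293\t'snow > equipment'\n", "261293\t'snow > equipment > helmets'\n");
my @file2 = ("999999\t'snow > equipment'\n");

print "$_\n" for unique_categories(\@file1, \@file2);
```

With the sample data above, only the helmets line is printed, even though 'snow > equipment' carries a different number in each list.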



        > However, I fail to see why the two column example is functionally different than a full line comparison. "261293\t'snow > equipment > goggles'" will equal "261293\t'snow > equipment > goggles'" just as much as the two substrings would. Are you dealing with a case where the numbers change and you need to be insensitive to that?

        Yes, I totally agree; the same code should work just as well.
