PerlMonks  

comparing 2 files and creating third file with uncommon content

by rashichauhan (Novice)
on Jun 18, 2014 at 10:01 UTC (#1090282=perlquestion)
rashichauhan has asked for the wisdom of the Perl Monks concerning the following question:

I want to compare two files and create a third file containing the content they do not have in common.

e.g.

File 1
EC:1.1.1.42 isocitratedehydrogenase
EC:1.1.1.44 6-phosphogluconatedehydrogenase
EC:1.1.1.49 glucose-6-phosphate1-dehydrogenase

File 2
EC:1.1.1.42 isocitratedehydrogenase
EC:1.1.1.44 6-phosphogluconatedehydrogenase
EC:1.1.1.49 glucose-6-phosphate1-dehydrogenase
EC:1.11.1.9 glutathioneperoxidase
EC:2.5.1.16 spermidinesynthase
EC:6.3.1.8 glutathionesynthase

Code that I have written
#!/usr/bin/perl
$dir = "C:/Perl/bin/kegg";
open (FILE3,">>","$dir/keggdifference1.txt");
open (FILE4,">>","$dir/keggdifference2.txt");
open (FILE,"<","$dir/common.txt");
while (<FILE>)
{
    chomp ($file=$_);
    @array =split (/\t/,"$file");
    #print "$array[0]\n";
}
open (FILE1,"<","$dir/bacillus1.txt");
while (<FILE1>)
{
    chomp ($file1=$_);
    @array1 =split (/\t/,"$file1");
    #print "$array1[3]\n";
}
foreach ($array[0])
{
    foreach ($array1[3])
    {
        if($array[0] eq $array1[3])
        {
            print FILE3 "$array[0]\n";
        }
        else
        {
            print FILE4 "$array[0]\n";
        }
    }
}

The code is not working. A new file is generated, but it consists of only the last element of the first file. Please help me in this regard.

Re: comparing 2 files and creating third file with uncommon content
by hippo (Curate) on Jun 18, 2014 at 10:10 UTC

    I think you may be missing something in your concept of arrays. Your while loops are overwriting them on each iteration which is probably not what you want. Your foreach loops are then looping over what is actually just a single value, again not what you want.

    To illustrate, you could dump the arrays after reading them in purely as a debugging exercise (ie. after the conclusion of the second while loop).

    I would very strongly recommend that you use strict and warnings also.
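
To make the overwriting concrete, here is a minimal sketch. It assumes tab-separated input and uses a few in-memory sample lines rather than the original files; the helper names `last_only` and `collect_all` are made up for illustration and are not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Sample tab-separated lines standing in for a real file (assumed layout).
my @lines = (
    "EC:1.1.1.42\tisocitratedehydrogenase",
    "EC:1.1.1.44\t6-phosphogluconatedehydrogenase",
    "EC:1.1.1.49\tglucose-6-phosphate1-dehydrogenase",
);

# What the original while loop does: @array is reassigned on every pass,
# so only the fields of the last line survive the loop.
sub last_only {
    my @array;
    for my $line (@_) {
        @array = split /\t/, $line;
    }
    return $array[0];
}

# Collecting instead: push the first field of every line onto a list.
sub collect_all {
    my @ids;
    for my $line (@_) {
        my @fields = split /\t/, $line;
        push @ids, $fields[0];
    }
    return @ids;
}

print "Overwritten: ", last_only(@lines), "\n";
print Dumper( [ collect_all(@lines) ] );
```

Dumping both results side by side shows why the later foreach loops only ever see one value: the first approach keeps a single line's fields, while the second keeps one entry per line.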

Re: comparing 2 files and creating third file with uncommon content
by Arunbear (Parson) on Jun 18, 2014 at 13:37 UTC
    If this is not just for the sake of learning to use Perl, this type of task can be done with well known command line tools e.g.
    diff -n file1 file2 | egrep -v '^[ad]' > file3
    will find the differences and write them to file3.

    You may need to install cygwin or native versions of these tools.

      Or just use comm(1).

Re: comparing 2 files and creating third file with uncommon content
by 2teez (Priest) on Jun 18, 2014 at 14:26 UTC

    Hi rashichauhan,

    The code is not working. A new file is generated, but it consists of only the last element of the first file. Please help me in this regard.

    Use a hash and cut to the chase instead of using nested for loops.
    Open the first file and read it line by line; split each line on whitespace and use the two fields as the key and value of the hash. Then open the second file and step through it line by line, splitting each line in the same way. Check whether the key exists in your hash: if so, print the line to one file, and if not, print it to another.

    Something like this will do.

    #!/usr/bin/perl -l
    use warnings;
    use strict;
    use Inline::Files;

    # Index the first file by its key (the EC number).
    my %file1;
    while (<FILE1>) {
        next if /^\s+$/;
        my ( $key, $value ) = split;
        $file1{$key} = $value;
    }

    my $str_common   = '';    # lines of FILE2 whose key is also in FILE1
    my $str_uncommon = '';    # lines of FILE2 whose key is not in FILE1
    while (<FILE2>) {
        next if /^\s+$/;
        my ( $key, $value ) = split;
        if ( exists $file1{$key} ) {
            $str_common .= $_;
        }
        else {
            $str_uncommon .= $_;
        }
    }

    print "This goes to File 3\n", $str_common,
      "\nThis goes to File 4\n", $str_uncommon;

    __FILE1__
    EC:1.1.1.42 isocitratedehydrogenase
    EC:1.1.1.44 6-phosphogluconatedehydrogenase
    EC:1.1.1.49 glucose-6-phosphate1-dehydrogenase
    __FILE2__
    EC:1.1.1.42 isocitratedehydrogenase
    EC:1.1.1.44 6-phosphogluconatedehydrogenase
    EC:1.1.1.49 glucose-6-phosphate1-dehydrogenase
    EC:1.11.1.9 glutathioneperoxidase
    EC:2.5.1.16 spermidinesynthase
    EC:6.3.1.8 glutathionesynthase

    Note: Although this code works, it is only meant to show the concept I am describing. On a 'good' day, since the two while loops "smell" the same, one could 'string' them together to avoid repeating yourself (DRY).
    Hope this helps.
    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
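
As a rough sketch of the hash approach applied to real files, the following variant reads both inputs through one shared routine (so the two loops are not duplicated) and writes the lines whose key occurs in only one of the files. It assumes one whitespace-separated "key value" record per line, as in the sample data; the sub names `read_records` and `uncommon`, the script name, and the command-line file arguments are all hypothetical choices made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read "key value" lines from a file into a hash (key => whole line).
sub read_records {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %records;
    while ( my $line = <$fh> ) {
        next if $line =~ /^\s*$/;    # skip blank lines
        chomp $line;
        my ($key) = split ' ', $line;
        $records{$key} = $line;
    }
    close $fh;
    return \%records;
}

# Return the lines whose key appears in exactly one of the two hashes:
# first those only in the first hash, then those only in the second.
sub uncommon {
    my ( $first, $second ) = @_;
    my @only_first =
      map { $first->{$_} } grep { !exists $second->{$_} } sort keys %$first;
    my @only_second =
      map { $second->{$_} } grep { !exists $first->{$_} } sort keys %$second;
    return ( @only_first, @only_second );
}

# Usage (hypothetical script name): perl uncommon.pl file1.txt file2.txt file3.txt
if ( @ARGV == 3 ) {
    my ( $in1, $in2, $out ) = @ARGV;
    my @diff = uncommon( read_records($in1), read_records($in2) );
    open my $out_fh, '>', $out or die "Cannot open $out: $!";
    print {$out_fh} "$_\n" for @diff;
    close $out_fh or die "Cannot close $out: $!";
}
```

With the two sample files from the question, the output file would hold the three records that appear only in File 2. Lexical filehandles and the `or die` checks on `open` are used here so that a missing input file is reported instead of silently producing an empty result.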
Re: comparing 2 files and creating third file with uncommon content
by Bloodnok (Vicar) on Jun 18, 2014 at 16:11 UTC
    Whilst it may have been left as an exercise for the interested reader, in the pursuit of an expeditious result and as Arunbear has already suggested, there are readily available command line tools (on *NIX & Cygwin) e.g. join - something like
    $ sort File_1 > File_1.sorted
    $ sort File_2 | join File_1.sorted - -v1 -v2 > file.diff

    A user level that continues to overstate my experience :-))
Re: comparing 2 files and creating third file with uncommon content
by pvaldes (Chaplain) on Jun 18, 2014 at 19:39 UTC
    And just for the purpose of completing the non-Perl answers, you could also use other commands (e.g. to compare three files at the same time, use either diff3 in the shell or M-x ediff3 in Emacs).
