PerlMonks  

Comparing two text files

by perlnoobster (Sexton)
on Jan 07, 2013 at 17:25 UTC ( [id://1012068]=perlquestion )

perlnoobster has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perl monks, I am fairly new to Perl and have attempted to compare two text files. Text file A contains several columns of numbers; text file B contains a column of numbers that I need to match. The 3rd column of text file A contains a unique value ($textpac3), and that unique value may be in text file B ($text1[0]). If the unique value appears in both files and a number from column 11 ($textpac11) matches the $text11 value from text file B, then the results should print. However, it doesn't seem to print the matches correctly to the matches.txt file. Please can someone help me?

use warnings;
use strict;

my $A_file = 'A.txt';
my $B = 'B.txt';
my $matches ='matches.txt';

open (INPUT, $A_file) or die "ERROR: cannot find file $A_file\n";
open (OUT2, ">$matches");
while (<INPUT>) {
    my @text1=();
    @text1 = split /\t/,$_;
    chomp @text1;
    next if ($text1[0]=~ /L1/gi);
    open (OUT1, "$B");
    while(<OUT1>) {
        my @textpac=();
        @textpac = split /\t/,$_;
        chomp @textpac;
        next if ($textpac[0]=~ /OID/gi);
        if($textpac[3] eq $text1[0] && $textpac[11] eq $text1[11])
        {
            print OUT2 join( "\t", @text1[ 1, 2, 3 ] ), "\n";
        }
    }
}
close ($A_file);
close OUT1;

Replies are listed 'Best First'.
Re: Comparing two text files
by davido (Cardinal) on Jan 07, 2013 at 17:40 UTC

    Two things to look at: First, "open (OUT2, ">$matches");" should have an "or die $!" appended to it so that you know for certain whether or not you're even successfully opening the file for output. Second, temporarily change "print OUT2 join(...." to just "print join(..." and visually inspect that the output to your screen is what you think it should be. That in and of itself is usually enlightening enough to help you to isolate the issue.
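
The two checks above could look roughly like this. It is only a sketch: the helper name open_for_writing and the sample row are made up for illustration, and the column indices are simply copied from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for writing, or die with the OS error in $!.
sub open_for_writing {
    my ($path) = @_;
    open(my $fh, '>', $path) or die "ERROR: cannot open $path for writing: $!\n";
    return $fh;
}

my @text1 = qw(key col1 col2 col3);    # made-up sample row

# Debugging step: print to the screen first and inspect the result.
print join("\t", @text1[1, 2, 3]), "\n";

# Once the output looks right, send the same line to the file.
my $out = open_for_writing('matches.txt');
print {$out} join("\t", @text1[1, 2, 3]), "\n";
close($out) or die "ERROR: cannot close matches.txt: $!\n";
```

If the open fails, the message includes the reason from $! (for example a missing directory or a permissions problem), so a silent failure no longer hides behind an empty output file.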


    Dave

Re: Comparing two text files
by CountZero (Bishop) on Jan 07, 2013 at 20:56 UTC
    Perl arrays are zero-based. When you split to @array, the third column will actually end up in $array[2]. So you may want to review your indexes in the if test to see if you are matching the right columns.

    Also, contrary to your explanation, the fields of A.txt actually end up in @text1 and the fields of B.txt end up in @textpac.
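
A quick way to see the zero-based indexing is to split a made-up line and print a few elements. The sample data here is invented purely for illustration:

```perl
use strict;
use warnings;

# Made-up tab-separated line with four columns.
my $line = "first\tsecond\tthird\tfourth\n";
chomp $line;
my @array = split /\t/, $line;

# Column N (counting from 1) lives at index N-1.
print "first column: $array[0]\n";       # first
print "third column: $array[2]\n";       # third
print "last index: ", $#array, "\n";     # 3
```

So if "column 11" in the data files means the eleventh column counting from 1, the matching index would be 10, not 11.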

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: Comparing two text files
by space_monk (Chaplain) on Jan 08, 2013 at 11:51 UTC

    Try something like this, which includes the suggestions by davido and Anonymous_Monk. As CountZero has suggested, you may also need to check some or all of the index values as arrays start at element zero in Perl ... :-)

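    use strict;
    use warnings;

    # File names taken from the original question
    my $A_file = 'A.txt';
    my $B_file = 'B.txt';

    open my $bh, "<", $B_file or die "$0: open $B_file: $!";
    my %bHash;
    # read text file B into a lookup hash
    # could generate a unique key consisting of field 0 and 11,
    # but never mind.
    while (<$bh>) {
        chomp;    # strip the newline so the last field compares cleanly
        my @text = split(/\t/);
        $bHash{$text[0]} = $text[11];
    }
    close $bh;

    open my $ah, "<", $A_file or die "$0: open $A_file: $!";
    # read in text file A
    while (<$ah>) {
        chomp;
        my @text = split(/\t/);
        # check that the key exists first to avoid "uninitialized" warnings
        if (exists $bHash{$text[3]} && $bHash{$text[3]} eq $text[11]) {
            # we have a match, print it
            print join( "\t", @text[1..3]), "\n";
        }
    }
    close $ah;
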
    A Monk aims to give answers to those who have none, and to learn from those who know more.
      Thank you all for your advice! I've taken it on board and it works great! I have one last question though: the value that could match the $text11 of file A can also be found in several other columns, such as 12, 13, 14, and 15. How can I adjust the original hash assignment, $bHash{$text[0]} = $text11;, so that it includes columns 12 to 15? Is it possible to implement that? If so, how? Thank you! :) This doesn't work: $bHash{$text[0]} = $text11|$text12|$text13;
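
One possible approach to the follow-up question is a hash of hashes: for each key, record every value seen in columns 11 to 15, then test for membership. The | in the attempt above is Perl's bitwise OR, which combines the operands into a single value rather than keeping them as alternatives. The sketch below is not from the thread; the sub names build_lookup and is_match and the sample row are made up, and the indices 0 and 11..15 are assumptions carried over from the code above.

```perl
use strict;
use warnings;

# Build a lookup from file B: key in column 0, candidate values in columns 11..15.
# Returns a reference to a hash of hashes: $lookup->{$key}{$value} = 1.
sub build_lookup {
    my ($fh) = @_;
    my %lookup;
    while (my $line = <$fh>) {
        chomp $line;
        my @text = split /\t/, $line, -1;    # -1 keeps trailing empty fields
        # Skip missing or empty fields so short lines do not add bogus values.
        for my $value (grep { defined($_) && length($_) } @text[11 .. 15]) {
            $lookup{ $text[0] }{$value} = 1;
        }
    }
    return \%lookup;
}

# Returns 1 if $value was seen in any of columns 11..15 for $key, else 0.
# The outer exists check avoids autovivifying an entry for an unknown key.
sub is_match {
    my ($lookup, $key, $value) = @_;
    return exists $lookup->{$key} && exists $lookup->{$key}{$value} ? 1 : 0;
}

# Made-up sample data: one row of 16 tab-separated columns (indices 0..15).
my @row    = ('k1', ('x') x 10, 'a', 'b', 'c', 'd', 'e');
my $b_data = join("\t", @row) . "\n";
open(my $bh, '<', \$b_data) or die "cannot open in-memory data: $!";
my $lookup = build_lookup($bh);
close $bh;

print is_match($lookup, 'k1', 'c') ? "match\n" : "no match\n";    # match
print is_match($lookup, 'k1', 'z') ? "match\n" : "no match\n";    # no match
```

In the loop over file A, the eq comparison would then become something like is_match($lookup, $text[3], $text[11]).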
Re: Comparing two text files
by Anonymous Monk on Jan 08, 2013 at 03:46 UTC

    Hi,

    I would start by reading the contents of file B into a hash to use as a look up for the values you want to check in file A.

    That should simplify things a bit.

    J.C.
