PerlMonks  

Re^2: Compare 2 files and get data

by blazar (Canon)
on Sep 07, 2005 at 13:02 UTC ( [id://489855]=note )


in reply to Re: Compare 2 files and get data
in thread Compare 2 files and get data

Thanks for all the tips and guide. I finally manage to work around the code and solved my problem. I am using a "IF" and "While" within a "While".
I'm glad you solved your problem. Incidentally, however, I'd like to point out that there's no such thing as "IF" in Perl, nor "While": the keywords are the lowercase if and while, and Perl is case-sensitive. If you want to make such keywords stand out visually, you may put them between <c> or <code> tags. For example this: <c>split</c> is rendered like this: split. But for functions, you can also use [doc://split] which is rendered as a hyperlink like this: split
Below are my code and it works well. I am not too sure if by running in this flow will it take up resourses or not as it will compare 60,000 of records in the file.
Well, let's say that it doesn't seem a very smart way to do what you want. I think you should take another look at other suggestions that were given to you, e.g. in terms of using a hash, which seems most reasonable for such a task, instead. (If you didn't understand some of the replies you can ask for further clarification, of course.)
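
To make the hash suggestion a bit more concrete, here is a minimal sketch of it. It is not taken from any of the other replies. The in-memory filehandles and the sample records are made up so that the snippet runs on its own; with real data you would open file1.txt and file2.txt instead. I'm also assuming that the comparison is between the first field of file1 and the second field of file2.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for file1.txt and file2.txt.
my $file1_text = "a,1\nb,2\nc,3\n";
my $file2_text = "x,b\ny,c\nz,d\n";

# In-memory filehandles, used only to keep this sketch self-contained.
open my $fh1, '<', \$file1_text or die "Can't open in-memory file1: $!\n";
open my $fh2, '<', \$file2_text or die "Can't open in-memory file2: $!\n";

# One pass over the second file: remember every value of its second field.
my %seen;
while (<$fh2>) {
    chomp;
    my $key = (split /,/)[1];
    $seen{$key} = 1 if defined $key;
}

# One pass over the first file: keep the first field if it was seen above.
my @matches;
while (<$fh1>) {
    chomp;
    my $data1 = (split /,/)[0];
    push @matches, $data1 if defined $data1 and exists $seen{$data1};
}

print "$_\n" for @matches;
```

Each file is read exactly once this way. Note one difference from nested loops: those print a value once for every matching line of the second file, whereas this sketch prints it once per line of the first file.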

Basically you're re-opening and re-reading your second file once for every line of the first one, and this makes your program I/O-intensive. However, I will add a few further comments about your code as is.

Hope that this discussion and codes helps other who seek wisdon.
Of course it will, just as much as just about every discussion here does...

First of all, and most importantly (although you may not see why ATM -- but then please trust us!), more than one monk has already recommended putting the following two lines at the top of your script:

use strict; use warnings;
You'll notice that with one or two exceptions even those who didn't tell you to do so, did include them in their own code examples.
open(file1,"file1.txt") || die ("cannot open file");
There's nothing strictly wrong with this. But it's better to use
  • "lexical filehandles",
  • the three-args form of open; also,
  • as a general rule you should use the high precedence (short circuiting) logical operators to operate on values and the low precedence ones for flow control; last,
  • it's advisable to include in your error message a clue about what went wrong, thus put $! there.
Thus I would have written the above like this:
open my $file1, '<', "file1.txt" or die "Can't open `file1.txt': $!\n";
Notice that I also put a \n at the end of the die error message, because I prefer it like that for this kind of error (I don't think the final user is interested in the script name and line number that get appended if you omit it), though YMMV.
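
To show why the precedence point matters once the parentheses are dropped, here is a small sketch. The path is hypothetical and assumed not to exist, and the eval blocks are there only so that the snippet can report what happened instead of dying. Your original line works with || only because the parentheses delimit open's arguments.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on the machine running this sketch.
my $missing = 'no-such-dir-for-this-sketch/file1.txt';

# Without parentheses, || binds to the last argument ($missing), which is
# true, so die is never evaluated and the failed open goes unnoticed.
my $silent_ok = eval {
    open my $fh, '<', $missing || die "never reached\n";
    1;
};

# The low-precedence "or" applies to the result of the whole open call,
# so the failure is caught and $! says why.
my $err = '';
eval {
    open my $fh, '<', $missing or die "Can't open $missing: $!\n";
    1;
} or $err = $@;

print "with ||: ", ($silent_ok ? "no error raised" : "died"), "\n";
print "with or: $err";
```

Because the message ends in a newline, $err holds exactly the text passed to die, with no "at ... line ..." suffix.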
while (<file1>) { chop();
Nowadays no one ever uses chop to do this. They use chomp instead. Please check the documentation for both.
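
In case the difference isn't obvious, a minimal sketch with two made-up records, the second of which mimics a last line lacking its trailing newline:

```perl
use strict;
use warnings;

# Two hypothetical records; the second has no trailing newline.
my @records = ("a,b,c\n", "a,b,c");

my (@chopped, @chomped);
for my $record (@records) {
    my ($by_chop, $by_chomp) = ($record, $record);
    chop $by_chop;     # always removes the last character, whatever it is
    chomp $by_chomp;   # removes a trailing newline (strictly, $/) only if present
    push @chopped, $by_chop;
    push @chomped, $by_chomp;
}

print "chop:  @chopped\n";
print "chomp: @chomped\n";
```

With chop the second record loses its final "c", while chomp leaves it alone.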
$REC = $_; @LINEREC = split(/\,/,$REC); $data1 = @LINEREC[0];
No need to copy $_ to $REC just to pass it to split. The former is even the implicit second arg to it, if none is given!!

No need to use a temporary array (why all those uppercase letters, BTW?) just to slice it, either. You can slice a list as well. Thus the above may have been simply

while (<$file1>) {
    chomp;
    my $data1 = (split /,/)[0];
    # ...
Incidentally, also note that it's not necessary to escape the comma in the regex, as it has no special meaning there.
open(file2,"file2.txt") || die ("cannot open file");
Hmmm, here you're opening the same file over and over again, once per iteration of the outer loop. But you explicitly close it only outside the outer loop, at the end of your script along with file1 (which is not strictly necessary after all, since open filehandles get closed on program exit anyway).

Here you could either use a lexical handle as recommended above, which gets automatically closed on exiting the lexical scope it's defined in, or else you may just open it once at the top (but even then, use a lexical in any case!), at the same time as file1, and use seek to "roll it back".

Update: a possible rewrite of your code (same logic!) along the lines of the hints given above:

#!/usr/bin/perl -l

use strict;
use warnings;

my ($fh1, $fh2) = map {
    open my $fh, '<', $_ or
      die "Can't open `$_': $!\n";
    $fh } qw/file1.txt file2.txt/;

while (<$fh1>) {
    chomp;
    my $data1 = (split /,/)[0];
    seek $fh2, 0, 0;
    while (<$fh2>) {
        chomp;
        print $data1 if $data1 eq (split /,/)[1];
    }
}

__END__
HTH
