 
PerlMonks  

Re: combining 2 files with a comon field

by anotherstevew (Initiate)
on May 18, 2005 at 14:05 UTC


in reply to combining 2 files with a comon field

<perl newbie puts head above parapet for first time - cautiously - to humbly offer an approach that doesn't use arrays (so it can handle BIG files) and ignores keys only present in one of the input files>
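#!/usr/bin/perl -w
use strict;

# Both input files must already be sorted on the key (the first |-separated field).
open ONE, "1.txt"  or die "Cannot open 1.txt to read\n $!";
open TWO, "2.txt"  or die "Cannot open 2.txt to read\n $!";
open TRE, ">3.txt" or die "Cannot open 3.txt to write\n $!";

# Current record from TWO. Declared outside the loop so that a line read
# ahead for one key is still available when the next line of ONE is processed.
my ($twoa, $twob);

while (<ONE>) {
    chomp;
    my ($onea, $oneb) = split(/\|/);

    # Advance TWO until its key has caught up with the key from ONE.
    while (!defined $twoa or $twoa lt $onea) {
        my $two = <TWO>;
        last unless defined $two;    # TWO is exhausted
        chomp $two;
        ($twoa, $twob) = split(/\|/, $two);
    }

    # Print only keys present in both files; anything else is skipped.
    print TRE "$onea|$oneb|$twob|\n" if defined $twoa and $onea eq $twoa;
}

close TRE or die "Cannot close 3.txt\n $!";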

Re^2: combining 2 files with a comon field
by merzy (Scribe) on May 19, 2005 at 01:35 UTC
    Yeah, I'm a big fan of this method. Memory-gentle, linear time (a single pass over each file), easy to understand. We use it a lot at work for files with tens of millions of lines. Note that it assumes both input files are sorted on the key, but that's what sort(1) is for. :-)
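
The pre-sorting step mentioned above could look roughly like the sketch below. The `*.unsorted.txt` file names and their contents are made up for illustration; only `1.txt` and `2.txt` come from the original script. One thing to watch, assuming the script compares keys with Perl's plain `lt`/`eq` (no `use locale`): sort(1) should be run in the C locale so that both sides agree on byte order.

```shell
# Hypothetical sample data: unsorted, pipe-delimited, key in the first field.
printf 'b|2\na|1\nc|3\n' > 1.unsorted.txt
printf 'c|z\nB|y\nb|x\n' > 2.unsorted.txt

# LC_ALL=C forces plain byte order, which is what Perl's lt/eq use by default.
# -t'|' sets the field delimiter; -k1,1 restricts the sort key to the first field.
LC_ALL=C sort -t'|' -k1,1 1.unsorted.txt > 1.txt
LC_ALL=C sort -t'|' -k1,1 2.unsorted.txt > 2.txt
```

In a typical non-C locale, sort(1) may order `b` before `B` or ignore case and punctuation altogether, which would make a merge based on byte comparison silently miss matches.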
