Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical

Re^3: incrementing already existing file

by roboticus (Chancellor)
on Feb 28, 2011 at 02:57 UTC ( #890498=note: print w/replies, xml ) Need Help??

in reply to Re^2: incrementing already existing file
in thread incrementing already existing file


OK, are both files sorted with respect to the key fields? If so, then you don't really want nested loops. You want a single loop and you can decide which file to read depending on what the current condition is. Something like:

# "Prime the pump" my $rec1 = <FILE1>; my $rec2 = <FILE2>; # Keep looping as long as either file has records while (!eof(FILE1) or !eof(FILE2)) { # Figure out what keys you have my $key1 = get_key_1($rec1); my $key2 = get_key_2($rec2); if ($key1 eq $key2) { # They're the same, so create an output record, and read # next record from file2 print build_record($rec1, $rec2); $rec2 = <FILE2>; } elsif ($key1 lt $key2) { # First file has a key we don't need, just ignore # it and read the next record $rec1 = <FILE1>; } else { # Hmmm ... first file seemed to skip the key we need. # print a partial record and advance to next file2 record print partial_record($rec2); $rec2 = <FILE2>; } }

Of course, if either of the files aren't sorted on the keys, then that won't work. You'll either have to sort them, or try something like a hash table. For the hash table, you simply read the first file into a hash based on the key field(s). Then you scan through the second file, looking up values from the hash as you need them. Something like:

# Read dictionary my %abbreviations; while (my $line = <DATA>) { my ($abbrev,$longname) = split/:/, $line; $abbreviations{$abbrev}=$longname; } # Process file open my $FH, '<', 'the_file' or die; while (my $line = <$FH>) { my ($field1, $field2, $key, $field3) = split /\t/, $line; if (exists $abbreviations{$key}) { # key was abbreviated, replace with full value $key = $abbreviations{$key}; } print "$key: ($field1, $field2, $field3)\n"; } close $FH; __DATA__ perl:pathologically eclectic rubbish lister lisp:lots of irritating silly parenthesis python:all your space are belong to us ruby:a quack language


When your only tool is a hammer, all problems look like your thumb.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://890498]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (3)
As of 2018-12-12 05:26 GMT
Find Nodes?
    Voting Booth?
    How many stories does it take before you've heard them all?

    Results (57 votes). Check out past polls.

    • (Sep 10, 2018 at 22:53 UTC) Welcome new users!