PerlMonks  

Re^2: Merge huge files (individually sorted) by order

by tanger007 (Initiate)
on Jul 19, 2013 at 00:16 UTC (#1045231)


in reply to Re: Merge huge files (individually sorted) by order
in thread Merge huge files (individually sorted) by order

Works so well I felt even more stupid :) A follow-up question: if you have a big file (>10GB) in which one column has, say, 100 unique values, how do you break it into 100 smaller files, each holding the rows for one unique value of that column? Thanks so much.

Re^3: Merge huge files (individually sorted) by order
by roboticus (Chancellor) on Jul 19, 2013 at 01:02 UTC

    tanger007:

    Try something like putting a file handle for each column value in a hash, and then looking up the file handle on demand:

    my %OFH;
    my $OFH;
    while (<$IFH>) {
        my @fields = split /\t/,$_;
        $OFH = $OFH{$fields[$key_column]};
        if (! defined $OFH) {
            # We don't have this value yet, so open another file
            open $OFH, '>', 'key_value.' . $fields[$key_column];
            $OFH{$fields[$key_column]} = $OFH;
        }
        print $OFH join("\t",@fields);
    }

    Note: It's rough, untested and needs some error handling and such. But the basic concept should work fine for you.
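
As a rough illustration of the error handling mentioned in the note, here is one way the same hash-of-file-handles idea might be wrapped up. This is only a sketch under assumptions: the subroutine name split_by_column, its three parameters, and the returned per-key line counts are hypothetical and not part of the original reply. It also assumes tab-separated input, a 0-based column index, and key values that are safe to use in file names.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split a tab-separated
# file into one output file per distinct value in column $key_column
# (0-based). Returns a hash ref mapping each key value to its line count.
sub split_by_column {
    my ($in_path, $key_column, $out_prefix) = @_;

    open my $in, '<', $in_path or die "Cannot open $in_path: $!";

    my (%out_fh, %count);
    while (my $line = <$in>) {
        # Strip the newline so a key in the last column stays clean.
        chomp $line;

        # A limit of -1 keeps trailing empty fields.
        my @fields = split /\t/, $line, -1;
        my $key = $fields[$key_column];

        # Skip blank lines and lines too short to contain the key column.
        next unless defined $key;

        # Open the output file the first time this key value is seen.
        unless ($out_fh{$key}) {
            my $out_path = $out_prefix . $key;
            open $out_fh{$key}, '>', $out_path
                or die "Cannot open $out_path: $!";
        }

        print { $out_fh{$key} } "$line\n";
        $count{$key}++;
    }

    close $in;
    for my $key (keys %out_fh) {
        close $out_fh{$key} or die "Error closing file for '$key': $!";
    }
    return \%count;
}
```

A call such as split_by_column('big.tsv', 2, 'key_value.') would then write files named key_value.<value>, one per distinct value in the third column. With around 100 distinct values, keeping every handle open should be fine; with many more, the per-process limit on open files could become a problem, and the handles would need to be closed and reopened in append mode as needed.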

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.
