PerlMonks  

Re^5: Memory Efficient Alternatives to Hash of Array

by neversaint (Deacon)
on Dec 28, 2008 at 00:34 UTC (#732849)


in reply to Re^4: Memory Efficient Alternatives to Hash of Array
in thread Memory Efficient Alternatives to Hash of Array

You are exactly right, tilly.

---
neversaint and everlastingly indebted.......

Replies are listed 'Best First'.
Re^6: Memory Efficient Alternatives to Hash of Array
by BrowserUk (Pope) on Dec 28, 2008 at 01:05 UTC

    I stand corrected.

    However, you're still better off using an external sort, as it allows you to gather the multiple values for each key together without loading the entire dataset into memory. Using a fairly simple loop like this:

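    #! perl
    use strict;

    ## Prime the loop with the first record (exit quietly on empty input)
    defined( my $first = <> ) or exit;
    chomp $first;
    my( $key, @array ) = split "\t", $first;

    while( <> ) {
        chomp;
        my( $newKey, $value ) = split "\t";
        if( $newKey eq $key ) {
            push @array, $value;
            next;
        }
        # Process @array for $key
        #...

        ## Remember the new key
        $key = $newKey;
        ## And reset the array
        @array = $value;
    }
    # Process the final @array for the last $key
    #...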

    And a command line like:

    sort < unsortedFile | perl theScriptAbove

    Or just sort the file and then feed it to the script as separate steps:

    sort < unsortedFile > sortedFile
    perl theScriptAbove sortedFile
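
A plain sort compares whole lines, which also reorders the values within each key, and in some locales the collation is not byte-wise, unlike Perl's eq. If either matters, a stable sort on the key field alone may be worth trying. This is only a sketch: the four-line sample data is made up for illustration, and the -s flag is assumed to be available (GNU and BSD sort have it; it is not in POSIX).

```shell
# Hypothetical sample data: key<TAB>value, deliberately unsorted.
printf 'b\t2\na\t9\nb\t1\na\t3\n' > unsortedFile

# A literal tab to use as the field separator.
tab=$(printf '\t')

# Sort on the first tab-separated field only.
# LC_ALL=C forces plain byte ordering; -s keeps lines with equal keys
# in their original input order (assumes GNU or BSD sort).
LC_ALL=C sort -s -t "$tab" -k1,1 unsortedFile > sortedFile

cat sortedFile
```

With this sample the two a lines come out first (values 9 then 3, in input order), followed by the two b lines (2 then 1). The sorted file can then be fed to the script exactly as in the commands above.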

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      *ahem* My first answer had the same idea, but with one less bug.
