Re: Reading HUGE file multiple times

by BrowserUk (Pope)
on Apr 28, 2013 at 01:06 UTC


in reply to Reading HUGE file multiple times

Index the file in one pass; then use the index to seek the id/data directly:

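    #! perl -slw
    use strict;

    my %idx;

    ## Index the file: map each id line to the byte offset where it starts
    ## (keys are the raw id lines, trailing newline included)
    $idx{ <> } = tell( ARGV ), scalar <> until eof();

    for ( 1 .. 1000 ) {
        my $id = getNextId( ... );
        seek ARGV, $idx{ $id }, 0;  # third argument is whence; 0 = from start of file
        scalar <>;                  # discard id line (or verify)
        print scalar <>;            ## access data
    }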

Untested code for flavour only.
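
For anyone who would rather have an explicit filehandle than the magic ARGV handle, here is a minimal sketch of the same idea. It assumes each record is exactly two lines (an id line followed by a data line); the names build_index and fetch_record are made up for illustration and are not from the code above. Unlike the one-liner, it chomps the id before using it as a key.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Build a hash mapping each (chomped) id line to the byte offset at which
# its record starts. Assumes two-line records: id line, then data line.
sub build_index {
    my ($fh) = @_;
    my %idx;
    while (1) {
        my $pos = tell $fh;         # offset of the id line about to be read
        my $id  = <$fh>;
        last unless defined $id;    # end of file
        chomp $id;
        $idx{$id} = $pos;
        my $data = <$fh>;           # skip over the data line
        last unless defined $data;  # truncated final record
    }
    return \%idx;
}

# Seek straight to a record and return its data line (chomped).
# Returns nothing if the id is not in the index.
sub fetch_record {
    my ($fh, $idx, $id) = @_;
    my $pos = $idx->{$id};
    return unless defined $pos;
    seek $fh, $pos, SEEK_SET or die "seek failed: $!";
    my $id_line = <$fh>;            # discard (or verify) the id line
    my $data    = <$fh>;
    chomp $data if defined $data;
    return $data;
}
```

Typical use would be to open the file once, call build_index once, and then call fetch_record for each id as it is needed. The index holds one hash entry per record, so memory use grows with the number of ids rather than with the size of the file.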


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: Reading HUGE file multiple times
by Anonymous Monk on Apr 28, 2013 at 10:35 UTC
    Thanks, will try it right away.

      On my system, the code above indexed a 6.4-million-record, 5 GB file in 57 seconds (start time, end time and record count):

      1367141700 1367141757 6348909

      Once indexed, random access to the records takes 1 second per thousand records.


