
Re: speeding up parsing, jump to line

by graff (Chancellor)
on Aug 15, 2014 at 13:19 UTC ( [id://1097553]=note )


in reply to speeding up parsing, jump to line

The answers to dave_the_m's questions will determine what sort of solution to look for. For example, if the $data file is relatively small, you can load it into memory, then read each *.score.txt file exactly once to collect the relevant scores for each item in $data, then do stats on what you collected. Something like this:
use strict;
use warnings;

my $data = "some/file.name";
open( my $fh, '<', $data ) or die "$data: $!\n";

# Load every target range from $data into memory, grouped by chromosome.
my %targets;
while (<$fh>) {
    next if ( /Header/ );
    chomp;
    my ( $chr, $start, $end ) = ( split( /\t/ ))[0,2,3];
    push @{$targets{$chr}}, { table => $_, start => $start, end => $end };
}
close $fh;

# Read each per-chromosome score file exactly once.
for my $chr ( keys %targets ) {
    open( my $scores, '<', "$chr.score.txt" ) or die "$chr.score.txt: $!\n";
    while (<$scores>) {
        chomp;
        my ( $pos, $score ) = ( split( /\t/ ))[1,2];
        # Store the score under every range that contains this position.
        for my $range ( @{$targets{$chr}} ) {
            if ( $pos >= $$range{start} and $pos <= $$range{end} ) {
                push @{$$range{scores}}, $score;
            }
        }
    }
    close $scores;
}

# do statistics on contents of %targets…
If $data is too large, and/or the scores you'd need to hold from the *.score.txt files for each $chr won't fit in memory, then maybe you have to write output files for each $chr (or for each start-end range in $data), so that it'll be quick and easy to do stats on those files afterwards.
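A rough sketch of that second approach, in case it helps. It streams one score file and writes each matching line to a per-range output file instead of pushing onto an in-memory array. The sub name split_scores_by_range, the $out_dir argument and the output file naming are all made up for illustration; it also assumes the same column layout as above (position in column 1, score in column 2, counting from 0), that no two ranges for a $chr share the same start and end, and that the number of ranges per $chr stays under your system's open-file limit.

```perl
use strict;
use warnings;

# Hypothetical helper: $ranges is an array ref of { start => ..., end => ... }
# hashes for a single $chr, e.g. $targets{$chr} from the code above.
sub split_scores_by_range {
    my ( $chr, $ranges, $score_file, $out_dir ) = @_;

    # One output handle per range, opened up front.
    # The file naming scheme here is an assumption, not part of the original.
    my @out;
    for my $range (@$ranges) {
        my $name = "$out_dir/$chr.$$range{start}-$$range{end}.scores.txt";
        open( my $fh, '>', $name ) or die "$name: $!\n";
        push @out, $fh;
    }

    # Stream the score file once; nothing is accumulated in memory.
    open( my $in, '<', $score_file ) or die "$score_file: $!\n";
    while (<$in>) {
        chomp;
        my ( $pos, $score ) = ( split( /\t/ ) )[ 1, 2 ];
        for my $i ( 0 .. $#$ranges ) {
            my $range = $ranges->[$i];
            print { $out[$i] } "$pos\t$score\n"
                if $pos >= $$range{start} and $pos <= $$range{end};
        }
    }
    close($in);
    close($_) or die "close: $!\n" for @out;

    return scalar @out;    # number of range files written
}
```

You would call it once per $chr, e.g. split_scores_by_range( $chr, $targets{$chr}, "$chr.score.txt", "out" ), and then run the stats over each small output file separately.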
