Re: Reducing memory footprint when doing a lookup of millions of coordinates

by GrandFather (Saint)
on Feb 27, 2011 at 10:29 UTC ([id://890395])


in reply to Reducing memory footprint when doing a lookup of millions of coordinates

You could turn the problem inside out: load the test values into memory, then scan the large reference file one line at a time to perform the matching, so that only the small test set is ever held in memory:

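#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the large reference file; in real use, open the file itself.
my $reps = <<REPS;
chr1 100 120 feature1
chr1 200 250 feature2
chr2 150 200 feature1
chr2 280 350 feature1
chr3 100 150 feature2
chr3 300 450 feature2
REPS

# Load the small set of test ranges, keyed by chromosome and start.
my %tests;
while (my $line = <DATA>) {
    $line =~ s/[\n\r]//g;
    next unless $line =~ /\S/;    # skip blank lines
    my @array = split /\s+/, $line;
    $tests{$array[0]}{$array[1]}{'end'} = $array[2];
    $tests{$array[0]}{$array[1]}{'rep'} = $array[3];
}

# Stream the reference data one line at a time.
open my $repIn, '<', \$reps or die "Cannot open reference data: $!";
while (<$repIn>) {
    my ($chr, $start, $end, $rep) = split ' ';
    next if !exists $tests{$chr};
    for my $s (keys %{$tests{$chr}}) {
        if ($start <= $tests{$chr}{$s}{'end'}) {
            # keys() returns starts in no particular order, so skip this
            # test rather than leaving the loop with 'last'.
            next if $s >= $end;
            print "$chr $start $end $rep\n";
        }
    }
}

__DATA__
chr2 160 210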
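If several test ranges share a chromosome, the inner loop can stop early once the test starts are sorted. The following is only a sketch of that variation, not part of the original post: the helper names `load_tests` and `overlaps_any` are hypothetical, and in-memory strings stand in for the real test and reference files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read "chr start end" lines into
# chr => [ [start, end], ... ], with each list sorted by start.
sub load_tests {
    my ($fh) = @_;
    my %tests;
    while (my $line = <$fh>) {
        my ($chr, $start, $end) = split ' ', $line;
        next unless defined $end;    # skip blank or short lines
        push @{ $tests{$chr} }, [ $start, $end ];
    }
    for my $chr (keys %tests) {
        @{ $tests{$chr} } = sort { $a->[0] <=> $b->[0] } @{ $tests{$chr} };
    }
    return \%tests;
}

# Hypothetical helper: true if the reference range overlaps any test range.
# Uses the same comparisons as the code above.
sub overlaps_any {
    my ($ranges, $start, $end) = @_;
    for my $r (@$ranges) {
        last if $r->[0] >= $end;        # sorted: later tests start even further right
        return 1 if $start <= $r->[1];
    }
    return 0;
}

# Assumed sample data; real use would open the two files instead.
my $test_data = "chr2 160 210\nchr2 400 500\n";
my $ref_data  = <<'REPS';
chr1 100 120 feature1
chr2 150 200 feature1
chr2 280 350 feature1
chr2 450 460 feature2
REPS

open my $test_fh, '<', \$test_data or die "Cannot open test data: $!";
my $tests = load_tests($test_fh);
close $test_fh;

open my $ref_fh, '<', \$ref_data or die "Cannot open reference data: $!";
my @hits;
while (my $line = <$ref_fh>) {
    my ($chr, $start, $end, $rep) = split ' ', $line;
    next unless defined $rep && exists $tests->{$chr};
    push @hits, "$chr $start $end $rep"
        if overlaps_any($tests->{$chr}, $start, $end);
}
close $ref_fh;

print "$_\n" for @hits;
```

Each reference line is printed at most once here, even when it overlaps more than one test range. Memory use still depends only on the number of test ranges, not on the size of the reference file.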
True laziness is hard work

Replies are listed 'Best First'.
Re^2: Reducing memory footprint when doing a lookup of millions of coordinates
by richardwfrancis (Beadle) on Feb 27, 2011 at 12:18 UTC
    Hi GrandFather,

    You won't believe me, but I thought of this after I posted, though I haven't tested it yet! If the database idea proves problematic, I think this is the way to go.

    Many thanks for your help and for the code.

    Rich
