
Re^2: write hash to disk after memory limit

by hailholyghost (Novice)
on Mar 13, 2015 at 13:30 UTC


in reply to Re: write hash to disk after memory limit
in thread write hash to disk after memory limit

Thanks a lot. I've been using a hash-of-hash-of-array-of-array structure to keep memory use down. I think array access is also faster than hash access, so I did this:
    foreach my $rat (@directories) {
        print "Reading Merged_99$rat/bs_seeker-CG.tab ...\n";
        open(FH,"<Merged_99$rat/bs_seeker-CG.tab") or die "cannot read Merged_99$rat/bs_seeker-CG.tab: $!";
        while (<FH>) {
            if (/M/) {
                next;
            } elsif ((/^chr(\S+)\s+(\d+)\s+\d+\s+(\d)\.(\d+)\s+(\d+)/) && ($1 ~~ @CHROMOSOMES) && ($5 >= $MINIMUM_COVERAGE)) {
                #chromosome $1, methylated C $2, percent $3.$4 and coverage $5
                $DATA{$1}{$2}[$set][$replicate] = "$3.$4";
            } elsif ((/^chr(\S+)\s+(\d+)\s+\d+\s+(\d)\s+(\d+)/) && ($1 ~~ @CHROMOSOMES) && ($4 >= $MINIMUM_COVERAGE)) {
                $DATA{$1}{$2}[$set][$replicate] = $3;
            }
        }
        close FH;
        $replicate++;
    }

Re^3: write hash to disk after memory limit
by LanX (Saint) on Mar 13, 2015 at 13:41 UTC
    As I said, better

    > > organize the upper tier roughly according to the timeline of your process

    No idea where $set comes from but $replicate could be such a top tier.

    So $data[$set][$replicate]{$1}{$2} should have far fewer memory-swapping problems (AFAIS).
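A minimal sketch of that layout, with the arrays as the upper tiers and the hashes underneath. The rows in @rows are made-up sample values standing in for what the file loop parses; they are not taken from the original data.

```perl
use strict;
use warnings;

# Hypothetical sample rows: [set, replicate, chromosome, position, value].
my @rows = (
    [0, 0, '1', 1000, 0.75],
    [0, 1, '1', 1000, 0.80],
    [1, 0, 'X', 2500, 1],
);

# Arrays on top (set, replicate), hashes below (chromosome, position).
my @data;
for my $row (@rows) {
    my ($set, $replicate, $chr, $pos, $value) = @$row;
    $data[$set][$replicate]{$chr}{$pos} = $value;
}

# Everything read for one set/replicate pair now sits under a single subtree.
my $positions_in_first_file = keys %{ $data[0][0]{'1'} };
print "set 0, replicate 0, chr 1: $positions_in_first_file position(s)\n";
```

With this ordering, each pass over one input file only touches the subtree for its own set and replicate, instead of revisiting every chromosome/position entry created by earlier files.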

    (BTW better reserve uppercase var-names for Perl built-ins)

    If this structure doesn't fit into your future plans, you most likely want to use a DB anyway.

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)

    PS: Je suis Charlie!

Re^3: write hash to disk after memory limit
by Jenda (Abbot) on Mar 14, 2015 at 00:59 UTC

    Do you later use the value as a string or as a number? If you use it as a number, I believe you could save quite a bit of memory by forcing a conversion before storing the data. The way you do it, you end up with a scalar containing both the string and (as soon as you use the number for the first time) the number.

    ...
                $DATA{$1}{$2}[$set][$replicate] = 0 + "$3.$4";
            } elsif ((/^chr(\S+)\s+(\d+)\s+\d+\s+(\d)\s+(\d+)/) && ($1 ~~ @CHROMOSOMES) && ($4 >= $MINIMUM_COVERAGE)) {
                $DATA{$1}{$2}[$set][$replicate] = 0 + $3;
    ...
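A rough way to see the effect described above is to inspect a scalar's flags with the core B module. This is only a sketch: the slots() helper and the sample values are hypothetical, and it shows which representations a scalar holds, not how many bytes it occupies.

```perl
use strict;
use warnings;
use B ();

# Name the value slots (string and/or number) a scalar currently holds.
sub slots {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @held;
    push @held, 'string' if $flags & B::SVp_POK();
    push @held, 'number' if $flags & (B::SVp_IOK() | B::SVp_NOK());
    return join '+', @held;
}

my ($int, $dec) = ('0', '75');    # hypothetical stand-ins for $3 and $4

my $as_string = "$int.$dec";      # stored the way the original loop does it
my $before    = slots($as_string);
my $sum       = $as_string + 1;   # first numeric use caches a number in it
my $after     = slots($as_string);

my $as_number = 0 + "$int.$dec";  # converted before storing
my $numified  = slots($as_number);

print "stored as string: $before, after numeric use: $after\n";
print "stored as number: $numified\n";
```

Note that on many Perl versions, interpolating a numeric scalar into a string caches the string form as well, so the check is done before anything prints the values.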

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.
