Re: Efficient use of memory

by ivancho (Hermit)
on Jun 04, 2005 at 07:07 UTC


in reply to Efficient use of memory

Your data structures confuse me... sorry, it's late in the evening.
Storing everything in memory seems wasteful if you only need the mean, variance and such. Why don't you check Statistics::Descriptive? Its Sparse variant lets you store things sparsely (i.e., only their main statistical properties, rather than all the data points).
Update: Of course, it does that by doing some arithmetic on each add, but that cost should be negligible compared to the memory savings. It might slow you down if you had 100 million rows, but think of what those would do to your memory...
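To make the "arithmetic on each add" idea concrete, here is a minimal sketch of keeping running statistics without storing the data points. It uses Welford's method, which is my choice for illustration and not necessarily what Statistics::Descriptive does internally; the names new_running_stats, add_point and sample_variance are hypothetical helpers, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only (not the module's API).
# State kept per variable: count, running mean, and the running sum of
# squared deviations from the mean (Welford's method).
sub new_running_stats {
    return { n => 0, mean => 0, m2 => 0 };
}

# Fold one data point into the running state; nothing else is stored.
sub add_point {
    my ($s, $x) = @_;
    $s->{n}++;
    my $delta = $x - $s->{mean};
    $s->{mean} += $delta / $s->{n};
    $s->{m2}   += $delta * ($x - $s->{mean});
    return $s;
}

# Sample variance (n - 1 in the denominator); 0 for fewer than two points.
sub sample_variance {
    my ($s) = @_;
    return $s->{n} > 1 ? $s->{m2} / ($s->{n} - 1) : 0;
}

my $stats = new_running_stats();
add_point($stats, $_) for (2, 4, 4, 4, 5, 5, 7, 9);
printf "n = %d, mean = %.4f, var = %.4f\n",
    $stats->{n}, $stats->{mean}, sample_variance($stats);
```

The memory used per variable stays at three numbers no matter how many rows you feed in, which is the whole point of the sparse approach.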

So, when you parse your "Year" line, you know which variables you want the stats for: create one Statistics::Descriptive::Sparse object for each, and from there on just add the data points from each split line. Also, I'd rather use an array, something like:

#!/usr/bin/perl -lw
use strict;
use Statistics::Descriptive;

my $filename = '224_APID003_report.csv';
open (READ_IN, "<$filename") or die "I can't open $filename to read.\n";

my @idxs;
my %vars;
my @names;
while (<READ_IN>) {
    chomp;
    /^Year/ && do {
        @names = split /,/;
        @idxs = grep { $names[$_] !~ /TIME|YEAR/i } (0..@names-1);
        $vars{$_} = Statistics::Descriptive::Sparse->new() for @names[@idxs];
    };
    /^\d{4}/ && do {
        my @values = split /,/;
        $vars{$names[$_]}->add_data($values[$_]) for @idxs;
    };
}
close READ_IN or die $!;

foreach (keys %vars) {
    printf "%20s: mean = %10.4f, var = %10.4f\n", $_, $vars{$_}->mean(), $vars{$_}->variance();
}

btw, this is not tested, I might be writing rubbish...

Update: Tested, corrected, prettified, added all the details. I hope it works for you. The other Sparse methods of Statistics::Descriptive seem to cover everything you need.

Update 2: I am aware that the heavy use of $_ throughout this piece of code makes it harder to read than it would be with all the loops expanded. On the other hand, I think I'm way too attached to grep, map and inverse for, with their terseness... I might eventually write a meditation about Perl and the Bulgarian language.
