PerlMonks
Hi All,
I was watching someone trying to find the sum of a particular column in a file. He was loading it into a data application (meant to work with large files) to calculate this sum. It takes a while to set up the app, as you have to read the entire file, give it variable names, etc. So I wrote this one-liner, which worked great (saved a lot of time).
I could avoid the "cut", but I didn't see a huge advantage in doing so (if someone can give me good reasons to avoid cut, that would be nice!). Please note that the file is pretty large (around 5 million rows and a few hundred columns).

Since it worked out well, that person asked me how to modify the code to make it work for 5 columns. I immediately used an array (the return from a split /,/) and looped through the list to add to the column sums every time a new row was read. Little did I realize at the time of writing that this would have horrible performance. After letting it run for a few minutes, I realized that looping millions of times is not such a good idea. Maybe just declaring 5 variables would have been better.

So my question is: how would Monks handle such a problem?

Thanks all for your time!

-SK

In reply to Efficient way to sum columns in a file by sk