Re: performance of huge array looping
by Abigail-II (Bishop) on Aug 14, 2002 at 11:16 UTC
|
As you describe it, the (memory) bottle neck is when you
start processing the array. Shifting off the elements at
each iteration will save memory - but the memory isn't
released back to the OS - it just means that it takes longer
before Perl needs to ask more memory from the OS. So, wether
it helps to speed up the program depends on what else you do
in the program - if you won't use much more memory, there's
little gain. OTOH, shift is a fast operation, so there's not
much reason to not do it.
If however you are looking for a real performance boost,
look into an algorithm that doesn't require the entire array
to exist at one time. The fact you don't need an element anymore
after you process it suggests you may not need all the elements
before processing the first one either.
Abigail | [reply] |
|
I need three elements at a time.
| [reply] |
|
| [reply] |
|
The your array doesn't have to have more than 3 elements at a time.
Abigail
| [reply] |
Re: performance of huge array looping
by RMGir (Prior) on Aug 14, 2002 at 11:58 UTC
|
If you're dealing with that much data, do you REALLY need to load it all into memory in an array?
Given that you're only looking at each item once, it would make more sense to either process the data line-by-line from a text file (if it's that kind of data), or iterate over it record-by-record from a database using DBI.
You program will run MUCH faster, as a bonus, since you're not going to be swapping. This is definitely a case where loading things in memory won't be an optimization.
--
Mike | [reply] |
|
The source is an ASCII file.
I've thought of it, and basically all I need is to load
lines which begin with *| and work on them. and
their size is much smaller than the file.
So I'll read line by line, push into array only the lines
that begin with *| (by the way, I need to use m/^\*\|/, don't I?)
and loop over them, thus eliminating the need for swap.
| [reply] |
|
| [reply] |
Re: performance of huge array looping
by demerphq (Chancellor) on Aug 14, 2002 at 11:23 UTC
|
Pop is probably more suitable than shift, but luckily for you a lhoward has written a definitive monograph on this subject.
Imo however you will need to think of a different approach to what you listed here.
HTH
Yves / DeMerphq
---
Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)
| [reply] |
Re: performance of huge array looping
by PhiRatE (Monk) on Aug 14, 2002 at 22:19 UTC
|
Its this kind of comment which is fundamentally interesting because of what it *doesn't* say. The value or lack thereof in attempting to shift out elements of the array and thus slowly reduce your memory requirements is insignificant in comparison to the huge burning question of why you need a once-only array in the gigabyte range anyway.
Even the (dubious) need to sort the entire array would probably be better handled by a chunked merge-sort writing clusters to disk rather than leaving it up to the kernel VM which, while good, cannot do the nice usage predictions you can do on your own paging system under those circumstances.
In addition, there may be considerable statistical optimisations that can be done if more details of the nature of the data are known.
So, lets be clear, there may be a modicum of performance to be gained by not shifting vs shifting, however vastly more performance is likely to be gained by posting a better description of the problem here so the algorithm junkies can tell you how to do it without chewing gigabytes of ram.
| [reply] |
Re: performance of huge array looping
by Anonymous Monk on Aug 14, 2002 at 11:49 UTC
|
I am not it will work for you but works for me
I change the array into hash
and use DBI to save it as a file
open it when I want and close it when I am done - save memory
your attention of looping is to find an element
so, use your value as the key for the hash
the search will be quick
| [reply] |