performance of huge array looping

kommesel has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: performance of huge array looping by Abigail-II (Bishop) on Aug 14, 2002 at 11:16 UTC
As you describe it, the (memory) bottle neck is when you start processing the array. Shifting off the elements at each iteration will save memory - but the memory isn't released back to the OS - it just means that it takes longer before Perl needs to ask more memory from the OS. So, wether it helps to speed up the program depends on what else you do in the program - if you won't use much more memory, there's little gain. OTOH, shift is a fast operation, so there's not much reason to not do it. If however you are looking for a real performance boost, look into an algorithm that doesn't require the entire array to exist at one time. The fact you don't need an element anymore after you process it suggests you may not need all the elements before processing the first one either. Abigail	[reply]
Re: Re: performance of huge array looping by Anonymous Monk on Aug 14, 2002 at 14:13 UTC
I need three elements at a time.	[reply]
Re: Re: Re: performance of huge array looping by belden (Friar) on Aug 14, 2002 at 17:48 UTC
See here for a thread on reading lines 3-by-3 from a file. In particular, pay attention to tadman's response and merlyn's suggestion and link to a similar question. blyman setenv EXINIT 'set noai ts=2'	[reply]
Re: performance of huge array looping by Abigail-II (Bishop) on Aug 14, 2002 at 15:01 UTC
The your array doesn't have to have more than 3 elements at a time. Abigail	[reply]
Re: performance of huge array looping by RMGir (Prior) on Aug 14, 2002 at 11:58 UTC
If you're dealing with that much data, do you REALLY need to load it all into memory in an array? Given that you're only looking at each item once, it would make more sense to either process the data line-by-line from a text file (if it's that kind of data), or iterate over it record-by-record from a database using DBI. You program will run MUCH faster, as a bonus, since you're not going to be swapping. This is definitely a case where loading things in memory won't be an optimization. -- Mike	[reply]
Re: Re: performance of huge array looping by kommesel (Sexton) on Aug 15, 2002 at 08:26 UTC
The source is an ASCII file. I've thought of it, and basically all I need is to load lines which begin with \| and work on them. and their size is much smaller than the file. So I'll read line by line, push into array only the lines that begin with \| (by the way, I need to use m/^\*\\|/, don't I?) and loop over them, thus eliminating the need for swap.	[reply]
Re: Re: Re: performance of huge array looping by RMGir (Prior) on Aug 15, 2002 at 10:40 UTC
Makes sense... Yes, you would need to escape those characters in your regex. -- Mike	[reply]
Re: performance of huge array looping by demerphq (Chancellor) on Aug 14, 2002 at 11:23 UTC
Pop is probably more suitable than shift, but luckily for you a lhoward has written a definitive monograph on this subject. Imo however you will need to think of a different approach to what you listed here. HTH Yves / DeMerphq --- Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)	[reply]
Re: performance of huge array looping by PhiRatE (Monk) on Aug 14, 2002 at 22:19 UTC
Its this kind of comment which is fundamentally interesting because of what it doesn't say. The value or lack thereof in attempting to shift out elements of the array and thus slowly reduce your memory requirements is insignificant in comparison to the huge burning question of why you need a once-only array in the gigabyte range anyway. Even the (dubious) need to sort the entire array would probably be better handled by a chunked merge-sort writing clusters to disk rather than leaving it up to the kernel VM which, while good, cannot do the nice usage predictions you can do on your own paging system under those circumstances. In addition, there may be considerable statistical optimisations that can be done if more details of the nature of the data are known. So, lets be clear, there may be a modicum of performance to be gained by not shifting vs shifting, however vastly more performance is likely to be gained by posting a better description of the problem here so the algorithm junkies can tell you how to do it without chewing gigabytes of ram.	[reply]
Re: performance of huge array looping by Anonymous Monk on Aug 14, 2002 at 11:49 UTC
I am not it will work for you but works for me I change the array into hash and use DBI to save it as a file open it when I want and close it when I am done - save memory your attention of looping is to find an element so, use your value as the key for the hash the search will be quick	[reply]


more useful options
	PerlMonks