In reply to Re^3: How to deal with Huge data (in thread How to deal with Huge data)
That's just noise. Even if Perl doesn't keep around an AV internally to avoid the cost of reallocating a variable (and I believe there's an optimization which does exactly that), look at all of the other, more expensive, work in that snippet:
- Reading a line from a file. Here's the biggest time sink: doing system calls, seek times, transferring data across multiple buses, checking for cache hits and paying for cache misses, running through any IO layers....
- Doing an unanchored regular expression with a character class; that means examining every character in the string and allocating and building an entirely new string--and just try to guess beforehand how long that new string needs to be.
- Creating new SVs for every tab-separated element in the line.
You have to do a tremendous amount of optimization before hoisting your variable declaration out of the loop makes any measurable difference, and that's if Perl doesn't do that optimization already. Besides that, changing the memory layout of your program probably has a bigger effect on performance, if you take I/O out of the picture. What if you create an extra page fault per loop by needing an extra page? What if you fragment memory more this way? How do you even measure this in a meaningful way?
Thus I say it's a silly pseudo-optimization.
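If you want to poke at this yourself, here's a rough sketch using the core Benchmark module. It is not the snippet from the thread: the data, the regex, and the sub names (declared_inside, declared_outside) are all made up for illustration, and the lines live in memory so that file I/O is deliberately left out. It also can't capture the page-fault or fragmentation effects mentioned above, so treat whatever numbers you get as a hint at best.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in data: 1,000 tab-separated lines held in memory,
# so that reading from a file is taken out of the picture.
my @lines = map { join("\t", "id$_", "foo bar", "baz-$_", "qux") . "\n" } 1 .. 1_000;

# Variables declared inside the loop body.
sub declared_inside {
    my $count = 0;
    for my $line (@lines) {
        (my $clean = $line) =~ s/[\r\n]//g;   # unanchored regex with a character class
        my @fields = split /\t/, $clean;      # new SVs for every field
        $count += @fields;
    }
    return $count;
}

# Same work, with the declarations hoisted out of the loop.
sub declared_outside {
    my $count = 0;
    my ($clean, @fields);
    for my $line (@lines) {
        ($clean = $line) =~ s/[\r\n]//g;
        @fields = split /\t/, $clean;
        $count += @fields;
    }
    return $count;
}

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    inside  => \&declared_inside,
    outside => \&declared_outside,
});
```

Run it a few times. If the two rows swap places between runs, or the difference stays within a few percent, you're looking at noise, and that's before you put the file reads back in.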