Is it File::Map issue, or another 'helpful' Perl regex optimization?
by vr (Monk) on Mar 17, 2017 at 23:52 UTC
vr has asked for the wisdom of the Perl Monks concerning the following question:
I have a 50 MB file:

    perl -e "print 'x' x (50*1024*1024)" > x

Suppose I slurp it and do some matching:
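A minimal sketch of such a test (not necessarily the exact code that was run), assuming the file x from above. The `slurp` helper is a name made up for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one scalar ("slurping").
sub slurp {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                 # undef record separator: read everything at once
    my $data = <$fh>;
    close $fh;
    return $data;
}

my $file = 'x';               # the 50 MB file created above
if (-e $file) {
    my $s = slurp($file);
    print "matched\n" if $s =~ /x/;   # succeeds on the very first byte
}
```

One way to get a peak-memory figure for a script like this is GNU time, e.g. `/usr/bin/time -v perl slurp.pl`, which prints a "Maximum resident set size (kbytes)" line. The script name and the measuring tool are assumptions here.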
Maximum resident set size reported as 53596 kbytes. Fair enough. Then I learn about File::Map, and do this:
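A sketch of the mapped variant, again not necessarily the exact code that was run. It assumes File::Map's `map_file` with its default read-only mode and the same file x:

```perl
use strict;
use warnings;
use File::Map 'map_file';

my $file = 'x';               # the 50 MB file created above
if (-e $file) {
    # Map the file into $s instead of reading it; the mode defaults to '<'.
    map_file my $s, $file;
    print "matched\n" if $s =~ /x/;   # succeeds on the very first byte
}
```

The only difference from the slurping version is how $s gets its contents; the match itself is identical, so any difference in peak memory comes from how the mapped scalar is handled.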
105844 kbytes. Twice as much memory consumed. Actually, quoting the POD, I'd expect File::Map to be

"loading the pages lazily on access. This means you only 'pay' for the parts of the file you actually use."

The match consumes a single byte, hence only one "page" should have been loaded, no? Not the whole file. Otherwise, what's the point of the example in the synopsis? OK, maybe I'm wrong and Perl's regex engine wants the string physically in RAM. But if the match is unsuccessful, e.g. $s =~ /y/; then the figure is 54676. It looks like a copy is made on each successful match:
But not in a loop: with $s =~ /x/ for 1 .. 5; the figure is, again, 105848.
That's all rather weird. The same happens on Windows, too. (There was another issue on Windows: it suddenly refused to map a 'merely' 1 GB file, and it appears that CreateFileMapping expects a contiguous block of virtual memory of the required size, which may or may not be available, even within the same day. That doesn't look usable to me. But perhaps it's not a Perl issue.)
I'm asking because at first I was enthusiastic about this patch. Now I'm not so sure.