vr has asked for the wisdom of the Perl Monks concerning the following question:
I have a 50 MB file:
perl -e "print 'x' x (50*1024*1024)" > x
Suppose I slurp it and do some matching:
use strict;
use warnings;
my $s = do { local ( @ARGV, $/ ) = 'x'; <> };
$s =~ /x/;
$ /usr/bin/time -f %M perl fmap.pl
Maximum resident set size reported as 53596 kbytes. Fair enough. Then I learn about File::Map, and do this:
use strict;
use warnings;
use File::Map qw/ map_file /;
map_file my $s, 'x', '<';
$s =~ /x/;
105844. Twice as much memory consumed. Actually, I'd expect, quoting POD,
loading the pages lazily on access. This means you only 'pay' for the parts of the file you actually use.
-- the match consumes a single byte, hence only one "page" should be loaded, no? Not the whole file. Otherwise, what's the point of the example in the synopsis? OK, maybe I'm wrong and Perl's regex engine wants the string physically in RAM. But if the match is unsuccessful, e.g. $s =~ /y/; then the figure is 54676. It looks like a copy is made on each successful match:
$s =~ /x/; $s =~ /x/; $s =~ /x/; $s =~ /x/; $s =~ /x/;
Then: 310784.
But not in a loop: $s =~ /x/ for 1 .. 5; Then, again, 105848.
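A minimal sketch for comparing the variants side by side. It assumes Linux (it reads the VmHWM line from /proc/self/status rather than using /usr/bin/time), that File::Map is installed, and that the file x is in the current directory. The helper name peak_rss_of is hypothetical. Each variant runs in a fresh perl, because a process's peak RSS never goes back down:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Code appended to every variant: the child prints its own peak resident
# set size in kB. Assumes a Linux /proc with a "VmHWM: <n> kB" line.
my $report = q{
    {
        open my $fh, '<', '/proc/self/status' or die "no /proc: $!";
        while (<$fh>) { print "$1\n" if /^VmHWM:\s+(\d+)\s+kB/ }
    }
};

# Hypothetical helper: run a snippet in a fresh perl process and return
# that process's peak RSS in kB, or undef if nothing was reported.
sub peak_rss_of {
    my ($code) = @_;
    open my $pipe, '-|', $^X, '-e', "$code\n$report"
        or die "cannot start $^X: $!";
    my $kb = <$pipe>;
    close $pipe;
    chomp $kb if defined $kb;
    return $kb;
}

# Common prefix: map the file 'x' read-only, as in the post.
my $map = q{use File::Map 'map_file'; map_file my $s, 'x', '<';};

my @variants = (
    [ 'no match'          => $map ],
    [ 'failed match'      => $map . q{ $s =~ /y/; } ],
    [ 'one match'         => $map . q{ $s =~ /x/; } ],
    [ 'five match ops'    => $map . ( q{ $s =~ /x/; } x 5 ) ],
    [ 'one op, five runs' => $map . q{ $s =~ /x/ for 1 .. 5; } ],
);

for my $v (@variants) {
    my ( $label, $code ) = @$v;
    my $kb = peak_rss_of($code);
    printf "%-18s %s kB\n", $label, defined $kb ? $kb : 'n/a';
}
```

The figures will not match /usr/bin/time -f %M exactly, since the reporting code itself runs inside the measured process, but the differences between variants should still be visible.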
That's all rather weird. The same happens on Windows, too. (There was another issue on Windows: it suddenly refused to map a 'merely' 1 GB file, and it appears that CreateFileMapping expects a contiguous block of virtual memory of the required size -- which may or may not be available, even within the same day. That doesn't look very usable to me. But perhaps it's not a Perl issue.)
I'm asking, because at first I was enthusiastic about this patch. Now I'm not so sure.
Re: Is it File::Map issue, or another 'helpful' Perl regex optimization?
by dave_the_m (Monsignor) on Mar 18, 2017 at 09:06 UTC
Re: Is it File::Map issue, or another 'helpful' Perl regex optimization? (neither)
by Anonymous Monk on Mar 18, 2017 at 00:04 UTC
by vr (Curate) on Mar 18, 2017 at 00:15 UTC
by Anonymous Monk on Mar 18, 2017 at 01:55 UTC
by vr (Curate) on Mar 18, 2017 at 06:47 UTC