Re^3: Is it File::Map issue, or another 'helpful' Perl regex optimization? (neither)

by Anonymous Monk
on Mar 18, 2017 at 01:55 UTC (#1185103)


in reply to Re^2: Is it File::Map issue, or another 'helpful' Perl regex optimization? (neither)
in thread Is it File::Map issue, or another 'helpful' Perl regex optimization?

:)

The module seems vague on its claims and evidence, but I just did some testing, and these are the numbers I get just from loading a 51 MB file that I create:

You'll need memusage-workingset-virtualmemory.pl to run it yourself

#!/usr/bin/perl --
use strict;
use warnings;
use Path::Tiny qw/ path /;
use File::Map qw' map_file advise ';

eval { do 'memusage-workingset-virtualmemory.pl'; 1 } or warn $@;

sub memusage() { print mems($$), "\n"; }

my $filename = 'goner-is-gone.txt';
my $fh = path($filename)->openw_raw;
print $fh "01234567789abcdefghijklmnopqrstuvwxyz\n" for 1 .. 1_400_000;
close $fh;
print -s $filename, "\n";
memusage();
{
    map_file my $map, $filename, '+<';
    memusage();
    #~ advise( $map, 'sequential' ); ## no visible difference in pslist from normal
    #~ advise( $map, 'random' ); ## same
    memusage();
    my @intervals = map { $_ * ( 1024 * 1024 ) } 1 .. 100; ## after every mb print something
    my $lpos = 0;
    while ( $map =~ m{z}g ) {
        my $pos = pos($map);
        $lpos = $pos;
        next if $pos < $intervals[0];
        shift @intervals;
        print 'pos ', $pos, "\n";
        memusage();
    }
    print "last pos $lpos\n";
    memusage();
    #~ <>;
}
memusage();
path($filename)->remove;
__END__
53200000
WVM: 21952 { WS: 6164 VM: 3344 }
WVM: 73908 { WS: 6200 VM: 3444 }
WVM: 73908 { WS: 6200 VM: 3444 }
pos 1048609
WVM: 73908 { WS: 7236 VM: 3456 }
pos 2097181
WVM: 73908 { WS: 8264 VM: 3456 }
pos 3145753
WVM: 73908 { WS: 9292 VM: 3456 }
pos 4194325
WVM: 73908 { WS: 10320 VM: 3456 }
pos 5242897
WVM: 73908 { WS: 11344 VM: 3456 }
pos 6291469
WVM: 73908 { WS: 12372 VM: 3456 }
pos 7340041
WVM: 73908 { WS: 13400 VM: 3456 }
pos 8388613
WVM: 73908 { WS: 14428 VM: 3456 }
pos 9437185
WVM: 73908 { WS: 15452 VM: 3456 }
pos 10485795
WVM: 73908 { WS: 16480 VM: 3456 }
pos 11534367
WVM: 73908 { WS: 17508 VM: 3456 }
pos 12582939
WVM: 73908 { WS: 18536 VM: 3456 }
pos 13631511
WVM: 73908 { WS: 19560 VM: 3456 }
pos 14680083
WVM: 73908 { WS: 20588 VM: 3456 }
pos 15728655
WVM: 73908 { WS: 21616 VM: 3456 }
pos 16777227
WVM: 73908 { WS: 22644 VM: 3456 }
pos 17825799
WVM: 73908 { WS: 23668 VM: 3456 }
pos 18874371
WVM: 73908 { WS: 24696 VM: 3456 }
pos 19922981
WVM: 73908 { WS: 25724 VM: 3456 }
pos 20971553
WVM: 73908 { WS: 26752 VM: 3456 }
pos 22020125
WVM: 73908 { WS: 27776 VM: 3456 }
pos 23068697
WVM: 73908 { WS: 28804 VM: 3456 }
pos 24117269
WVM: 73908 { WS: 29832 VM: 3456 }
pos 25165841
WVM: 73908 { WS: 30860 VM: 3456 }
pos 26214413
WVM: 73908 { WS: 31884 VM: 3456 }
pos 27262985
WVM: 73908 { WS: 32912 VM: 3456 }
pos 28311557
WVM: 73908 { WS: 33940 VM: 3456 }
pos 29360129
WVM: 73908 { WS: 34968 VM: 3456 }
pos 30408739
WVM: 73908 { WS: 35992 VM: 3456 }
pos 31457311
WVM: 73908 { WS: 37020 VM: 3456 }
pos 32505883
WVM: 73908 { WS: 38048 VM: 3456 }
pos 33554455
WVM: 73908 { WS: 39076 VM: 3456 }
pos 34603027
WVM: 73908 { WS: 40100 VM: 3456 }
pos 35651599
WVM: 73908 { WS: 41128 VM: 3456 }
pos 36700171
WVM: 73908 { WS: 42156 VM: 3456 }
pos 37748743
WVM: 73908 { WS: 43184 VM: 3456 }
pos 38797315
WVM: 73908 { WS: 44208 VM: 3456 }
pos 39845925
WVM: 73908 { WS: 45236 VM: 3456 }
pos 40894497
WVM: 73908 { WS: 46268 VM: 3456 }
pos 41943069
WVM: 73908 { WS: 47296 VM: 3456 }
pos 42991641
WVM: 73908 { WS: 48320 VM: 3456 }
pos 44040213
WVM: 73908 { WS: 49348 VM: 3456 }
pos 45088785
WVM: 73908 { WS: 50376 VM: 3456 }
pos 46137357
WVM: 73908 { WS: 51404 VM: 3456 }
pos 47185929
WVM: 73908 { WS: 52428 VM: 3456 }
pos 48234501
WVM: 73908 { WS: 53456 VM: 3456 }
pos 49283073
WVM: 73908 { WS: 54484 VM: 3456 }
pos 50331683
WVM: 73908 { WS: 55512 VM: 3456 }
pos 51380255
WVM: 73908 { WS: 56536 VM: 3456 }
pos 52428827
WVM: 73908 { WS: 57564 VM: 3456 }
last pos 53199999
WVM: 73908 { WS: 58320 VM: 3456 }
WVM: 21952 { WS: 6264 VM: 3356 }

So when you map a file, it seems to signal to the OS up front how big the memory usage is going to get (the WVM field jumps by about the size of the file right away), and then the working set (WS) slowly grows up to the size of the file as the regular expression advances through the whole file "line" by line.
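A quick sanity check on that reading, using only the numbers from the output above. This is a rough sketch, and it rests on two assumptions of mine that the output does not state: that the figures are in KB, and that pages are 4 KB. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Numbers copied from the output above (assumed to be in KB).
my $file_bytes = 53_200_000;
my ( $wvm_before, $wvm_mapped ) = ( 21_952, 73_908 );   # WVM before and after map_file
my ( $ws_mapped,  $ws_done )    = ( 6_200,  58_320 );   # WS right after mapping and after the scan

my $page_kb   = 4;                                      # assumption: 4 KB pages
my $file_kb   = $file_bytes / 1024;                     # file size in KB
my $mapped_kb = ceil( $file_kb / $page_kb ) * $page_kb; # file size rounded up to whole pages

my $wvm_growth = $wvm_mapped - $wvm_before;
my $ws_growth  = $ws_done - $ws_mapped;

printf "file size:        %.1f KB\n", $file_kb;
printf "rounded to pages: %d KB\n",   $mapped_kb;
printf "WVM growth:       %d KB\n",   $wvm_growth;
printf "WS growth:        %d KB\n",   $ws_growth;
```

Under those assumptions the WVM jump at map time matches the file size rounded up to whole pages, and the working set growth over the scan comes out a little above that, which fits the idea that pages only count toward WS once the regex has actually touched them.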

Is this faster than something else? More memory efficient? I dunno.

I'm beginning to suspect this is how File::Map is supposed to work


Replies are listed 'Best First'.
Re^4: Is it File::Map issue, or another 'helpful' Perl regex optimization? (neither)
by vr (Friar) on Mar 18, 2017 at 06:47 UTC

    Thank you, I'll digest this for a while
