PerlMonks  

Strawberry: Both IO and global match are VERY SLOW. Unless pre-heated (but why?)

by Anonymous Monk
on Nov 13, 2023 at 12:36 UTC ( [id://11155604]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I don't think this is related to my previous question, "Substitution unexpectedly very slow, in Strawberry", whose GitHub ticket is marked as fixed. I can't check this new issue with 5.39, but if that fix also cures what's described below, then fine. A preliminary check with an online Perl doesn't show anomalies, so it looks like Linux is OK and this is a Windows issue.

The recent File::Slurp discussion made me look at how it behaves in list context, in particular at the regex it uses to split, at line #133. I wondered whether that is really the best method, then ran some tests. In the process I accidentally stumbled upon really weird results. They are frustrating, and come close to making Perl unusable for this kind of task.

I used "physical file IO", not "scalar IO", to better demonstrate how this can affect (and perhaps always has affected?) very basic daily operations. Note that the "pre-heat" step immediately disposes of the array it allocates; it is not the well-known "pre-set $#array or hash size for better performance" trick.

    use strict;
    use warnings;
    use feature 'say';
    use Time::HiRes 'time';

    if ( !@ARGV ) {
        say $^V;
        for my $size ( 5e5, 1e6, 5e6 ) {
            say "Array size: $size";
            for my $method ( 0, 1, 2 ) {
                for my $heat ( 0, 1 ) {
                    system $^X, $0, $method, $heat, $size
                }
            }
        }
    }
    else {
        my ( $method, $preheat, $size ) = @ARGV;
        my $s = join ' ', 0 .. 9;
        $s .= "\n";
        $s x= $size;
        chomp $s;

        my $t = time;

        if ( $preheat ) {
            my @garbage = ( undef ) x $size
        }

        my @a;
        if ( $method == 0 ) {       # split
            @a = split /(?<=\n)/, $s
        }
        elsif ( $method == 1 ) {    # global match
            @a = $s =~ /(.*?\n|.+)/gs
        }
        elsif ( $method == 2 ) {    # IO (list context)
            open my $fh, '>', 'garbage.tmp';
            binmode $fh;
            print $fh $s;
            close $fh;
            open $fh, '<', 'garbage.tmp';
            binmode $fh;
            @a = <$fh>;
            close $fh;
        }

        printf "\t%s, %s:\t%.3f\n",
            ( <split match io(list)> )[ $method ],
            ( $preheat ? 'pre-heat' : 'no pre-heat' ),
            time - $t
    }

Result:

    v5.38.0
    Array size: 500000
        split, no pre-heat:     0.627
        split, pre-heat:        0.631
        match, no pre-heat:     0.855
        match, pre-heat:        0.292
        io(list), no pre-heat:  0.893
        io(list), pre-heat:     0.274
    Array size: 1000000
        split, no pre-heat:     1.272
        split, pre-heat:        1.286
        match, no pre-heat:     3.604
        match, pre-heat:        0.583
        io(list), no pre-heat:  3.498
        io(list), pre-heat:     0.556
    Array size: 5000000
        split, no pre-heat:     6.356
        split, pre-heat:        6.346
        match, no pre-heat:     79.586
        match, pre-heat:        2.885
        io(list), no pre-heat:  84.150
        io(list), pre-heat:     2.744
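
The three approaches timed above are meant to yield the same list of lines, each keeping its trailing newline. Below is a minimal sanity check of that on a tiny string. The sub names (by_split, by_match, by_io) are made up for illustration, and by_io reads from an in-memory filehandle rather than the physical file used in the benchmark, so it says nothing about timing.

```perl
use strict;
use warnings;

# Each sub returns the lines of $s with their trailing newlines kept.
sub by_split { my ($s) = @_; return split /(?<=\n)/, $s }
sub by_match { my ($s) = @_; return $s =~ /(.*?\n|.+)/gs }

sub by_io {
    my ($s) = @_;
    # In-memory handle ("scalar IO"); the benchmark itself uses a real file.
    open my $fh, '<', \$s or die "Cannot open in-memory handle: $!";
    binmode $fh;
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# Includes an empty line; the last line has no newline, as in the benchmark.
my $sample   = "0 1 2\n\n3 4 5\n6 7 8";
my @expected = by_split($sample);

for my $method ( \&by_match, \&by_io ) {
    my @got = $method->($sample);
    die "methods disagree"
        unless @got == @expected && "@got" eq "@expected";
}
printf "%d lines from each method\n", scalar @expected;
```

If the three lists ever differed, the timings would not be comparing like with like, so a check of this kind is worth running before trusting the numbers.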

Replies are listed 'Best First'.
Re: Strawberry: Both IO and global match are VERY SLOW. Unless pre-heated (but why?)
by syphilis (Archbishop) on Nov 14, 2023 at 00:38 UTC
    ... it's Windows issue

    Yes - and I think it's worth raising at https://github.com/Perl/perl5/issues.
    I'm getting similar results on Windows 11, with both perl-5.38.0 and the latest devel release (perl-5.39.4).
    Update: If this is the same issue as https://github.com/Perl/perl5/issues/21360, then it has not been fixed.

    My system seems to be roughly twice as fast as yours. For the final block I get:
    Array size: 5000000
        split, no pre-heat:     3.296
        split, pre-heat:        3.421
        match, no pre-heat:     38.257
        match, pre-heat:        1.578
        io(list), no pre-heat:  38.547
        io(list), pre-heat:     1.280
    With cygwin's perl-5.38.0 on this same Windows 11 system, I see:
    Array size: 5000000
        split, no pre-heat:     3.280
        split, pre-heat:        3.265
        match, no pre-heat:     6.952
        match, pre-heat:        1.734
        io(list), no pre-heat:  5.920
        io(list), pre-heat:     0.969
    So there's still a 4x to 6x slowdown for "match" and "io(list)", but that's nowhere near as bad as with the native Windows perl.
    I think that demonstrates there's potential for significant improvement in the performance of the native Windows builds.

    With perl-5.38.0 on freebsd12, I see no timing disparity:
    Array size: 5000000
        split, no pre-heat:     10.392
        split, pre-heat:        10.533
        match, no pre-heat:     3.642
        match, pre-heat:        3.593
        io(list), no pre-heat:  3.166
        io(list), pre-heat:     3.081
    Cheers,
    Rob
Re: Strawberry: Both IO and global match are VERY SLOW. Unless pre-heated (but why?)
by Discipulus (Canon) on Nov 21, 2023 at 12:39 UTC
    Hello,

    Not being able to dig into it any further, I've submitted an issue to perl5, and I'll update the thread if anything happens.

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
      Hello there,

      glad to update on this issue: a recent commit in blead fixed the issue spotted by Anonymous Monk.

      This is the commit message:

      grow the tmps (mortal) stack exponentially rather than linearly
      As with the value stack and the save stack, this gives us constant amortized growth per element.

      Thanks to you all and especially to Tony Cook!

      L*

Re: Strawberry: Both IO and global match are VERY SLOW. Unless pre-heated (but why?)
by Discipulus (Canon) on Nov 14, 2023 at 08:47 UTC
    Hello dear Anonymous Monk,

    it took a while, but there are strange facts to notice:

    • v5.12.3 is not affected by pre-heat, and it is ~4x faster than the others.
    • v5.20.3, v5.22.3 and v5.24.2 are not affected by pre-heat (slow either way).
    • v5.28.0 is ~4x faster without pre-heat (but v5.28.1 is not ;)

    # removed 5.26.0 at the top of list because already present

    | v5.12.3
    | Array size: 500000
    |     split, no pre-heat:     0.837
    |     split, pre-heat:        0.851
    |     match, no pre-heat:     0.549
    |     match, pre-heat:        0.562
    |     io(list), no pre-heat:  0.462
    |     io(list), pre-heat:     0.427
    | Array size: 1000000
    |     split, no pre-heat:     2.037
    |     split, pre-heat:        1.967
    |     match, no pre-heat:     1.362
    |     match, pre-heat:        1.380
    |     io(list), no pre-heat:  1.132
    |     io(list), pre-heat:     1.158
    | Array size: 5000000
    |     split, no pre-heat:     23.066
    |     split, pre-heat:        23.281
    |     match, no pre-heat:     20.987
    |     match, pre-heat:        20.879
    |     io(list), no pre-heat:  19.961
    |     io(list), pre-heat:     19.297
    [OK] \perl5.12-32bit\perl\bin\perl.exe

    | v5.20.3
    | Array size: 500000
    |     split, no pre-heat:     1.088
    |     split, pre-heat:        1.106
    |     match, no pre-heat:     0.917
    |     match, pre-heat:        0.913
    |     io(list), no pre-heat:  0.832
    |     io(list), pre-heat:     0.834
    | Array size: 1000000
    |     split, no pre-heat:     3.466
    |     split, pre-heat:        3.496
    |     match, no pre-heat:     3.099
    |     match, pre-heat:        3.058
    |     io(list), no pre-heat:  3.067
    |     io(list), pre-heat:     3.469
    | Array size: 5000000
    |     split, no pre-heat:     74.205
    |     split, pre-heat:        76.273
    |     match, no pre-heat:     75.415
    |     match, pre-heat:        73.660
    |     io(list), no pre-heat:  75.609
    |     io(list), pre-heat:     75.151
    [OK] \perl5.20.64bit\perl\bin\perl.exe

    | v5.22.3
    | Array size: 500000
    |     split, no pre-heat:     0.386
    |     split, pre-heat:        0.443
    |     match, no pre-heat:     0.862
    |     match, pre-heat:        0.802
    |     io(list), no pre-heat:  0.793
    |     io(list), pre-heat:     0.797
    | Array size: 1000000
    |     split, no pre-heat:     0.782
    |     split, pre-heat:        0.800
    |     match, no pre-heat:     3.007
    |     match, pre-heat:        3.173
    |     io(list), no pre-heat:  3.049
    |     io(list), pre-heat:     3.077
    | Array size: 5000000
    |     split, no pre-heat:     3.931
    |     split, pre-heat:        4.047
    |     match, no pre-heat:     72.992
    |     match, pre-heat:        81.416
    |     io(list), no pre-heat:  76.030
    |     io(list), pre-heat:     76.570
    [OK] perl5.22.64bit\perl\bin\perl.exe

    | v5.24.2
    | Array size: 500000
    |     split, no pre-heat:     0.387
    |     split, pre-heat:        0.383
    |     match, no pre-heat:     0.901
    |     match, pre-heat:        0.953
    |     io(list), no pre-heat:  0.934
    |     io(list), pre-heat:     1.038
    | Array size: 1000000
    |     split, no pre-heat:     0.771
    |     split, pre-heat:        0.817
    |     match, no pre-heat:     3.739
    |     match, pre-heat:        3.195
    |     io(list), no pre-heat:  3.371
    |     io(list), pre-heat:     3.647
    | Array size: 5000000
    |     split, no pre-heat:     3.854
    |     split, pre-heat:        3.880
    |     match, no pre-heat:     76.768
    |     match, pre-heat:        81.458
    |     io(list), no pre-heat:  83.648
    |     io(list), pre-heat:     83.204
    [OK] perl5.24.64bit\perl\bin\perl.exe

    | v5.26.0
    | Array size: 500000
    |     split, no pre-heat:     0.426
    |     split, pre-heat:        0.389
    |     match, no pre-heat:     0.878
    |     match, pre-heat:        0.202
    |     io(list), no pre-heat:  0.883
    |     io(list), pre-heat:     0.178
    | Array size: 1000000
    |     split, no pre-heat:     0.816
    |     split, pre-heat:        0.838
    |     match, no pre-heat:     3.246
    |     match, pre-heat:        0.391
    |     io(list), no pre-heat:  3.038
    |     io(list), pre-heat:     0.382
    | Array size: 5000000
    |     split, no pre-heat:     3.879
    |     split, pre-heat:        3.904
    |     match, no pre-heat:     72.029
    |     match, pre-heat:        1.882
    |     io(list), no pre-heat:  72.251
    |     io(list), pre-heat:     1.843
    [OK] \perl5.26.64bit\perl\bin\perl.exe

    | v5.26.2
    | Array size: 500000
    |     split, no pre-heat:     0.380
    |     split, pre-heat:        0.404
    |     match, no pre-heat:     0.824
    |     match, pre-heat:        0.216
    |     io(list), no pre-heat:  0.780
    |     io(list), pre-heat:     0.183
    | Array size: 1000000
    |     split, no pre-heat:     0.782
    |     split, pre-heat:        0.797
    |     match, no pre-heat:     3.217
    |     match, pre-heat:        0.437
    |     io(list), no pre-heat:  3.294
    |     io(list), pre-heat:     0.401
    | Array size: 5000000
    |     split, no pre-heat:     4.139
    |     split, pre-heat:        4.152
    |     match, no pre-heat:     83.296
    |     match, pre-heat:        2.216
    |     io(list), no pre-heat:  84.700
    |     io(list), pre-heat:     1.997
    [OK] \perl-5.26.64bit-PDL\perl\bin\perl.exe

    | v5.28.0
    | Array size: 500000
    |     split, no pre-heat:     0.408
    |     split, pre-heat:        0.418
    |     match, no pre-heat:     0.312
    |     match, pre-heat:        0.193
    |     io(list), no pre-heat:  0.286
    |     io(list), pre-heat:     0.169
    | Array size: 1000000
    |     split, no pre-heat:     0.832
    |     split, pre-heat:        0.884
    |     match, no pre-heat:     1.113
    |     match, pre-heat:        0.420
    |     io(list), no pre-heat:  0.950
    |     io(list), pre-heat:     0.385
    | Array size: 5000000
    |     split, no pre-heat:     4.599
    |     split, pre-heat:        4.683
    |     match, no pre-heat:     20.415
    |     match, pre-heat:        2.069
    |     io(list), no pre-heat:  20.736
    |     io(list), pre-heat:     1.855
    [OK] \perl5.28.32bit\perl\bin\perl.exe

    | v5.28.1
    | Array size: 500000
    |     split, no pre-heat:     0.379
    |     split, pre-heat:        0.388
    |     match, no pre-heat:     0.860
    |     match, pre-heat:        0.201
    |     io(list), no pre-heat:  0.814
    |     io(list), pre-heat:     0.167
    | Array size: 1000000
    |     split, no pre-heat:     0.762
    |     split, pre-heat:        0.785
    |     match, no pre-heat:     3.113
    |     match, pre-heat:        0.415
    |     io(list), no pre-heat:  3.152
    |     io(list), pre-heat:     0.340
    | Array size: 5000000
    |     split, no pre-heat:     4.033
    |     split, pre-heat:        4.285
    |     match, no pre-heat:     78.298
    |     match, pre-heat:        2.105
    |     io(list), no pre-heat:  79.609
    |     io(list), pre-heat:     1.726
    [OK] \perl5.28-64bit\perl\bin\perl.exe

    | v5.32.0
    | Array size: 500000
    |     split, no pre-heat:     0.424
    |     split, pre-heat:        0.486
    |     match, no pre-heat:     0.881
    |     match, pre-heat:        0.240
    |     io(list), no pre-heat:  0.845
    |     io(list), pre-heat:     0.193
    | Array size: 1000000
    |     split, no pre-heat:     0.833
    |     split, pre-heat:        0.932
    |     match, no pre-heat:     3.086
    |     match, pre-heat:        0.427
    |     io(list), no pre-heat:  3.080
    |     io(list), pre-heat:     0.416
    | Array size: 5000000
    |     split, no pre-heat:     4.586
    |     split, pre-heat:        4.738
    |     match, no pre-heat:     79.045
    |     match, pre-heat:        2.068
    |     io(list), no pre-heat:  79.037
    |     io(list), pre-heat:     2.089
    [OK] \perl5.32.64bit\perl\bin\perl.exe

    L*

Re: Strawberry: Both IO and global match are VERY SLOW. Unless pre-heated (but why?)
by tonyc (Friar) on Dec 04, 2023 at 22:40 UTC

    A large part of the growth in time cost is the way the tmps stack was being grown.

    On Linux, and at least some BSDs, realloc() for large allocations is handled through the mremap() system call, making such reallocations close to constant time. On Windows, realloc() is O(size of original allocation) when the allocation can't be grown in place, since the data needs to be copied from the original allocation to the new one.

    Before commit fea90cfbe1f221d50be90ca5ceb0c6c7f121e442, the tmps (or mortal) stack was grown linearly, adding 512 entries each time it needed to grow. This dominated the execution cost for the match and io(list) no pre-heat cases.
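
    If fixed-size growth with a full copy at each step dominates, the total cost should grow roughly with the square of the number of entries. The sketch below checks that against the v5.38.0 "match" timings posted in the question by estimating the exponent p in time ~ size**p between consecutive array sizes. The sub name scaling_exponent is made up for illustration, and three data points from one machine are only a rough indication.

```perl
use strict;
use warnings;

# [ array size, seconds ] pairs copied from the v5.38.0 results in the question
my %runs = (
    'match, no pre-heat' => [ [ 500_000, 0.855 ], [ 1_000_000, 3.604 ], [ 5_000_000, 79.586 ] ],
    'match, pre-heat'    => [ [ 500_000, 0.292 ], [ 1_000_000, 0.583 ], [ 5_000_000, 2.885 ] ],
);

# Exponent p in "time ~ size**p", estimated from two measurements
sub scaling_exponent {
    my ( $small, $large ) = @_;
    return log( $large->[1] / $small->[1] ) / log( $large->[0] / $small->[0] );
}

for my $label ( sort keys %runs ) {
    my @points = @{ $runs{$label} };
    for my $i ( 1 .. $#points ) {
        printf "%-20s %7d -> %7d entries: time x%.1f, exponent ~%.2f\n",
            $label, $points[ $i - 1 ][0], $points[$i][0],
            $points[$i][1] / $points[ $i - 1 ][1],
            scaling_exponent( $points[ $i - 1 ], $points[$i] );
    }
}
```

    On those numbers the no pre-heat exponent comes out close to 2 and the pre-heat exponent close to 1, which fits quadratic versus linear behaviour.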

    With fea90cfbe1f221d50be90ca5ceb0c6c7f121e442 the tmps stack is now grown exponentially (1.5**n), like other similar allocations such as the value stack and arrays. This changes the cost to amortized constant time per entry.

    The pre-heat in each case grew the tmps stack in one big allocation, preventing the incremental growth that caused the problem.
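
    The difference between the two growth policies can be sketched with a small counting model. This is not perl's actual allocator code: the sub names, the starting capacity of 512 entries, and the int(capacity * 1.5) rule are assumptions for illustration, and the model takes the worst case where every growth step copies the whole old block (as described above for Windows when the block can't be grown in place).

```perl
use strict;
use warnings;

# Two hypothetical growth policies: new capacity from old capacity.
sub grow_linear      { return $_[0] + 512 }
sub grow_exponential { return int( $_[0] * 1.5 ) }

# Count reallocations and entries copied while growing to $target entries,
# assuming each reallocation copies every entry of the old block.
sub copies_to_reach {
    my ( $target, $grow ) = @_;
    my ( $capacity, $copied, $reallocs ) = ( 512, 0, 0 );
    while ( $capacity < $target ) {
        $copied += $capacity;
        $capacity = $grow->($capacity);
        $reallocs++;
    }
    return ( $reallocs, $copied );
}

for my $entries ( 5e5, 1e6, 5e6 ) {
    my ( $lr, $lc ) = copies_to_reach( $entries, \&grow_linear );
    my ( $er, $ec ) = copies_to_reach( $entries, \&grow_exponential );
    printf "%7d entries: +512 -> %5d reallocs, %.3g copied; x1.5 -> %2d reallocs, %.3g copied\n",
        $entries, $lr, $lc, $er, $ec;
}
```

    In this model the +512 policy copies a number of entries proportional to the square of the final size, while the x1.5 policy copies only a small constant multiple of it, which is what "amortized constant time" means here. A single large up-front allocation, as in the pre-heat case, skips the loop entirely.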
