PerlMonks  

Re: Is foreach split Optimized?

by Laurent_R (Canon)
on Jul 09, 2017 at 11:56 UTC (#1194601)


in reply to Is foreach split Optimized? (Update: No.)

Hi haukex,

You might want to also try using a file handle opened on a reference to your string:

    filehandle => sub {
        my @lines;
        open my $str_fh, "<", \$str or die "cannot open fh $!";
        while (<$str_fh>) {
            chomp;
            s/o/i/g;
            push @lines, $_;
        }
    },
This appears to be faster than your other subs, including the one with split (note that I had to change the count value for the result to be meaningful):
    $ perl bench_split.pl
               s/iter regex index split filehandle
    regex        3.44    --   -9%  -36%       -48%
    index        3.13   10%    --  -29%       -43%
    split        2.21   56%   42%    --       -19%
    filehandle   1.78   93%   76%   24%         --
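The bench_split.pl script itself is not reproduced in this reply, so here is a minimal sketch of how the split and filehandle subs could be compared with the core Benchmark module. The test string, the iteration count, and the body of the split sub are assumptions made for illustration, not the original benchmark.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 10_000 short lines, each containing the letter "o".
my $str = join "", map { "foo bar $_\n" } 1 .. 10_000;

my %subs = (
    # Assumed shape of the split-based sub: split on newlines, edit a copy.
    split => sub {
        my @lines;
        for my $line (split /\n/, $str) {
            (my $copy = $line) =~ s/o/i/g;
            push @lines, $copy;
        }
        return \@lines;
    },
    # The filehandle approach: read line by line from an in-memory handle.
    filehandle => sub {
        my @lines;
        open my $str_fh, "<", \$str or die "cannot open fh $!";
        while (<$str_fh>) {
            chomp;
            s/o/i/g;
            push @lines, $_;
        }
        return \@lines;
    },
);

# Sanity check: both subs must produce the same lines before timing them.
my $split_out = join "\n", @{ $subs{split}->() };
my $fh_out    = join "\n", @{ $subs{filehandle}->() };
die "results differ" unless $split_out eq $fh_out;

# Run each sub 50 times and print a comparison table.
cmpthese(50, \%subs);
```

With a count this small Benchmark may warn about too few iterations; raising the count, or passing a negative count to run for a minimum number of CPU seconds, gives steadier numbers.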

Re^2: Is foreach split Optimized?
by haukex (Bishop) on Jul 09, 2017 at 12:18 UTC

    I didn't think of that one, thanks! What version of Perl are you using? Unfortunately on my machine with v5.24.1, the filehandle method is still slightly slower than split:

    5.024001
                 Rate regex index filehandle split
    regex      9.65/s    --   -7%       -34%  -39%
    index      10.4/s    8%    --       -29%  -34%
    filehandle 14.6/s   51%   40%         --   -7%
    split      15.7/s   63%   51%         8%    --

    I haven't yet gotten around to experimenting with other versions of Perl.

      This was with version 5.14, using Cygwin. I will try with a more recent version under Windows.

      Update: This is very strange. Running the exact same program on the same computer, but under native Windows with perl 5, version 24, subversion 1 (v5.24.1) and from a bash console, I get awfully bad results for the filehandle solution:

                 s/iter filehandle regex index split
      filehandle    100         --  -97%  -97%  -98%
      regex        3.50      2758%    --  -14%  -42%
      index        3.02      3217%   16%    --  -33%
      split        2.02      4868%   74%   50%    --
      I can only suspect that something goes wrong in the handling of Windows end-of-line character pairs (CR LF).
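One way to probe this suspicion would be to look at which I/O layers Perl applies to the in-memory handle on each platform. The following is only a sketch using the built-in PerlIO::get_layers function. The sample string is made up, and the idea that a CRLF-translating layer is responsible for the slowdown is an unconfirmed assumption.

```perl
use strict;
use warnings;

my $str = "foo\nbar\nbaz\n";    # hypothetical sample data

# Default open on a reference to the string, as in the benchmark sub.
open my $str_fh, "<", \$str or die "cannot open fh $!";
my @layers = PerlIO::get_layers($str_fh);
print "default layers: @layers\n";

# Same open, but explicitly requesting :raw so no end-of-line translation
# is applied (assumption: this is the relevant difference on Windows).
open my $raw_fh, "<:raw", \$str or die "cannot open fh $!";
my @raw_layers = PerlIO::get_layers($raw_fh);
print "layers with :raw: @raw_layers\n";

# Read through the raw handle to confirm it still yields one item per line.
my @raw_lines = <$raw_fh>;
print "lines read: ", scalar(@raw_lines), "\n";
```

If the two layer lists differ on the slow setup, rerunning the benchmark with the "<:raw" open mode would show whether the layers account for the difference.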

      Still under Windows, same program, from a DOS window, with version v5.16.3:

                 s/iter regex index split filehandle
      regex        3.69    --   -2%  -29%       -39%
      index        3.63    2%    --  -27%       -38%
      split        2.63   40%   38%    --       -15%
      filehandle   2.25   64%   61%   17%         --
      I also tried (again on Cygwin with Perl version v5.14.4) this additional sub:
      filehandle2 => sub {
          open my $str_fh, "<", \$str or die "cannot open fh $!";
          my @lines = <$str_fh>;
          for (@lines) {
              chomp;
              s/o/i/g;
          }
      },
      but the result is not as good as with the first filehandle solution (though still better than the others):
      $ perl bench_split.pl
                  s/iter regex index split filehandle2 filehandle
      regex         3.44    --  -10%  -35%        -45%       -48%
      index         3.11   11%    --  -28%        -39%       -42%
      split         2.24   53%   39%    --        -16%       -20%
      filehandle2   1.89   82%   65%   19%          --        -5%
      filehandle    1.80   92%   73%   25%          5%         --
      Finally, I also tried this solution with a map:
      filehandle3 => sub {
          open my $str_fh, "<", \$str or die "cannot open fh $!";
          my @lines = map { chomp; s/o/i/g } <$str_fh>;
      },
      and thought it might be faster, but this turns out to be slower than all the other solutions:
      $ perl bench_split.pl
                  s/iter filehandle3 regex index split filehandle
      filehandle3   3.64          --   -7%  -16%  -39%       -51%
      regex         3.38          8%    --   -9%  -34%       -47%
      index         3.07         18%   10%    --  -28%       -42%
      split         2.22         64%   52%   38%    --       -20%
      filehandle    1.77        105%   90%   73%   25%         --
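One thing to watch with the map version: the last expression in the block is s/o/i/g, which returns the number of substitutions made (or an empty string when nothing matched), so @lines ends up holding counts rather than the modified lines. Below is a sketch of a variant that returns the lines themselves by ending the block with $_. The sample data is made up, and this variant was not benchmarked in this thread.

```perl
use strict;
use warnings;

my $str = "foo\nboo\nbar\n";    # hypothetical sample data

# The map block as used in filehandle3: collects the return value of s///.
open my $str_fh, "<", \$str or die "cannot open fh $!";
my @counts = map { chomp; s/o/i/g } <$str_fh>;

# Variant ending the block with $_, so the modified line is collected.
open $str_fh, "<", \$str or die "cannot open fh $!";
my @lines = map { chomp; s/o/i/g; $_ } <$str_fh>;

print "counts: @counts\n";
print "lines:  @lines\n";
```

On Perl 5.14 and later, the non-destructive form s/o/i/gr would be another way to get the modified string back from the block.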

      Here's some output on many Perls :)

      perlbrew exec bench_script.pl

      perl-5.10.1
      ==========
                   Rate regex index split filehandle
      regex      3.95/s    --   -7%  -40%       -53%
      index      4.27/s    8%    --  -35%       -50%
      split      6.54/s   66%   53%    --       -23%
      filehandle 8.46/s  114%   98%   29%         --

      perl-5.12.5
      ==========
                   Rate index regex split filehandle
      index      3.96/s    --  -10%  -39%       -49%
      regex      4.41/s   11%    --  -32%       -43%
      split      6.50/s   64%   47%    --       -16%
      filehandle 7.69/s   94%   74%   18%         --

      perl-5.14.4
      ==========
                   Rate index regex split filehandle
      index      3.67/s    --  -14%  -47%       -49%
      regex      4.25/s   16%    --  -38%       -41%
      split      6.86/s   87%   62%    --        -5%
      filehandle 7.25/s   97%   71%    6%         --

      perl-5.16.3
      ==========
                   Rate index regex filehandle split
      index      3.23/s    --  -19%       -45%  -50%
      regex      3.98/s   23%    --       -32%  -38%
      filehandle 5.83/s   81%   46%         --   -9%
      split      6.44/s  100%   62%        10%    --

      perl-5.18.4
      ==========
                   Rate index regex split filehandle
      index      3.62/s    --   -1%  -45%       -46%
      regex      3.65/s    1%    --  -44%       -46%
      split      6.57/s   82%   80%    --        -3%
      filehandle 6.76/s   87%   85%    3%         --

      perl-5.20.3
      ==========
                   Rate index regex split filehandle
      index      3.50/s    --  -10%  -44%       -45%
      regex      3.90/s   11%    --  -38%       -38%
      split      6.25/s   79%   60%    --        -1%
      filehandle 6.31/s   80%   62%    1%         --

      perl-5.22.3
      ==========
                   Rate index regex split filehandle
      index      3.62/s    --   -0%  -38%       -48%
      regex      3.64/s    0%    --  -38%       -47%
      split      5.88/s   63%   62%    --       -15%
      filehandle 6.90/s   91%   90%   17%         --

      perl-5.24.1
      ==========
                   Rate regex index filehandle split
      regex      2.97/s    --  -21%       -42%  -59%
      index      3.77/s   27%    --       -26%  -47%
      filehandle 5.08/s   71%   35%         --  -29%
      split      7.18/s  142%   90%        41%    --

      perl-5.26.0
      ==========
                   Rate index regex split filehandle
      index      3.00/s    --  -25%  -49%       -53%
      regex      3.98/s   33%    --  -33%       -37%
      split      5.91/s   97%   49%    --        -6%
      filehandle 6.31/s  111%   59%    7%         --

        Thank you Steve and Laurent! It seems that I was testing on one of the few versions of Perl where the filehandle version is a bit slower. It does make sense that the filehandle version is faster, considering that it doesn't actually split the string (and $/ being a fixed string instead of a regex might contribute a little bit). So I think the filehandle version is clearly the best :-)
