PerlMonks  

Re: Performance problems on splitting long strings

by Kenosis (Priest)
on Jan 30, 2014 at 20:02 UTC ( [id://1072722]=note )


in reply to Performance problems on splitting long strings

Just fyi:

use strict;
use warnings;
use Tie::CharArray;
use Benchmark qw/cmpthese/;

my $string = join '', 'A' .. 'Y';

sub _unpack {
    my @arr = unpack '(A5)*', $string;
}

sub _regex {
    my @arr = $string =~ /.{5}/g;
}

sub _split {
    my @arr = split /.{5}\K/, $string;
}

sub _substr {
    my @arr;
    for ( my $i = 0 ; $i < length $string ; $i += 5 ) {
        push @arr, substr $string, $i, 5;
    }
}

sub _open {
    my @arr;
    open my $sh, '<', \$string;
    while ( read $sh, my $chars, 5 ) {
        push @arr, $chars;
    }
}

cmpthese(
    -5,
    {
        _unpack => sub { _unpack() },
        _regex  => sub { _regex() },
        _split  => sub { _split() },
        _substr => sub { _substr() },
        _open   => sub { _open() }
    }
);

Output:

            Rate   _open  _regex _substr  _split _unpack
_open   265986/s      --    -53%    -55%    -57%    -70%
_regex  563780/s    112%      --     -5%     -8%    -36%
_substr 593788/s    123%      5%      --     -3%    -33%
_split  612001/s    130%      9%      3%      --    -31%
_unpack 881949/s    232%     56%     49%     44%      --

Re^2: Performance problems on splitting long strings
by SimonPratt (Friar) on Jan 31, 2014 at 15:39 UTC

    Borrowing heavily from Kenosis' code (thanks), regex seems to be faster than unpack (at least using substitution):

                 Rate _substr _unpack  _regex  _split
    _substr 2187335/s      --    -11%    -16%    -20%
    _unpack 2457294/s     12%      --     -6%    -10%
    _regex  2612321/s     19%      6%      --     -4%
    _split  2726283/s     25%     11%      4%      --

    Perl code:

    use strict;
    use warnings;
    use Benchmark qw/cmpthese/;

    my $string = join '', 'A' .. 'Y';

    sub _unpack {
        my @arr = unpack '(A5)*', $string;
    }

    sub _regex {
        my @arr;
        while (length $string){
            $string =~ s/^(.{5})//;
            push @arr, $1;
        }
    }

    sub _split {
        my @arr = split /.{5}\K/, $string;
    }

    sub _substr {
        my @arr;
        for ( my $i = 0 ; $i < length $string ; $i += 5 ) {
            push @arr, substr $string, $i, 5;
        }
    }

    cmpthese(
        -5,
        {
            _unpack => sub { _unpack() },
            _split  => sub { _split() },
            _substr => sub { _substr() },
            _regex  => sub { _regex() }
        }
    );

      Your benchmark is totally broken.

      When your _regex() function runs the first time, it completely destroys $string; every time after that, the regex operates on an empty string and thus runs very quickly. Ditto every other test that runs after the first run of _regex().
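
The effect described above can be checked with a small sketch. It assumes the same 25-character input as the posts in this thread; the names regex_destructive, regex_on_copy and $original are made up for illustration and do not appear in either benchmark. The first sub repeats the substitution loop on the shared $string, while the second consumes a private copy, which is one possible way to keep the input intact between calls.

```perl
use strict;
use warnings;

# Same 25-character input as in the thread; $original is never modified.
my $original = join '', 'A' .. 'Y';
my $string   = $original;

# The substitution loop from the reply: s/// eats the shared $string.
sub regex_destructive {
    my @arr;
    while ( length $string ) {
        $string =~ s/^(.{5})//;
        push @arr, $1;
    }
    return scalar @arr;    # number of chunks produced by this call
}

# Hypothetical fix: work on a private copy of the argument instead.
# Like the loop above, this assumes the length is a multiple of 5.
sub regex_on_copy {
    my ($copy) = @_;
    my @arr;
    while ( length $copy ) {
        $copy =~ s/^(.{5})//;
        push @arr, $1;
    }
    return scalar @arr;
}

# Call each version three times and record how many chunks came back.
my @destructive = map { regex_destructive() } 1 .. 3;
my @copying     = map { regex_on_copy($original) } 1 .. 3;

print "destructive: @destructive\n";    # prints "destructive: 5 0 0"
print "on copy:     @copying\n";        # prints "on copy:     5 5 5"
```

Only the first destructive call does any real work. Copying the string adds a small cost of its own to every iteration, so timings for the copying version are not directly comparable with the non-copying subs unless they are given the same overhead.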


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
