
Re: Efficient walk/iterate along a string

by BrowserUk (Pope)
on Nov 22, 2010 at 23:20 UTC (#873073)


in reply to Efficient walk/iterate along a string

The fastest way I've found of accessing the characters of a string is to use chop.

Even if you need to copy the string to avoid destroying it, and reverse it to get the characters in the right order, it still comes out substantially quicker than any other method I've tried.

If you can avoid both the copy and the reverse it gets much faster still, but that's not easy to benchmark due to the destructive process. But if you are reading the records one at a time from a file, you're probably going to overwrite the record at each iteration, so that often doesn't matter in a real application.
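As a rough illustration of the record-at-a-time case, here is a minimal sketch that consumes each record directly with chop, with no copy and no reverse. The sample data and the in-memory filehandle are assumptions standing in for a real input file, and the loop tests length rather than chop's return value so that a '0' character cannot end it early. Characters arrive last to first, which only suits order-independent work such as counting.

```perl
use strict;
use warnings;

# Hypothetical sample data; an in-memory file stands in for a real one.
my $data = "ACGT\nAACC\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %count;
while ( my $line = <$fh> ) {
    chomp $line;
    # $line is overwritten on the next read, so destroying it here is harmless.
    while ( length $line ) {
        my $c = chop $line;    # chop returns the character it removed
        $count{$c}++;
    }
}
close $fh;

print "$_=$count{$_}\n" for sort keys %count;
```

If the order of the characters matters, the record still has to be reversed first, which gives up part of the saving.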

    #! perl -slw
    use strict;
    use Benchmark qw[ cmpthese ];

    our $string = 'x'x5000;

    cmpthese -1, {
        substr => q[
            for ( 0..length( $string)-1 ) {
                my $c = substr $string, $_, 1;
            }
        ],
        splitArray => q[
            my @c=split'',$string;
            for( 0 .. $#c ){
                my $c = $c[$_];
            }
        ],
        splitFor => q[
            for( split'', $string ){
                my $c = $_;
            }
        ],
        unpack => q[
            for( unpack 'C*', $string ) {
                my $c = chr;
            }
        ],
        reverseChop => q[
            my $s = reverse $string;
            my $c;
            $c = $_ while chop $s;
        ],
        chop => q[
            my $s = $string;
            my $c;
            $c = $_ while chop $s;
        ],
        ramfile => q[
            open my $ram, '<', \$string;
            my $c;
            $c = $_ while $_ = getc( $ram );
        ],
    };
    __END__
    C:\test>873068
                  Rate splitArray splitFor ramfile substr unpack reverseChop chop
    splitArray   169/s         --     -48%    -74%   -81%   -83%        -89% -89%
    splitFor     323/s        91%       --    -51%   -64%   -67%        -79% -79%
    ramfile      654/s       288%     103%      --   -27%   -34%        -57% -58%
    substr       891/s       428%     176%     36%     --    -9%        -42% -43%
    unpack       984/s       484%     205%     51%    10%     --        -36% -37%
    reverseChop 1534/s       809%     376%    135%    72%    56%          --  -1%
    chop        1555/s       822%     382%    138%    74%    58%          1%   --

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: Efficient walk/iterate along a string
by moritz (Cardinal) on Nov 23, 2010 at 12:54 UTC
    Just a quick warning: if the string contains a '0' character, the chop solution stops as soon as one is found, because chop returns '0', which is false. Also, chop does not set $_ to the removed character.

    A version that fixes both problems, and is still faster than unpack, is:

        my $s = reverse $string;
        my $c;
        $c = chop $s while length $s;
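To make the '0' problem concrete, here is a small sketch comparing the two loop conditions on a string that contains a zero. The sub names walk_bare and walk_guarded are made up for this illustration, and both use chop's return value rather than $_.

```perl
use strict;
use warnings;

# Loop on chop's return value: ends early when chop returns '0'.
sub walk_bare {
    my ($s) = @_;
    my ( @seen, $c );
    push @seen, $c while $c = chop $s;
    return @seen;
}

# Loop on the remaining length: sees every character.
sub walk_guarded {
    my ($s) = @_;
    my @seen;
    push @seen, chop $s while length $s;
    return @seen;
}

my $string  = '1203';
my @bare    = walk_bare( scalar reverse $string );
my @guarded = walk_guarded( scalar reverse $string );

print "bare:    @bare\n";       # bare:    1 2
print "guarded: @guarded\n";    # guarded: 1 2 0 3
```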

      True, but the extra opcode (length) imposes a 30% hit.

      Of course that diminishes if you're doing anything useful within the loop, but it is still significant for those situations where using such an obscure mechanism is worth considering.

      Luckily, genomic data doesn't usually contain zeros or nulls.

      In the same way that in 5.12 you can use each on arrays, I think it would be really nice to be able to use it on scalars:

          my $string = 'fred';
          my( $i, $c );
          say "$i:$c" while ($i,$c) = each $string;
          0:f
          1:r
          2:e
          3:d

          say while $_ = each $string;
          f
          r
          e
          d

      I think that could be made very efficient by aliasing a LvTARG to the characters in situ; and would be very useful.

      IMO that would be far more useful than the single-character saving of each $arrayRef over each @$arrayRef, which unfortunately probably means that it could not now be implemented :(
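Since each on a scalar does not exist, the closest thing available today is probably a closure. The following is only a sketch of the index/character interface described above: string_iter is a hypothetical helper, not a core function, and it uses substr on a copy of the string, so it has none of the in-situ aliasing efficiency the proposal is after.

```perl
use strict;
use warnings;

# string_iter is a hypothetical helper emulating the proposed each-on-a-string.
# It returns a closure that yields (index, character) pairs, then an empty list.
sub string_iter {
    my ($string) = @_;
    my $i = 0;
    return sub {
        return if $i >= length $string;    # empty list ends the caller's loop
        my @pair = ( $i, substr( $string, $i, 1 ) );
        $i++;
        return @pair;
    };
}

my $next = string_iter('fred');
my @lines;
# List assignment in the condition counts the returned items, so a '0'
# character or index 0 does not end the loop.
while ( my ( $i, $c ) = $next->() ) {
    push @lines, "$i:$c";
}
print "$_\n" for @lines;
```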

