Re: Re: Re: Re: string manipulation

by extremely (Priest)
on Mar 30, 2001 at 03:31 UTC ( #68234=note )

in reply to Re: Re: Re: string manipulation
in thread string manipulation

Post your code for this Benchmark and I'll show you where it went wrong. I'd bet on the variable you tested against not being in scope inside the benchmark sub/evals.
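To illustrate the scoping problem I'm guessing at, here is a minimal sketch. `run_code_string` is a hypothetical stand-in for what a benchmarking module does with a code string (compile it with eval somewhere outside your file's lexical scope); it is not Benchmark's actual code, and the strictures-off behaviour is an assumption of the sketch. The point is only that a `my` variable is invisible to such an eval, while a package variable is not.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a benchmarking module: it compiles the code
# string with eval in its own scope, inside the caller's package.
sub run_code_string {
    my ($code) = @_;
    no strict 'vars';   # assumption: the eval'd string runs without strictures
    no warnings;        # keep the demo quiet when the variable is undefined
    return eval "package main; $code";
}

# Declared after the sub above, so the eval inside it cannot see this lexical.
my $lexical = 'foo-bar-baz';

# Package variable ($main::global), reachable from the eval'd string.
our $global = 'foo-bar-baz';

# tr/-// with an empty replacement just counts the hyphens it finds.
my $seen_lexical = run_code_string('($lexical =~ tr/-//)');
my $seen_global  = run_code_string('($global  =~ tr/-//)');

print "hyphens seen via lexical: $seen_lexical\n";
print "hyphens seen via global:  $seen_global\n";
```

If the first count comes back as zero, the timed code was working on an empty package variable rather than on your test string, which would make any comparison meaningless.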

My results, Linux on an IBM Netfinity (Intel)

Benchmark: running regexp, transl, each for at least 10 CPU seconds...
    regexp: 10 wallclock secs (10.59 usr +  0.00 sys = 10.59 CPU) @ 37274.69/s (n=394739)
    transl: 13 wallclock secs (10.46 usr +  0.00 sys = 10.46 CPU) @ 313981.07/s (n=3284242)
           Rate regexp transl
regexp  37275/s     --   -88%
transl 313981/s   742%     --

That is a significant differential there for this simple task. A full regexp engine is a big thing to throw at a lightweight string scan. My benchmark code follows:

use strict;
use Benchmark qw(cmpthese);
use vars qw( $x );

$x = 'This-is-a-test-string-I-just-typed-in-for-fun';

cmpthese (-10, {
    'transl' => '$x =~ tr/-/_/; $x =~ tr/_/-/;',
    'regexp' => '$x =~ s/-/_/g; $x =~ s/_/-/g;',
} );

Oh yeah, I sure am happy Benchmark exists too. =)

Doh! Update: that assignment was:
$x = 'This_is_a_test_string_I_just_typed_in_for_fun';
It didn't affect the results; it was just a dumb choice, since it no-ops half my test. Interestingly, if I change the string to one with spaces rather than '-' or '_', I wind up with regexp being 50-60% faster at doing nothing but scanning with no changes...
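A quick way to catch a test string that no-ops the benchmark is to look at what the operators return. This is only a sketch; `changes_made` is a helper name I made up for it, and the sample strings are just examples. Both `tr` and `s///g` report how many characters they changed, so a zero means the timed code is only scanning.

```perl
use strict;
use warnings;

# Report how many characters each operator changes in a copy of the input,
# plus whether both operators end up with the same string.
sub changes_made {
    my ($string) = @_;
    my $tr_copy = $string;
    my $s_copy  = $string;
    my $tr_hits = ($tr_copy =~ tr/-/_/);       # tr returns the number of characters translated
    my $s_hits  = ($s_copy  =~ s/-/_/g) || 0;  # s///g returns the substitution count, or false if none
    return ($tr_hits, $s_hits, $tr_copy eq $s_copy);
}

for my $candidate ('This-is-a-test-string', 'This_is_a_test_string', 'for bar baz') {
    my ($tr_hits, $s_hits, $same) = changes_made($candidate);
    printf "%-24s tr=%d s=%d same_result=%s\n",
        $candidate, $tr_hits, $s_hits, $same ? 'yes' : 'no';
}
```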

$you = new YOU;
honk() if $you->love(perl)

Replies are listed 'Best First'.
Re: Re: Re: Re: Re: string manipulation
by Xxaxx (Monk) on Mar 30, 2001 at 03:56 UTC
    Hey Extremely, Thanks for taking a look at the code and letting me know where the benchmark may have messed up:

    #!/usr/local/bin/perl -w
    use strict;
    use Benchmark;

    my $count = 500000;

    ## Method number one
    sub One {
        my $data = 'for bar baz';
        my($outstring);
        ($outstring = $data) =~ tr/-/_/;
    }

    ## Method number two
    sub Two {
        my $data = 'for bar baz';
        my($outstring) = $data;
        $outstring =~ s/-/_/g;
    }

    ## We'll test each one, with simple labels
    timethese ( $count, {
        'Method One TR' => '&One',
        'Method Two S'  => '&Two',
    } );
    exit;

      Your test strings don't have any '-' hyphens in them. =) And you change assignment forms between the two subs which honestly shouldn't have much effect but is still questionable practice when benchmarking since you want the code identical in nature except for the key point you are testing...

      Also, since the $data is set in the sub it will be reset every pass. As well, you can just set up the benchmark like this and avoid having perl eval a sub call in a string:

      timethese ( $count, {
          'Method One S'  => sub { my $data = 'foo-bar-baz'; $data =~ s/-/_/g; },
          'Method Two TR' => sub { my $data = 'foo-bar-baz'; $data =~ tr/-/_/; },
      } );

      Not a big deal but it may save you some typing in the future. Your way has the benefit of being easy to run the subs once and test their output, tho...
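One way to get both benefits, as a rough sketch: keep named subs so you can call them once and check the output, then hand them to timethese as code refs instead of strings. The names `with_s` and `with_tr` and the iteration count are placeholders of mine, not anything from your script.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Identical setup in both subs; only the operator under test differs.
sub with_s  { my $data = 'foo-bar-baz'; $data =~ s/-/_/g; return $data; }
sub with_tr { my $data = 'foo-bar-baz'; $data =~ tr/-/_/; return $data; }

# Run each sub once and confirm it really changes the string before timing it.
for my $check (\&with_s, \&with_tr) {
    my $result = $check->();
    die "unexpected result '$result'" unless $result eq 'foo_bar_baz';
}

# Code refs avoid having perl eval a sub call from a string.
my $count = 200_000;   # placeholder; raise it for steadier numbers
timethese( $count, {
    'Method One S'  => \&with_s,
    'Method Two TR' => \&with_tr,
} );
```

Because the subs return the modified string, the sanity check costs one extra line per method and runs outside the timed loop.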

      BTW, I moronically goofed my test string too... =)

      $you = new YOU;
      honk() if $you->love(perl)
