PerlMonks  

Re^3: better (faster) way of writing regexp

by ikegami (Pope)
on Dec 02, 2009 at 18:07 UTC


in reply to Re^2: better (faster) way of writing regexp
in thread better (faster) way of writing regexp

I'm not sure why you have the regex engine check if each character is not a newline in isook. Use the "s" modifier!
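To illustrate what the "s" modifier changes, here is a minimal sketch. The input string is a made-up example (not from the thread) with a newline placed where a digit would normally be.

```perl
use strict;
use warnings;

# Hypothetical input: eight characters, one of which is a newline.
my $with_newline = "2009\n202";

# Without /s, "." refuses to match "\n", so there is no run of
# eight matchable characters and the match fails.
my $plain = $with_newline =~ /(....)(..)(..)/ ? 1 : 0;

# With /s, "." matches any character, newline included, so the
# engine has no per-character newline check to perform.
my $dotall = $with_newline =~ /(....)(..)(..)/s ? 1 : 0;

print "without /s: $plain, with /s: $dotall\n";
```

On the benchmark's input ('20091202') both forms match the same text; the difference is only in what the engine has to test for each dot.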

Interesting about pack. I heard the overhead to start the regex engine went up in 5.10, but it seems rather minor when put into perspective. ...except I can't replicate your results.

use strict;
use warnings;

use Benchmark qw(:all);

print("This is Perl $]\n");

my %tests = (
   repeat => 'my ($y,$m,$d) = $date =~ /(\\d\\d\\d\\d)(\\d\\d)(\\d\\d)/;',
   range  => 'my ($y,$m,$d) = $date =~ /(\\d{4})(\\d{2})(\\d{2})/;',
   isook  => 'my ($y,$m,$d) = $date =~ /(....)(..)(..)/s;',
   unpack => 'my ($y,$m,$d) = unpack "A4 A2 A2", $date;',
);

# These don't result in any opcodes.
$_ = 'use strict; use warnings; our $date; '.$_
   for values(%tests);

our $date = '20091202';

my $results = cmpthese(-3, \%tests);
This is Perl 5.010000
           Rate  range repeat  isook unpack
range  405773/s     --    -6%    -8%   -46%
repeat 432956/s     7%     --    -2%   -43%
isook  441233/s     9%     2%     --   -42%
unpack 757010/s    87%    75%    72%     --

This is Perl 5.010000
           Rate  range  isook repeat unpack
range  398141/s     --    -7%    -7%   -47%
isook  427913/s     7%     --    -0%   -43%
repeat 429311/s     8%     0%     --   -43%
unpack 751802/s    89%    76%    75%     --

This is Perl 5.010000
           Rate  range repeat  isook unpack
range  415595/s     --    -7%    -8%   -45%
repeat 445365/s     7%     --    -1%   -41%
isook  449974/s     8%     1%     --   -40%
unpack 754290/s    81%    69%    68%     --

The faster way seems to be using the capture made of dots as in "isook"

You haven't shown that. Any difference less than 5% should be ignored. It's within the error margin.
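As a rough sketch of that rule of thumb, the snippet below compares two rates and flags the gap as noise when it is under 5%. The helper names (relative_gain, is_meaningful) and the $THRESHOLD variable are hypothetical, not part of Benchmark; the rates are copied from the first run above.

```perl
use strict;
use warnings;

# Assumed noise threshold, taken from the 5% rule of thumb.
my $THRESHOLD = 0.05;

# Hypothetical helper: relative gain of $fast over $slow.
sub relative_gain {
    my ($slow, $fast) = @_;
    return ($fast - $slow) / $slow;
}

# Hypothetical helper: 1 when the gain exceeds the threshold, else 0.
sub is_meaningful {
    my ($slow, $fast) = @_;
    return relative_gain($slow, $fast) > $THRESHOLD ? 1 : 0;
}

# Rates in iterations per second, from the first run above.
my %rate = (repeat => 432956, isook => 441233, unpack => 757010);

for my $pair ([qw(repeat isook)], [qw(isook unpack)]) {
    my ($slow, $fast) = @$pair;
    printf "%s vs %s: %.1f%% (%s)\n",
        $fast, $slow,
        100 * relative_gain($rate{$slow}, $rate{$fast}),
        is_meaningful($rate{$slow}, $rate{$fast}) ? 'meaningful' : 'noise';
}
```

The order of "repeat" and "isook" even flips between the runs above, which is what you would expect from a gap that small.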


Re^4: better (faster) way of writing regexp
by JavaFan (Canon) on Dec 02, 2009 at 18:19 UTC
    # These don't result in any opcodes.
    $_ = "use strict; use warnings; our $date; $_"
       for values(%tests);
    Indeed no opcodes, as it doesn't compile. You're interpolating a variable that hasn't been introduced to perl yet.
      It doesn't later either. In other words, they're not executed during Benchmarking.

      Ah, I misread earlier. I had posted the wrong copy of the code. Fixed.
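The quoting issue raised in this sub-thread can be reproduced in isolation. This is a minimal sketch, assuming no $date has been declared at the point where the string is built; the string eval is only there to contain the compile error so the script keeps running.

```perl
use strict;
use warnings;

# Single quotes keep '$date' as literal text, so the snippet can be
# compiled later (by Benchmark) once $date exists.
my $snippet = 'our $date; length($date)';

# Double quotes interpolate $date when the string is built. Under
# strict, with no $date declared, that fails at compile time.
my $ok  = eval q{ my $s = "our $date; length($date)"; 1 };
my $err = $@;

print "single-quoted snippet: $snippet\n";
print "double-quoted version compiled: ", ($ok ? "yes" : "no"), "\n";
print "error: $err" if !$ok;
```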

Re^4: better (faster) way of writing regexp
by Marshall (Prior) on Dec 03, 2009 at 07:52 UTC
    Ikegami, I am curious if you can replicate my results with the substr idea plugged into your benchmark code? Update: the reason I ask is that I know you have a very fast 64 bit machine and there could be some differences between my much slower, older 32 bit machine.
    use strict;
    use warnings;

    use Benchmark qw(:all);

    print("This is Perl $]\n");

    my %tests = (
       repeat => 'my ($y,$m,$d) = $date =~ /(\\d\\d\\d\\d)(\\d\\d)(\\d\\d)/;',
       range  => 'my ($y,$m,$d) = $date =~ /(\\d{4})(\\d{2})(\\d{2})/;',
       isook  => 'my ($y,$m,$d) = $date =~ /(....)(..)(..)/s;',
       unpack => 'my ($y,$m,$d) = unpack "A4 A2 A2", $date;',
       substr => 'my $y = substr($date,0,4);my $m = substr($date,4,2);my $d = substr($date,6,2);'
    );

    # These don't result in any opcodes.
    $_ = 'use strict; use warnings; our $date; '.$_
       for values(%tests);

    our $date = '20091202';

    my $results = cmpthese(-3, \%tests);
    __END__
    This is Perl 5.010000
                Rate  range  isook repeat unpack substr
    range   151695/s     --    -7%    -8%   -54%   -85%
    isook   162964/s     7%     --    -1%   -50%   -84%
    repeat  165314/s     9%     1%     --   -49%   -84%
    unpack  326977/s   116%   101%    98%     --   -68%
    substr 1010101/s   566%   520%   511%   209%     --
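Before comparing speeds, it may be worth confirming that all five snippets extract the same fields. The sketch below wraps each approach in an anonymous sub (the %extract and %joined hashes are illustrative names, not part of the benchmark) and runs them on the same sample date.

```perl
use strict;
use warnings;

my $date = '20091202';    # same sample value as in the benchmark

# Each candidate returns (year, month, day) as a list.
my %extract = (
    repeat => sub { $_[0] =~ /(\d\d\d\d)(\d\d)(\d\d)/ },
    range  => sub { $_[0] =~ /(\d{4})(\d{2})(\d{2})/ },
    isook  => sub { $_[0] =~ /(....)(..)(..)/s },
    unpack => sub { unpack "A4 A2 A2", $_[0] },
    substr => sub {
        my ($s) = @_;
        return (substr($s, 0, 4), substr($s, 4, 2), substr($s, 6, 2));
    },
);

# Join the parts so the results are easy to compare by eye.
my %joined;
for my $name (sort keys %extract) {
    my @parts = $extract{$name}->($date);
    $joined{$name} = join '-', @parts;
    print "$name: $joined{$name}\n";
}
```

The approaches agree on well-formed input only. The \d-based patterns also reject non-digits, while the dot pattern, unpack and substr accept any characters, so the faster options trade away that validation.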

      I know you have a very fast 64 bit machine

      Dream on! My 32-bit machine doesn't even have a virtual second core from hyperthreading.

      I'm not sure what my work machine is, but it's also 32-bit.

      My earlier run was on my work machine. This is on my work machine too:

      This is Perl 5.010000
                  Rate  range repeat  isook unpack substr
      range   416206/s     --    -6%    -8%   -45%   -67%
      repeat  443742/s     7%     --    -1%   -41%   -65%
      isook   450082/s     8%     1%     --   -41%   -64%
      unpack  756739/s    82%    71%    68%     --   -40%
      substr 1257912/s   202%   183%   179%    66%     --

      This is Perl 5.010000
                  Rate  range repeat  isook unpack substr
      range   415726/s     --    -5%    -8%   -47%   -67%
      repeat  436462/s     5%     --    -4%   -44%   -65%
      isook   454041/s     9%     4%     --   -42%   -64%
      unpack  779486/s    88%    79%    72%     --   -38%
      substr 1262559/s   204%   189%   178%    62%     --

      (Threaded 32-bit build on linux)

      Interesting.

        Interesting...

        The ratios aren't the same as on my machine, but it appears that substr() is pretty quick on both of our machines.

        I don't know a Linux utility for this, but in the Windows world, CPU-Z http://www.cpuid.com/cpuz.php shows a lot of info about processors... It is hard for me to imagine that you don't have at least a multi-threaded processor given the raw speed of your machine for a single thread.

        If I run two "number cruncher" apps at once, the throughput is not 2x but more like 1.4x. I have old memory technology and my machine becomes memory bound. To Windows XP Pro, my machine looks pretty much like 2 CPUs, except that 1+1 != 2, only about 1.4! And of course when I do that, my computer turns into a "space heater"! When the winter gets colder, I run some BOINC project like seti@home on the theory that I might as well be doing something at least marginally useful while I am generating heat! Anyway, this is what I have (in the scheme of things, a Prescott is a wimp):

        Processor 1 (ID = 0)
        Number of cores      1 (max 1)
        Number of threads    2 (max 2)
        Name                 Intel Pentium 4
        Codename             Prescott
        Specification        Intel(R) Pentium(R) 4 CPU 3.00GHz
        Package              Socket 478 mPGA (platform ID = 2h)
        CPUID                F.4.1
        Extended CPUID       F.4
        Core Stepping        E0
        Technology           90 nm
        Core Speed           3015.1 MHz (15.0 x 201.0 MHz)
        Rated Bus speed      804.0 MHz
        Stock frequency      3000 MHz
        Instructions sets    MMX, SSE, SSE2, SSE3
        L1 Data cache        16 KBytes, 8-way set associative, 64-byte line size
        Trace cache          12 Kuops, 8-way set associative
        L2 cache             1024 KBytes, 8-way set associative, 64-byte line size
        FID/VID Control      no
