Re^4: better (faster) way of writing regexp

Ikegami, I am curious whether you can replicate my results with the substr idea plugged into your benchmark code. Update: the reason I ask is that I know you have a very fast 64 bit machine, and my much slower, older 32 bit machine might behave differently.

use strict;
use warnings;

use Benchmark qw(:all);

print("This is Perl $]\n");

my %tests = (
    repeat => 'my ($y,$m,$d) = $date =~ /(\\d\\d\\d\\d)(\\d\\d)(\\d\\d)/;',
    range  => 'my ($y,$m,$d) = $date =~ /(\\d{4})(\\d{2})(\\d{2})/;',
    isook  => 'my ($y,$m,$d) = $date =~ /(....)(..)(..)/s;',
    unpack => 'my ($y,$m,$d) = unpack "A4 A2 A2", $date;',
    substr => 'my $y = substr($date,0,4); my $m = substr($date,4,2); my $d = substr($date,6,2);'
);

# These don't result in any opcodes.
$_ = 'use strict; use warnings; our $date; '.$_
    for values(%tests);

our $date = '20091202';

my $results = cmpthese(-3, \%tests);
__END__
This is Perl 5.010000
            Rate  range  isook repeat unpack substr
range   151695/s     --    -7%    -8%   -54%   -85%
isook   162964/s     7%     --    -1%   -50%   -84%
repeat  165314/s     9%     1%     --   -49%   -84%
unpack  326977/s   116%   101%    98%     --   -68%
substr 1010101/s   566%   520%   511%   209%     --
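
Before reading too much into the rates, it may be worth checking that the five snippets actually do the same job. The following is only a sketch: it wraps each benchmarked snippet in a sub (the hash name %split_by is made up for this example) and runs all five against the benchmark date and against a non-numeric string of the same length.

```perl
use strict;
use warnings;

# The five benchmarked snippets, wrapped as subs that return ($y, $m, $d).
# %split_by is a hypothetical name chosen for this sketch.
my %split_by = (
    repeat => sub { return $_[0] =~ /(\d\d\d\d)(\d\d)(\d\d)/ },
    range  => sub { return $_[0] =~ /(\d{4})(\d{2})(\d{2})/ },
    isook  => sub { return $_[0] =~ /(....)(..)(..)/s },
    unpack => sub { return unpack 'A4 A2 A2', $_[0] },
    substr => sub {
        return (substr($_[0], 0, 4), substr($_[0], 4, 2), substr($_[0], 6, 2));
    },
);

# One well-formed date and one string that is the right length but not numeric.
for my $input ('20091202', 'abcdefgh') {
    for my $name (sort keys %split_by) {
        my @parts = $split_by{$name}->($input);
        printf "%-8s %-6s => %s\n",
            $input, $name, @parts ? "@parts" : '(no match)';
    }
}
```

For '20091202' all five give 2009 12 02. For 'abcdefgh' only the two \d-based regexes refuse to match; isook, unpack and substr happily return abcd ef gh. So the faster variants are only equivalent if the input is already known to be eight digits.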


Re^5: better (faster) way of writing regexp by ikegami (Patriarch) on Dec 03, 2009 at 17:42 UTC
"I know you have a very fast 64 bit machine"

Dream on! My 32-bit machine doesn't even have a virtual second core from hyperthreading. I'm not sure what my work machine is, but it's also 32-bit. My earlier run was on my work machine. This is on my work machine too:

This is Perl 5.010000
            Rate  range repeat  isook unpack substr
range   416206/s     --    -6%    -8%   -45%   -67%
repeat  443742/s     7%     --    -1%   -41%   -65%
isook   450082/s     8%     1%     --   -41%   -64%
unpack  756739/s    82%    71%    68%     --   -40%
substr 1257912/s   202%   183%   179%    66%     --

This is Perl 5.010000
            Rate  range repeat  isook unpack substr
range   415726/s     --    -5%    -8%   -47%   -67%
repeat  436462/s     5%     --    -4%   -44%   -65%
isook   454041/s     9%     4%     --   -42%   -64%
unpack  779486/s    88%    79%    72%     --   -38%
substr 1262559/s   204%   189%   178%    62%     --

(Threaded 32-bit build on Linux.) Interesting.
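
A note on comparing runs across machines: the raw rates depend on the hardware, so the percentage columns are the more useful thing to compare. The sketch below recomputes two of them from rates copied out of this thread. The helper name percent_faster is made up for the example, and the formula (100 * row / column - 100, rounded to a whole percent) is an assumption that happens to reproduce the printed cells.

```perl
use strict;
use warnings;

# Hypothetical helper: how much faster (in percent) $row_rate is than $col_rate.
# Assumption: same formula and rounding as the percentage cells in the tables.
sub percent_faster {
    my ($row_rate, $col_rate) = @_;
    return sprintf '%.0f%%', 100 * $row_rate / $col_rate - 100;
}

# Rates (iterations per second) copied from the tables in this thread.
my %rates = (
    'Marshall'       => { 'range' => 151695, 'unpack' => 326977, 'substr' => 1010101 },
    'ikegami (work)' => { 'range' => 416206, 'unpack' => 756739, 'substr' => 1257912 },
);

for my $machine (sort keys %rates) {
    my $r = $rates{$machine};
    printf "%-15s substr vs range: %5s, substr vs unpack: %5s\n",
        $machine,
        percent_faster($r->{'substr'}, $r->{'range'}),
        percent_faster($r->{'substr'}, $r->{'unpack'});
}
```

This gives 566% and 209% for Marshall's numbers against 202% and 66% for the first run here, so substr wins on both machines but by a very different margin.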
Re^6: better (faster) way of writing regexp by Marshall (Canon) on Dec 03, 2009 at 22:28 UTC
Interesting... The ratios aren't the same as on my machine, but it appears that substr() is pretty quick on both of our machines.

I don't know a Linux utility for this, but in the Windows world, CPU-Z (http://www.cpuid.com/cpuz.php) shows a lot of info about processors. It is hard for me to imagine that you don't have at least a multi-threaded processor, given the raw speed of your machine for a single thread.

If I run two "number cruncher" apps at once, the performance is not 2x, but more like 1.4x. I have old memory technology and my machine becomes memory bound. To Windows XP Pro, my machine looks pretty much like 2 CPUs, except that 1+1 != 2, only about 1.4! And of course when I do that, my computer turns into a "space heater"! When the winter gets colder, I run some BOINC project like seti@home on the theory that I might as well be doing something at least marginally useful while I am generating heat!

Anyway, this is what I have (in the scheme of things, a Prescott is a wimp):

Processor 1 (ID = 0)
Number of cores         1 (max 1)
Number of threads       2 (max 2)
Name                    Intel Pentium 4
Codename                Prescott
Specification           Intel(R) Pentium(R) 4 CPU 3.00GHz
Package                 Socket 478 mPGA (platform ID = 2h)
CPUID                   F.4.1
Extended CPUID          F.4
Core Stepping           E0
Technology              90 nm
Core Speed              3015.1 MHz (15.0 x 201.0 MHz)
Rated Bus speed         804.0 MHz
Stock frequency         3000 MHz
Instructions sets       MMX, SSE, SSE2, SSE3
L1 Data cache           16 KBytes, 8-way set associative, 64-byte line size
Trace cache             12 Kuops, 8-way set associative
L2 cache                1024 KBytes, 8-way set associative, 64-byte line size
FID/VID Control         no
Re^7: better (faster) way of writing regexp by ikegami (Patriarch) on Dec 03, 2009 at 22:38 UTC
I "win"! :) My machine, not the one I ran the benchmark on:

Number of cores         1 (max 1)
Number of threads       1 (max 1)
Name                    Intel Pentium 4
Codename                Northwood
Specification           Intel(R) Pentium(R) 4 CPU 2.66GHz
Package (platform ID)   Socket 478 mPGA (0x2)
CPUID                   F.2.7
Extended CPUID          F.2
Brand ID                9
Core Stepping           C1
Technology              0.13 um
Core Speed              2004.5 MHz
Multiplier x FSB        20.0 x 100.2 MHz
Rated Bus speed         400.9 MHz
Stock frequency         2666 MHz
Instructions sets       MMX, SSE, SSE2
L1 Data cache           8 KBytes, 4-way set associative, 64-byte line size
Trace cache             12 Kuops, 8-way set associative
L2 cache                512 KBytes, 8-way set associative, 64-byte line size
FID/VID Control         no

From that machine:

This is Perl 5.010001
            Rate  range repeat  isook unpack substr
range   117311/s     --    -6%    -7%   -24%   -81%
repeat  124414/s     6%     --    -2%   -20%   -80%
isook   126456/s     8%     2%     --   -18%   -79%
unpack  155069/s    32%    25%    23%     --   -74%
substr  607361/s   418%   388%   380%   292%     --

I'm scared of looking at the stats of my tv computer (aka Frankenputer). It's some kind of P3, for starters.

