PerlMonks  

Re: Re: Re: Efficient run determination.

by BrowserUk (Pope)
on Nov 15, 2002 at 10:14 UTC (#213104)


in reply to Re: Re: Efficient run determination.
in thread Efficient run determination.

Removed by author.

Replies are listed 'Best First'.
Re: Re: Re: Re: Efficient run determination.
by PhiRatE (Monk) on Nov 15, 2002 at 11:03 UTC
    As I noted, C is the language for the job. It doesn't really matter how efficient the regex solution is: C can work on the data in place without incurring much in the way of dynamic allocation costs. You have mitigated this disadvantage to an extent in your version by placing certain limitations on the algorithm, notably the 500-unit capture maximum. This changes the situation considerably. Unfortunately, the design you chose still isn't competitive :/ I suspect this is primarily because using the regex engine like this doesn't hit an optimised path, although that is conjecture since I'm not familiar with the internals.

    Length: 1920
    Enlil 2:    0.93203s
    Dingus 1:   0.530455s
    Dingus 2:   0.537765s
    Rasta 1:    1.973259s
    TommyW 1:   0.87257s
    Robartes 1: 0.996671s
    PhiRatE:    0.232084s
    BrowserUk:  2.764623s
    

    edit: note that this is for 100 iterations

    With the code done like this:

    # BrowserUk
    $t0 = [gettimeofday];
    #! Set up big regex. 1-time hit.
    my $re = '(?:(.)(??{"$+*"}))?' x 500;
    $re = qr/$re/o;
    for (1..100) { @res = browseruk($stn); }
    print "BrowserUk: ".tv_interval( $t0 )."\n";

    sub browseruk {
        $_ = shift;
        my @c = m/$re/;    #! THIS LINE DOES ALL THE WORK.
        #! This truncates the list to exclude null matches returned from regex.
        $#c = $#- -1;
        return \@c;
    }
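    The truncation at the end of browseruk leans on the special arrays @- and @+: $#+ is the number of capture groups in the pattern, while $#- is the number of the last group that actually matched. A scaled-down sketch of just that step follows. The four-group pattern and the input "ab" are made up for illustration and are not the regex from the post.

```perl
use strict;
use warnings;

# Four optional groups stand in for the 500 in the post; only the first
# two can match the input "ab".
my @c = ("ab" =~ /(a)?(b)?(c)?(d)?/);

my $group_count  = $#+;   # number of capture groups in the pattern
my $last_matched = $#-;   # number of the last group that matched

# In list context the match returns one slot per group, so the slots for
# groups that did not take part are undef.
my $before = scalar @c;

$#c = $last_matched - 1;  # same truncation idea as in the post

print "groups: $group_count, last matched: $last_matched\n";
print "elements before: $before, after: ", scalar(@c), " (@c)\n";
```

    Without the truncation, every call would hand back a list padded with undef up to the full number of groups.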

    I suspected that the startup cost of generating the regex might be causing the performance problem, so I also ran a 1000-iteration test, comparing your entry against only mine.

    Length: 1920
    PhiRatE:    2.241276s
    BrowserUk: 28.384692s
    

    Note that you got similar results. I'm running 100 to 1000 iterations of the same line, whereas you only ran one :)

    My recommendation is to go either with the Inline C one, if you really need the speed, or with Dingus' 2nd variant, which is the cleanest, closest Perl variant.
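    For anyone who wants to see what that variant produces, here is a minimal sketch of the same scan on a short string. The sub name runs and the sample input are illustrative only; the loop body follows dingus_2 as it appears in the benchmark program in this thread.

```perl
use strict;
use warnings;

# Each match of (.)\1* consumes one whole run of a repeated character,
# and pos() reports where that run ended.
sub runs {
    my $string = shift;
    my (@res, $i);
    $i = 0;
    while ($string =~ /(.)\1*/g) {
        push @res, [$1, $i, pos($string) - $i];   # [char, start, length]
        $i = pos($string);
    }
    return \@res;
}

my $result = runs("aaabccd");
print join(" ", map { sprintf("[%s,%d,%d]", @$_) } @$result), "\n";
# prints: [a,0,3] [b,3,1] [c,4,2] [d,6,1]
```

    Note that . does not match a newline by default, so input containing newlines would need the /s modifier.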

      Removed by author.
        The startup cost is only incurred once, and leaving it out makes no difference: the timing is the same. The startup cost is minuscule compared with the cost of the rest of the process.
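        One way to check this kind of claim is to time the one-time setup and the repeated work separately. The sketch below does that with Time::HiRes. build_pattern and find_runs are hypothetical stand-ins for the setup and the per-iteration work, not any of the entries benchmarked here, and the input is made up.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical stand-ins: build_pattern() plays the role of the one-time
# setup, find_runs() the role of the work repeated on every iteration.
sub build_pattern { return qr/(.)\1*/s }

sub find_runs {
    my ($re, $string) = @_;
    my @runs;
    my $start = 0;
    while ($string =~ /$re/g) {
        push @runs, [$1, $start, pos($string) - $start];
        $start = pos($string);
    }
    return \@runs;
}

my $input      = "aaabccd" x 200;
my $iterations = 100;

# Time the setup on its own ...
my $t_setup    = [gettimeofday];
my $re         = build_pattern();
my $setup_time = tv_interval($t_setup);

# ... and the repeated work on its own.
my $t_loop = [gettimeofday];
my $runs;
$runs = find_runs($re, $input) for 1 .. $iterations;
my $loop_time = tv_interval($t_loop);

printf "setup: %.6fs\n", $setup_time;
printf "loop:  %.6fs (%d iterations)\n", $loop_time, $iterations;
```

        If the setup figure is a small fraction of the loop figure, including it in the total does not change the ranking.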

        If you don't agree with me, feel free to run your own benchmarks; my program is included at the end of this message. Your algorithm, while interesting, is one of the slowest.

        Iterations: 100
        Length: 1920
        PhiRatE 1:  0.236097s  Perl/C
        PhiRatE 3:  0.234754s  Perl/C
        Dingus 1:   0.541398s  Perl
        PhiRatE 2:  0.543576s  Perl
        Dingus 2:   0.580746s  Perl + RE
        TommyW 1:   0.897865s  Perl + RE
        Enlil 2:    0.964746s  Perl + RE
        Robartes 1: 1.021243s  Perl
        Rasta 1:    2.015298s  Perl + RE
        BrowserUk:  2.764815s  Perl + RE
        

        Code for my benchmarking is here. Feel free to fiddle around to your liking.

        use Data::Dumper;
        use Time::HiRes qw( usleep ualarm gettimeofday tv_interval );
        use re 'eval';

        $stn = "aaaaaaammm38fdkkkkkkkk3,,,,,,,,,,sad909999999994lkllllllllllll"
             . "lz,,,,,,,,,dd888888882jk2kkd8d888d8djkjkjkjkkk3kk4k5kkkk65";
        $iterations = 500;
        for (1..4) { $stn.=$stn; }

        print "Iterations: $iterations\n";
        print "Length: ".length($stn)."\n";

        # Enlil 2
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = enlil_2($stn); }
        print "Enlil 2: ".tv_interval( $t0 )."\n";

        # Dingus 1
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = dingus_1($stn); }
        print "Dingus 1: ".tv_interval( $t0 )."\n";

        # Rasta 1
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = rasta_1($stn); }
        print "Rasta 1: ".tv_interval( $t0 )."\n";

        # TommyW 1
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = tommyw_1($stn); }
        print "TommyW 1: ".tv_interval( $t0 )."\n";

        # Robartes 1
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = robartes_1($stn); }
        print "Robartes 1: ".tv_interval( $t0 )."\n";

        # PhiRatE 1
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = p_process($stn); }
        print "PhiRatE 1: ".tv_interval( $t0 )."\n";

        # BrowserUk
        $t0 = [gettimeofday];
        #! Set up big regex. 1-time hit.
        my $re = '(?:(.)(??{"$+*"}))?' x 500;
        $re = qr/$re/o;
        for (1..$iterations) { @res = browseruk($stn); }
        print "BrowserUk: ".tv_interval( $t0 )."\n";

        # Dingus 2
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = dingus_2($stn); }
        print "Dingus 2: ".tv_interval( $t0 )."\n";

        # PhiRatE 2
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = phirate_2($stn); }
        print "PhiRatE 2: ".tv_interval( $t0 )."\n";

        # PhiRatE 3
        $t0 = [gettimeofday];
        for (1..$iterations) { @res = p_process_2($stn); }
        print "PhiRatE 3: ".tv_interval( $t0 )."\n";

        sub browseruk {
            $_ = shift;
            my @c = m/$re/;    #! THIS LINE DOES ALL THE WORK.
            #! This truncates the list to exclude null matches returned from regex.
            $#c = $#- -1;
            return \@c;
        }

        sub enlil_2 {
            my $string = shift;
            my @bah;
            while ($string =~ /((.)\2*)/g) {
                push (@bah, [$2,$-[1],$+[1] - $-[1]]);
            }
            return \@bah;
        }

        sub dingus_1 {
            my $string = shift;
            my (@res, $c, $p, $i);
            $p = 0;
            $c = substr($string,$p,1);
            for ($i=1; $i<length($string); $i++) {
                next if ($c eq substr($string,$i,1));
                push (@res, [$c,$p,($i-$p)]);
                $c = substr($string,$i,1);
                $p = $i;
            }
            push (@res, [$c,$p,($i-$p)]);
            return \@res;
        }

        sub dingus_2 {
            my $string = shift;
            my (@res, $i);
            $i = 0;
            while ($string =~ /(.)\1*/g) {
                push (@res, [$1, $i, pos($string)-$i]);
                $i = pos($string);
            }
            return \@res;
        }

        sub rasta_1 {
            my $string = shift;
            my ($pp, $l, @res);
            $l = length($string);
            $pp = 0;
            while ($pp < $l) {
                $c = substr $string, $pp, 1;
                if ($string =~ /\G\Q$c\E+/gc) {
                    push @res,[$c,$pp,pos($string) - $pp];
                    $pp = pos($string);
                }
            }
            return \@res;
        }

        sub tommyw_1 {
            my $string = shift;
            my $pos=0;
            my @triples=();
            my @reps=$string=~/((.)\2*)/g;
            while (@reps) {
                my $hits=shift @reps;
                my $char=shift @reps;
                push @triples, [$char, $pos, length $hits];
                $pos+=length $hits;
            }
            return \@triples;
        }

        sub robartes_1 {
            my $string = shift;
            my @res;
            my @listedstring= split//,$string;
            my $prev=shift @listedstring;
            my $currstart=my $index=0;
            for (@listedstring) {
                if ($_ eq $prev) {
                    $index++;
                } else {
                    push @res, [$prev, $currstart, $index-$currstart+1];
                    $currstart=++$index;
                    $prev=$_;
                }
            }
            push @res, [$prev, $currstart, $index-$currstart+1];
            return \@res;
        }

        sub phirate_2 {
            $_ = shift;
            my @res;
            my $count=0;
            my ($prev, $next);
            my $i=0;
            # chop() works from the end of the string, so runs come out
            # last-first and offsets count back from the end.
            $prev = $next = chop($_);
            while ($next || $prev) {
                if ($prev eq $next) {
                    $count++;
                } else {
                    push @res,[$prev, $i-$count, $count];
                    $prev = $next;
                    $count = 1;
                }
                $i++;
                $next = chop;
            }
            return \@res;
        }

        use Inline C => <<'END_OF_C_CODE';

        void p_process(char *s) {
            char prev = 0;
            long count = 0;
            long pos = 0;
            long i=0;
            AV *array;

            Inline_Stack_Vars;
            Inline_Stack_Reset;

            while((*s != 0) || (prev != 0)) {
                if (count==0) {
                    pos = i;
                    prev = *s;
                    count = 1;
                } else if (prev == *s) {
                    count++;
                } else {
                    array = newAV();
                    av_push(array,newSVpvn(&prev,1));
                    av_push(array,newSViv(pos));
                    av_push(array,newSViv(count));
                    Inline_Stack_Push(newRV_inc(array));
                    pos=i;
                    prev = *s;
                    count=1;
                }
                i++;
                s++;
            }
            Inline_Stack_Done;
        }

        void p_process_2(char *s) {
            char prev = 0;
            long count = 0;
            long i=0;
            AV *array;

            Inline_Stack_Vars;
            Inline_Stack_Reset;

            prev = *s;
            while((*s != 0) || (prev != 0)) {
                if (prev == *s) {
                    count++;
                } else {
                    array = newAV();
                    av_push(array,newSVpvn(&prev,1));
                    av_push(array,newSViv(i-count));
                    av_push(array,newSViv(count));
                    Inline_Stack_Push(newRV_inc(array));
                    prev = *s;
                    count=1;
                }
                i++;
                s++;
            }
            Inline_Stack_Done;
        }

        END_OF_C_CODE
