Re^5: how to count the number of repeats in a string (really!)

by oha (Friar)
on Nov 15, 2007 at 11:22 UTC

in reply to Re^4: how to count the number of repeats in a string (really!)
in thread how to count the number of repeats in a string (really!)

Admittedly, it would be interesting to see how the benchmark goes with different data sets.
I used the following text, which is not big but should be enough:
my $s = 'Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.';
$s =~ s/[\r\n]//g;   # bind with =~ so the substitution edits $s in place
After reading ikegami's and lodin's code, "I personally believe"d they would be faster, but I was wrong:
        s/iter ikegami   lodin     oha
ikegami   1.43      --     -1%    -48%
lodin     1.42      1%      --    -48%
oha      0.740     93%     92%      --
I suspect the regex engine is so much better at finding fixed-character repetitions that this outweighs the cost of making lots of regex calls...


PS: I didn't pass the string to the subs and didn't return the results.
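For anyone who wants to try other data sets, here is a minimal sketch of that kind of setup with Benchmark's cmpthese. The two subs (count_list, count_loop) and the test string are hypothetical stand-ins, not the code from this thread; like the real runs, they read $s from the enclosing scope and the benchmark discards their results.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: any string will do, this one is just easy to check.
my $s = 'abcabcabc' x 50;

# Stand-in candidate 1: count matches in list context in one go.
sub count_list {
    my $n = () = $s =~ /abc/g;
    return $n;
}

# Stand-in candidate 2: count matches one regex call at a time.
sub count_loop {
    my $n = 0;
    $n++ while $s =~ /abc/g;
    return $n;
}

# A negative count means "run each sub for at least that many CPU seconds".
# The subs take no arguments and cmpthese ignores what they return.
cmpthese(-1, {
    list => \&count_list,
    loop => \&count_loop,
});
```

Swapping in a different $s is enough to see how the ranking changes with the data.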

Update: removed the \n from the string, updated results

Update: my code was broken: it must reset pos to the previous pos+1 (otherwise some subpatterns aren't matched). The updated results are as "I personally believe"d:

        s/iter     oha   lodin ikegami
oha       1.66      --    -12%    -14%
lodin     1.46     14%      --     -2%
ikegami   1.43     16%      2%      --
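To illustrate why the pos reset matters, here is a small sketch. It is not my actual code from the benchmark; count_plain and count_overlapping are names made up for this example, and it assumes "previous pos+1" means one character past the start of the last match. Without the reset, a //g loop resumes after the end of each match, so overlapping matches are skipped.

```perl
use strict;
use warnings;

# Plain //g counting: each search resumes after the end of the previous match.
sub count_plain {
    my ($str, $re) = @_;
    my $count = () = $str =~ /$re/g;
    return $count;
}

# Same count, but restart one character after the START of the last match,
# so overlapping matches are found too.
# Assumption: $re never matches the empty string (otherwise this would loop).
sub count_overlapping {
    my ($str, $re) = @_;
    my $count = 0;
    while ($str =~ /$re/g) {
        $count++;
        pos($str) = $-[0] + 1;   # $-[0] is the start offset of the last match
    }
    return $count;
}

print count_plain('aaaa', qr/aa/), "\n";         # prints 2
print count_overlapping('aaaa', qr/aa/), "\n";   # prints 3
```

The extra regex calls caused by restarting so often are probably where the additional time in the updated results comes from.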

