|Perl: the Markov chain saw|
Benchmarking "Are all these characters in this sentence?"by RMGir (Prior)
|on Aug 28, 2008 at 22:52 UTC||Need Help??|
(Edit: Updated based on comments below, and cleaned up. I've broken out the tests from the benchmark, done the tests using Test::More, and factored out the list of test cases.
I've also added a few more tests from tassilo, although some of them failed the tests and are commendted out for now. )
Another Edit: repellent wins!! unpack is the way to go...
_Another_^3 Edit: BrowserUK wins! Well, after the fix so it isn't so destructive of the testArray.
A variant of BrowserUK's approach that uses substr comes in second, but a long ways behind chop.
_Another_^4 Edit: BrowserUK wins again!
buk2 is by far the fastest, but fails a "0", "0" test, so buk2_len which is a bit slower is the real winner, still 10 times faster than the competition for some test sets.
For REALLY fast routines being benchmarked, copying args is clearly a very bad idea. However, other routines like tye0 to tye2 which were the fastest for a while don't benefit as much from removing the argument copies (see the _opt variants in the results).
_Another_^5 Edit: BrowserUK wins again (again)!
buk3 is by far the fastest and passes all the tests
In check if all certain chars are found in a sentence, tallulah asked for improvements on a routine trying to check whether a given sentence contains all of a set of specified characters.
Many monks suggested solutions, with way more than one way to do it.
Here are the results of some benchmarks on all of the suggested approaches on my Macbook Pro, using the provided 5.8.8 build.
I was quite surprised by the results, actually... I'd have bet good money that the slice approach was going to beat anything else, unless Tanktalus's List::MoreUtils approaches won. I promise, I was going to post this no wonder which implementation won :)
Where a solution was regex based, I also added a variant using "study", to see how that impacted performance.
If anyone wants to add a few test cases then run the tests, invoke the script with a -t argument.
Detailed results behind readmore, to fix Meditations page formatting:
Short Rate tallulah_OriginalPost tallulah_OriginalPost 105/s -- Tanktalus_AllRegex_Study 127/s 22% Tanktalus_AllRegex 130/s 24% varian_hash 146/s 40% RMGir_slice 207/s 98% moritz_BuildRegex_WithStudy 243/s 133% moritz_BuildRegex 261/s 149% ysth_loookahead 350/s 234% Tanktalus_AllIndex 411/s 293% unpack_allindex 444/s 324% unpack_allrindex 448/s 328% tassilo_listutils_r 581/s 455% tye2 655/s 525% tye1 661/s 531% RMGir_index 667/s 537% repellent_unpack 770/s 635% repellent_unpack_opt 792/s 657% tye2_opt 807/s 671% tye1_opt 815/s 678% tye0 829/s 692% tye0_opt 830/s 693% buk_substr 1152/s 1000% buk 1829/s 1648% buk2_len 4443/s 4144% buk2 4610/s 4305% buk3 5144/s 4814% LongShort Rate varian_hash varian_hash 4.90/s -- RMGir_slice 5.88/s 20% tallulah_OriginalPost 168/s 3319% Tanktalus_AllRegex_Study 173/s 3422% Tanktalus_AllRegex 209/s 4171% moritz_BuildRegex_WithStudy 243/s 4867% moritz_BuildRegex 396/s 7983% ysth_loookahead 575/s 11640% Tanktalus_AllIndex 731/s 14822% unpack_allrindex 799/s 16201% unpack_allindex 821/s 16650% tassilo_listutils_r 914/s 18549% RMGir_index 1017/s 20652% tye2 1018/s 20671% tye1 1046/s 21234% repellent_unpack 1158/s 23518% repellent_unpack_opt 1207/s 24515% tye2_opt 1386/s 28180% tye0_opt 1399/s 28442% tye0 1414/s 28746% tye1_opt 1440/s 29269% buk_substr 1761/s 35815% buk 2849/s 58012% buk2_len 9489/s 193482% buk2 9686/s 197486% buk3 11005/s 224395% ShortLong Rate tallulah_OriginalPost tallulah_OriginalPost 3.20/s -- Tanktalus_AllRegex 4.42/s 38% Tanktalus_AllRegex_Study 4.42/s 38% moritz_BuildRegex_WithStudy 5.36/s 67% moritz_BuildRegex 5.41/s 69% varian_hash 8.18/s 156% ysth_loookahead 9.26/s 189% RMGir_index 14.4/s 351% tassilo_listutils_r 15.9/s 396% RMGir_slice 15.9/s 396% Tanktalus_AllIndex 16.0/s 401% repellent_unpack_opt 16.8/s 425% repellent_unpack 17.1/s 435% unpack_allrindex 20.0/s 525% unpack_allindex 20.2/s 531% tye2_opt 25.5/s 696% tye0_opt 25.7/s 703% tye0 25.7/s 704% tye1_opt 26.2/s 719% tye2 26.9/s 739% tye1 27.4/s 755% buk_substr 64.0/s 1900% buk 67.9/s 2021% buk2_len 888/s 27649% buk2 939/s 29247% buk3 1109/s 34553% LongLong Rate tallulah_OriginalPost tallulah_OriginalPost 3.20/s -- Tanktalus_AllRegex_Study 4.46/s 40% Tanktalus_AllRegex 4.50/s 41% varian_hash 4.59/s 43% moritz_BuildRegex 5.41/s 69% moritz_BuildRegex_WithStudy 5.41/s 69% RMGir_slice 6.80/s 112% ysth_loookahead 9.40/s 194% RMGir_index 12.9/s 302% repellent_unpack_opt 15.1/s 372% repellent_unpack 15.6/s 387% tassilo_listutils_r 16.0/s 399% Tanktalus_AllIndex 16.0/s 401% unpack_allindex 20.4/s 537% unpack_allrindex 20.6/s 543% tye1_opt 25.9/s 708% tye0 26.0/s 711% tye2_opt 26.0/s 711% tye0_opt 26.2/s 719% tye2 27.3/s 752% tye1 27.3/s 752% buk_substr 63.6/s 1889% buk 67.3/s 2002% buk2_len 852/s 26537% buk2 896/s 27908% buk3 1017/s 31690% VeryLong Rate varian_hash varian_hash 8.41/s -- RMGir_slice 10.2/s 21% tallulah_OriginalPost 189/s 2152% Tanktalus_AllRegex_Study 205/s 2332% Tanktalus_AllRegex 245/s 2810% moritz_BuildRegex_WithStudy 346/s 4019% moritz_BuildRegex 508/s 5942% ysth_loookahead 800/s 9411% Tanktalus_AllIndex 1090/s 12861% unpack_allrindex 1184/s 13980% tassilo_listutils_r 1207/s 14245% unpack_allindex 1230/s 14521% tye2_opt 1267/s 14963% tye1_opt 1268/s 14974% tye0 1279/s 15106% tye2 1280/s 15118% RMGir_index 1280/s 15118% tye0_opt 1292/s 15264% tye1 1297/s 15323% repellent_unpack 1506/s 17801% repellent_unpack_opt 1527/s 18058% buk_substr 2634/s 31219% buk 2830/s 33543% buk2_len 24093/s 286343% buk2 25048/s 297687% buk3 28768/s 341917%