Re^3: Inexplicably slow regex (optimizer) by tye (Sage)
on Sep 13, 2006 at 18:10 UTC
I think the lookback is what is killing you here.
But the question is why the "sol" version is so much slower than even the "lb" version.
I suspect it all boils down to optimization. All three cases could, in theory, anchor to "\n" characters but, in practice, the optimizer may not be smart enough to realize this.
My interpretation (aka "wild guess") of GrandFather's numbers is that the "lb" regex is a bit more complex and so runs a bit slower, while the "sol" regex runs so much slower that I'd expect it to be the one that is trying far too many possible starting points rather than jumping to key spots such as "\n". (The extra speed from Boyer-Moore probably doesn't apply here, since I don't think any of these regexes are simple enough.)
Yes, I've been hoping someone would use -Mre=debug and summarize what it reported on a system that saw the "sol" regex being especially slow. I think that is the most likely route to explaining the "problem". Then it would be interesting to compare that against what it reports on systems that don't see "sol" being so slow.
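For anyone who wants to try this, here is a minimal sketch of the kind of debug run I mean. The two patterns and the sample text are hypothetical stand-ins (a start-of-line anchor for "sol", a lookbehind for "lb"); they are not the regexes from the thread, so substitute the real ones. The "re" pragma's debug mode writes its report to STDERR, both when each regex is compiled and when it is matched.

```perl
use strict;
use warnings;
use re 'debug';    # dump regex compilation and matching details to STDERR

# Hypothetical sample data: five short lines, each ending in "\n".
my $text = join '', map { "line $_\n" } 1 .. 5;

# Hypothetical stand-ins for the thread's patterns (assumptions, not the originals):
my $sol = qr/^line 3$/m;          # "sol" style: start-of-line anchor under /m
my $lb  = qr/(?<=\n)line 3$/m;    # "lb" style: lookbehind for the preceding newline

# Run each match so the debug report includes the execution trace too.
my $sol_hit = $text =~ $sol ? 1 : 0;
my $lb_hit  = $text =~ $lb  ? 1 : 0;

print "sol matched: $sol_hit, lb matched: $lb_hit\n";
```

Saved under some name such as re-debug.pl (a made-up filename), it can be run as "perl re-debug.pl 2> re-debug.log" to capture the report. The part worth comparing between systems is the summary the optimizer prints after compiling each pattern, which should show whether it found an anchored or floating substring and a minimum length to check before starting the real match. The exact wording of that summary varies between Perl versions.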