in reply to Re: Benchmarks aren't everything
in thread Benchmarks aren't everything

Thinking about it further, I think it is possible to implement this solution with less overhead. If your goal is merely to prevent disasters, then the right optimization is one that is fast in the common case and simply lets the bad case finish, not one that tries to make the bad case fast as well.

One way to do that is to have a flag in every RE node saying whether we are tracking the positions in the string that we have already reached at that node. If the flag is set, then scan a linked list to find out whether your current position has been reached before (if it has, you can fail right away, since the earlier attempt from there did not lead to a match). If the flag is not set, then do nothing. Then insert code that, once every large number of steps, sets this flag on various nodes. (For instance the check might happen when you fail back to a second alternative: do that 10,000 times and then set the flag on your current node.)

On fast regular expressions all you do is check a flag. On slow ones, you run a lot slower (a linked list is a lot of overhead), but at least you still run.
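
To make that concrete, here is a rough Python sketch of the scheme on a toy backtracking matcher. It is only an illustration, not how any real regex engine is laid out: the names Node, Matcher, build_pathological and threshold are all made up for this example, a plain Python list stands in for the linked list, and the toy only knows literal characters and two-way alternation. It also assumes there are no backreferences or similar features, so that failing once from a given node and position means failing from there every time.

```python
class Node:
    """One node of a toy regex program (hypothetical, not a real engine's layout)."""

    def __init__(self, kind, char=None):
        self.kind = kind          # "char", "split" or "match"
        self.char = char          # literal to consume, for "char" nodes
        self.next = None          # successor, for "char" nodes
        self.alt = (None, None)   # (first, second) alternatives, for "split" nodes
        self.tracking = False     # the per-node flag described above
        self.seen = []            # positions reached here; scanned linearly


class Matcher:
    """Single-use matcher, anchored at position 0 of the text."""

    def __init__(self, start, threshold=10_000):
        self.start = start
        self.threshold = threshold  # fallbacks between flag-setting events
        self.fallbacks = 0
        self.steps = 0              # counts node visits, for comparison only

    def match(self, text):
        return self._run(self.start, text, 0)

    def _run(self, node, text, pos):
        self.steps += 1
        if node.tracking:
            # Slow path: a linear scan, like walking a linked list.
            if pos in node.seen:
                return False        # been here before, and it did not match
            node.seen.append(pos)
        if node.kind == "match":
            return True
        if node.kind == "char":
            return (pos < len(text) and text[pos] == node.char
                    and self._run(node.next, text, pos + 1))
        # "split": try the first alternative, then fail back to the second.
        first, second = node.alt
        if self._run(first, text, pos):
            return True
        self.fallbacks += 1
        if self.fallbacks % self.threshold == 0:
            node.tracking = True    # we are backtracking a lot: start tracking here
        return self._run(second, text, pos)


def build_pathological():
    """Node graph for (a|a)*b, which backtracks exponentially on a run of a's."""
    done = Node("match")
    b = Node("char", "b")
    b.next = done
    loop = Node("split")
    inner = Node("split")
    a1 = Node("char", "a")
    a1.next = loop
    a2 = Node("char", "a")
    a2.next = loop
    inner.alt = (a1, a2)
    loop.alt = (inner, b)
    return loop
```

Running Matcher(build_pathological(), threshold=100).match("a" * 16) and comparing its steps counter with a matcher whose threshold is too large to ever trigger shows the trade-off: until the threshold is hit the only extra cost is the flag test, and after that the flagged nodes cut off repeated work at the price of a list scan per visit.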