Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

Re^2: Why machine-generated solutions will never cease to amaze me

by grinder (Bishop)
on Nov 10, 2004 at 15:33 UTC ( #406692=note: print w/replies, xml ) Need Help??

in reply to Re: Why machine-generated solutions will never cease to amaze me
in thread Why machine-generated solutions will never cease to amaze me

how it handles the following

Well, plugging in that list, I get:


Regexp::Optimizer arrives at the same result, more or less:


In other words, there were no common tails in any of the alternations. Remember, it's the forward-looking tree that drives things. Tail folding is a secondary consideration. Pretty freaky that the alternation orders came out the same, though.

The main problem for me is that a pure Perl solution is useless, since I need to write the patterns for use in another program that uses the PCRE library (otherwise there is the excellent Text::Trie module).

If the first character of an alternation was an index into a jump table, the cost becomes O(1), at the expense of a slight increase in memory, but as if perl ever cared about that :)

Im really interested in how you are doing your thing

My approach is to take the words, insert them into a tree, and then in a second pass do a depth-first search, reversing the leaves and inserting the reversed leaves into another tree. If I get a tree back (as opposed to a forest), I percolate the common trunk upwards towards the root. When there's a subtree has nothing in common I just unreverse the subtree and unwind the results.

- another intruder with the mooring of the heart of the Perl

Replies are listed 'Best First'.
Re^3: Why machine-generated solutions will never cease to amaze me
by demerphq (Chancellor) on Nov 10, 2004 at 16:14 UTC

    Ah ok, your implementation appears to do pretty much the same thing as mine, disabling common suffix handling when there are multiple suffixes. (I strived _hard_ with little luck to overcome that problem). Just in case you havent thought of it already do you collapse things like /fo(?:o|a)l/ to /fo[oa]l/?

    I have to admit ive never used a module for this stuff. Ive always handrolled. :-) I have a specialized Inline::C Trie matcher for digits that is thousands of times faster than the regex engine (when match times alone are relevent).


Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://406692]
[ovedpo15]: Asking for the wisdom of Perl Monks in the following question about JSON module:
[ovedpo15]: http://www. node_id=1213478
[LanX]: it's considered spamming to advertise threads here
[LanX]: just wait

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (7)
As of 2018-04-24 15:32 GMT
Find Nodes?
    Voting Booth?