Re^3: Analysis of Regular Expressions

by Ratazong (Monsignor)
on Mar 18, 2010 at 07:02 UTC ( [id://829332]=note )


in reply to Re^2: Analysis of Regular Expressions
in thread Analysis of Regular Expressions

Hi PetaMem!

Thanks for that additional information! Chatbots are a really interesting topic!

Looking at your specific problem (rather than the generality of regexes as such) makes me wonder whether the empirical approach (checking how many possible strings a regex matches) is really unworkable. It might work with some modifications:

  • Your problem doesn't seem to be real-time, and the order of your rules seems to be static:
    therefore you could compute the generality rating in an offline preprocessing phase, e.g. by letting your computer work on it all night/weekend/holiday
  • Instead of looking at all possible input strings, narrow it down to the expected data. You probably have tons of chat logs you can use.
    So your rating could be: how many matches a regex finds in a given set of logs
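
To make the log-based rating concrete, here is a rough sketch in Python (the same idea ports directly to Perl). The function names, the sample rules and the sample log lines are all made up for illustration, and "lowest match count first" is only one possible way to turn the counts into an order:

```python
import re

def rate_rules(patterns, log_lines):
    """Return {pattern: number of log lines the pattern matches at least once}."""
    compiled = {p: re.compile(p, re.IGNORECASE) for p in patterns}
    hits = {p: 0 for p in patterns}
    for line in log_lines:
        for p, rx in compiled.items():
            if rx.search(line):
                hits[p] += 1
    return hits

def order_specific_first(patterns, log_lines):
    """Sort patterns so that those matching the fewest log lines come first."""
    hits = rate_rules(patterns, log_lines)
    return sorted(patterns, key=lambda p: hits[p])

# Hypothetical data, standing in for real chat logs and real rules.
sample_logs = ["hello there", "hello bot", "how are you", "what is your name"]
sample_rules = [r".*", r"\bhello\b", r"\bhello bot\b"]

if __name__ == "__main__":
    for rule in order_specific_first(sample_rules, sample_logs):
        print(rule)
```

Since this runs offline, it doesn't matter much that every rule is tried against every log line; the result only has to be recomputed when the rules or the logs change.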

Regarding the high number of possible characters (as also mentioned by JavaFan): here you could do some preprocessing, e.g. replacing the German a-umlaut (ä) with "ae". This would also help with other "languages" like leetspeak ... your chatbot seems to get confused when greeted with a friendly "h3110" ;-)
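
A minimal sketch of such a preprocessing step, again in Python. The mapping tables are my own assumptions and far from complete, and the leetspeak part is only applied to tokens that mix letters and digits, so that a plain number like "31" is left alone:

```python
import re

# Assumed transliteration table for German umlauts and sharp s.
UMLAUTS = {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"}

# Assumed (very small) leetspeak table: digit -> letter.
LEET = str.maketrans({"0": "o", "1": "l", "3": "e",
                      "4": "a", "5": "s", "7": "t"})

def _deleet(match):
    """Translate a token only if it contains both letters and digits."""
    token = match.group(0)
    if any(c.isalpha() for c in token) and any(c.isdigit() for c in token):
        return token.translate(LEET)
    return token

def normalize(text):
    """Reduce the input alphabet before the chatbot rules are applied."""
    for src, dst in UMLAUTS.items():
        text = text.replace(src, dst)
    return re.sub(r"\S+", _deleet, text).lower()

if __name__ == "__main__":
    print(normalize("h3110 bot, I am 31"))
    print(normalize("Wie wäre es?"))
```

The same normalization would have to be applied both to the logs used for rating and to the live input, otherwise the ratings no longer describe what the rules actually see.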

However, I fear that any automatic ordering would be just one criterion for determining the order of the rules. You will probably also add some rating done by a human. And the idea of giving additional context to a rule (like you wrote: try this before ruleX ...) sounds great to me. Have you also experimented with randomness (using a random order when several rules have a similar rating)? That way the answers might not be so predictable...
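
The randomness idea could look roughly like this (Python again; the function name, the `tolerance` parameter and the sample ratings are hypothetical). Rules are sorted by rating, and rules whose ratings lie within `tolerance` of the first rule in their group are shuffled among themselves:

```python
import random

def order_with_random_ties(ratings, tolerance=0, rng=random):
    """ratings: {rule: rating}, lower rating = tried earlier.

    Rules whose ratings differ by at most `tolerance` from the first
    rule of their group are put in random order within that group.
    """
    ranked = sorted(ratings.items(), key=lambda kv: kv[1])
    ordered, group, group_start = [], [], None
    for rule, score in ranked:
        if group and score - group_start > tolerance:
            rng.shuffle(group)      # randomize the finished group
            ordered.extend(group)
            group = []
        if not group:
            group_start = score     # first rating of a new group
        group.append(rule)
    rng.shuffle(group)              # last group (may be empty)
    ordered.extend(group)
    return ordered

if __name__ == "__main__":
    sample = {"greet_a": 10, "greet_b": 11, "fallback": 500}
    print(order_with_random_ties(sample, tolerance=2))
```

Re-running this per conversation (or per input) would vary which of the similarly rated rules fires first, while clearly more general rules still stay at the end.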

HTH, Rata
