Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister

Re: Word Comparison

by hipowls (Curate)
on Mar 27, 2008 at 05:01 UTC ( #676633=note: print w/replies, xml ) Need Help??

in reply to Word Comparison

Now a regex variant;) I don't know if you'd want to use it but it does demonstrate building a regex at run time.

while ( my $line = <DATA> ) { my ($new, $old) = split /\s+/, $line; my $rx = regex($new); print "$new can ", ( $old =~ /$rx/? '': 'NOT ' ), "be made from $o +ld\n"; } sub regex { my %letters; $letters{$_}++ for split //, lc shift; my $regex = join '', map { "(?=(?:.*${_}){$letters{$_}})" } keys % +letters; return qr/\A$regex/si; } __DATA__ dog good food fodder lot total fuse useful poor porridge root rotor __END__ dog can be made from good food can NOT be made from fodder lot can be made from total fuse can be made from useful poor can NOT be made from porridge root can be made from rotor
I'll show the regular expression generated from 'root' in detail
my $regex = qr{ \A # anchor at start for efficiency (?= # positive lookahead for (?: .* # anything r # with an 'r' after it ){1} # at least once ) (?= # positive lookahead for (?:.*t){1} # a 't' ) (?= # positive lookahead for (?:.*o){2} # for two 'o's ) }six; # ignore case
All the lookaheads have to succeed for the regex to match.

On a side note I benchmarked the variants (?:.*r), (?:[^r]*+r) and (?:[^r*]r) on different input strings and found the first was usually fastest.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://676633]
and the shadows deepen...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (2)
As of 2017-05-28 15:27 GMT
Find Nodes?
    Voting Booth?