Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

cryptquote solver

by xChauncey (Scribe)
on Jun 29, 2003 at 07:48 UTC ( [id://269975]=perlquestion: print w/replies, xml ) Need Help??

xChauncey has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,
I'm trying to create a cryptquote solver as a bit of an academic exercise to learn more about various aspects of Perl.

I've come across a roadblock, what I am trying to do is make a regex to check a dictionary file for words that match a pattern given by the encrypted word...Here is a piece of code to check one word against a dictionary file to see if it fits the pattern...(pattern is abcdee)
#!Perl open INPUT, "<dictionary"; open OUTPUT, ">result.txt"; while (<INPUT>){ if ($_ =~ m/^(.)([^\1])([^\1\2])([^\1\2\3])([^\1\2\3\4])\5/i){ print OUTPUT "$&\n"; } }
I'm sure it is easy to see what I am trying to do here, but obviously I have something wrong. Once I get the regex right, I would like to dynamically generate the regex for each word and then map the results so that the next word only looks at words that match the letters seen for the last word (but that's the next phase). This will be a recursive process, so that if the prog hits a roadblock (from thinking that one of the words is something it really isn't) it will go back and try something else.

This is something that I took on as a whim, and now I realize that it is much more complex than I thought. I'm still learning Perl though, and I think that taking on hard things is probably the best way to learn how to handle hard things...

Any advice?
XC

Replies are listed 'Best First'.
Re: cryptquote solver
by blakem (Monsignor) on Jun 29, 2003 at 10:33 UTC
    I did a similar thing once upon a time. Instead of writing one regex to match the wordmask, I calculated the wordmask for each dictionary word and compared it to the given word. Its probably less efficient than your approach, but it was easier to write.

    Since wordmask() is a non-trivial subroutine I've included a few tests for it...

    Run it like so:

    % match_word.pl abcdee
    % match_word.pl elephant
    % match_word.pl run_tests
    #!/usr/bin/perl -wT use strict; my $word = shift || 'abcdee'; my $dict = '/usr/dict/words'; run_tests() if $word eq 'run_tests'; my $mask = wordmask($word); open(my $in, '<', $dict) or die "cant open $dict : $!"; while(<$in>) { chomp; next unless length($_) eq length($mask); my $wordmask = wordmask($_); print "$_\n" if $wordmask eq $mask; } sub wordmask { my $word = shift; $word =~ tr/a-z/A-Z/; return 'ERROR' unless $word =~ /^[A-Z]+$/; my $letter = 'a'; while ($word =~ /([A-Z])/) { $word =~ s/$1/$letter/g; $letter++; } return $word; } sub run_tests { eval "use Test::More tests => 4"; is( wordmask('abc'), 'abc' ); is( wordmask('ally'), 'abbc' ); is( wordmask('ggl'), 'aab' ); is( wordmask('*()'), 'ERROR' ); exit; }

    -Blake

•Re: cryptquote solver
by merlyn (Sage) on Jun 29, 2003 at 15:17 UTC
      Some day when i have some free time (it does exist doesnt it?) i want to expand that pat to take a crypto string of words and find the possible char -> char mappings that produce feasable word dict matched words.

      -Waswas
Re: cryptquote solver
by sauoq (Abbot) on Nov 13, 2003 at 10:50 UTC

    Gee, it seems as if everyone has done this at one time or another... :-)

    My approach avoided regular expressions. I just attempted to build a one-to-one mapping between two words. If I could do it, they "matched" in the sense that one could be transformed to the other. When successful, the following function returns a reference to a hash that shows the mapping required to get from the first word to the second. It returns undef on failure.

    sub possible_match { my ($word1, $word2) = @_; return undef unless length( $word1 ) == length( $word2 ); my ( %one, %two ); for ( 0 .. length( $word1 ) - 1) { my $c1 = substr($word1, $_, 1); my $c2 = substr($word2, $_, 1); return undef if exists $one{ $c1 } and $one{ $c1 } ne $c2; return undef if exists $two{ $c2 } and $two{ $c2 } ne $c1; $one{ $c1 } = $c2; $two{ $c2 } = $c1; } return \%one; }
    For example, possible_match('too', 'see') would return { t => 's', o => 'e' }.

    -sauoq
    "My two cents aren't worth a dime.";
    
Re: cryptquote solver
by BrowserUk (Patriarch) on Jun 30, 2003 at 07:07 UTC

    Here's my crack at building your regexes.

    sub genRegex{ my $word = shift; my $re = '(.)'; my @c = split '', $word ; for my $c ( 1 .. $#c ) { my $match = 1+index $word, $c[$i], 0; $re .= $match <= $c[$i] ? '($' . ($match - 1) . ')' : '([^$' . join( '$', 1..$i ) . '])'; } } return qr[$re]; } ... # Build a hash relating cryptoword with a regex to find possible match +es. $soln{$_}{regex} = genRegex $_ for @crypts; open WORDS, '<words.txt' or die $!; # build an HoA of all the possible matches while( chomp( my $word = <WORDS> ) ) { print $word; for( keys %soln ) { push @{ $soln{ $_ }{ possibles } }, $word if $word =~ $soln{$_}{regex}; } }

    Of course, once you have all the possible matches for each of the words in the cryptoquote, then comes the fun of trying to weed out the set(s) of words that use the same mapping function. That's left as an excersise for the reader...mostly because my first attempt would require a Cray to solve it in my lifetime:)


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://269975]
Approved by allolex
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (7)
As of 2024-04-18 09:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found