Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
This is a real challenge (at least for me).
Finding (multiple) words that are anagrams of a given phrase is much more complex than finding the simple anagrams. However, we can do something, at least to find up to three words corresponding to a given input.
-------------------------------------------------------- calculate inputphrase signature while (input available) read word skip if words letters not in inputphrase calculate signature and anagrams for each candidate word # first pass for each candidate word # second pass for each candidate word # third pass combine candidate signatures skip if different from inputphrase signature add to anagrams list print anagrams list --------------------------------------------------------
Example: given a list containing the following words
cere cheer come cree echo eeoc em he here herm home me mere moe re reme rhee rho
We could issue a command
perl "come here" < words
and we would get this result:
[come] (here rhee) [echo] (mere reme) [eeoc] (herm) [home] (cere cree) [moe] (cheer) [re] (come) <he> [re] (echo) <em me> [rho] (emcee)
From which it is easy to pick "cheer Moe" or "mere echo" or "re: echo me"
Notice that the words have different parentheses: "[]" come from the first pass, "()" from the second and "<>" from the third one. It means that, to have your complete anagram, you should pick one word from each different parenthesis.
Using the same word list mentioned in the main node, I got these:
monk friar norm fakir marin fork rank of rim koran firm lithographic alight orphic goliath chirp high tropical oligarch pith pig haircloth tromp kernel monk saint main knots mason knit Perl forever lover prefer reprover elf repel fervor Use Perl forever profuse reveler reefer overplus reprove refuels sleeper fervour wait for experience firepower exitance son of a gun snafu goon GNU on sofa Brother Tilly try their boll try other bill lit Tyrol herb
Enough chatting. Here's the code.
#!/usr/bin/perl -w use strict; my $phrase = shift || die "input phrase required\n"; my $outer_limit = shift || 1500; my $inner_limit = shift || 100; $phrase = lc $phrase; $phrase =~ tr/a-z//cd; # considers only alpha characters my @input_letters = split //, $phrase; my $signature = join "", sort @input_letters; my %words = (); my %compare_template; for (@input_letters) {$compare_template{$_}++}; INPUT: while (<>) { chomp; $_ = lc $_; my @letters = split //, $_; my $windex = join "", sort @letters; my %compare = %compare_template; for my $let (@letters) { next INPUT unless (exists $compare{$let}) # keeps only words made of and $compare{$let}--; # signature letters } if (exists $words{$windex}) { next if $words{$windex} =~ /\b$_\b/; $words{$windex} .= " "; } $words{$windex} .= $_; } my $items = scalar keys %words ; print STDERR "Considering $items items. "; if ($items > $outer_limit) { print "Too many candidates. It would take too long\n"; exit; } print STDERR @{[$items > $inner_limit ? "Only two" : "Three"]}, " passes\n"; my @candidates = keys %words; my @used = (); # stores the combination of words already found for my $first (0 .. $#candidates) { if ($signature eq $candidates[$first]) { print " [" . $words{$candidates[$first]} . "]\n"; push @used , [$first, -1,-1]; next } for my $second (0 .. $#candidates) { next if $second == $first; next if grep { (grep {$_ == $first} @$_) and (grep {$_ == $second} @$_)} @used; my $sign = join "", sort split //, $candidates[$first].$candidates[$second]; if ($sign eq $signature) { print " [" . $words{$candidates[$first]}. "] (" . $words{$candidates[$second]}. ") \n"; push @used, [$first, $second, -1]; next; } if ($items <= $inner_limit) { for my $third (0.. $#candidates) { next if $third == $second; next if grep { (grep {$_ == $first} @$_ ) and (grep {$_ == $second} @$_) and (grep {$_ == $third} @$_) } @used; my $sign = join "", sort split //, $candidates[$first] .$candidates[$second].$candidates[$third]; if ($sign eq $signature) { print " [" . $words{$candidates[$first]}. "] (" . $words{$candidates[$second]}. ") <" . $words{$candidates[$third]}. "> \n"; push @used, [$first, $second,$third]; next; } } } } }
Of course, this program is going to be much slower than the normal anagram script. For 2000 candidate words -- i.e. the words found to be composed of input-phrase letters only --, it could run (depending on your conputer's speed) for 10 minutes! That's why I put some safeguards. If the candidates are more than 2000 it won't start at all. If they are more than 100, it will limit to 2 passes. The number of iterations rises very fast, and therefore, unless you have a FAST computer with LOTS of memory (and time and patience on your side), don't change these defaults. Three passes a list of 100 candidate words result in 1 million iterations, 8 million for 200, and 27 million for 300
This is just a shot. I am sure that there is room for improvements. Maybe some saints in the Monastery could help ... :-)
 _  _ _  _  
(_|| | |(_|><

In reply to Multiple words anagrams (challenge) Re: Perl's pearls by gmax
in thread Perl's pearls by gmax

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others avoiding work at the Monastery: (4)
    As of 2020-09-20 17:32 GMT
    Find Nodes?
      Voting Booth?
      If at first I donít succeed, I Ö

      Results (122 votes). Check out past polls.