I was looking for an Anagram module on CPAN: Something that
could read in a dictionary and use that to create simple
anagrams of words with an
anagram($word) function. I
couldn't find anything, so I wrote my own module. I wanted
to submit it to CPAN, but I've never done that before. My
questions are two-fold:
#1: What should I do before submitting the module? I'd like
to set up something that works with
MakeMaker (someone
suggested
h2xs). What about general documentation?
Clearly there is no end to the amount of support you could
give a module, but what's a reasonable lower bound for
supporting this hack?
#2: I've attached the module code and a short script to run
it below. What is stylistically expected from a CPAN
module? How could this code get cleaned up? Are there any
algorithmic changes that would make the code run faster
(or support more than 1 destination word?) (If you're curious, the code parses a
dictionary into a standard tree format and then when anagraming
the words, walks through the tree on a prefix-by-prefix basis.
A few of my variable names suck. Let me know if you have
better names than
$sub_table and
$sub_prefix.)
anagram:
#!/usr/bin/perl -w
use strict;
use Anagram;
while (<>) {
chomp;
my $word = $_;
s/\W//g;
my $base = lc;
my @words = grep { $base ne $_ } Anagram::anagram($base);
next unless @words;
print "\nAnagrams of $_:\n";
print " $_\n" foreach @words;
}
Anagram.pm:
use strict;
use Carp;
package Anagram;
@Anagram::EXPORT = qw(anagram is_word starts_word);
BEGIN {
$Anagram::version = "1.0";
$Anagram::dict = {};
open DICT, "</usr/dict/words" or croak("Could not open dictionary:
+ $!");
foreach (<DICT>) {
s/\W//g;
my @letters = split //, lc;
# Insert word into sorted dictionary tree
my $sub_table = $Anagram::dict;
foreach (@letters) {
# Create sub-table if it's missing and then recurse
$sub_table->{$_} = {} if ref($sub_table->{$_}) ne 'HASH';
$sub_table = $sub_table->{$_};
}
# Flag this entry as a valid word
$sub_table->{word} = 1;
}
close DICT;
}
sub version {
my $version = shift;
warn "Version $version is later than $Anagram::version."
if defined($version) && ($version > $Anagram::version);
return $Anagram::version;
}
sub anagram {
my $word = shift;
# Create table of letters. This is better than a pure table
# because we avoid extra work for duplicated letters-- we can
# anagram "aaaaa" in one cycle instead of 5! = 120 cycles.
my $letters = {};
foreach (split //, $word) {
$letters->{$_}++;
}
return permute_recurse($letters, "");
}
sub permute_recurse {
my ($letters, $prefix) = @_;
my @words = ();
foreach (keys %$letters) {
# Test if any words start with this new prefix
my $sub_prefix = $prefix.$_;
my $sub_table = starts_word($sub_prefix);
next unless $sub_table;
# Use this letter
if ($letters->{$_} <= 1) {
delete $letters->{$_};
} else {
$letters->{$_}--;
}
# Test for recurse case
if (scalar keys %$letters) {
push @words, permute_recurse($letters, $sub_prefix);
}
# Test for base case
elsif ($sub_table->{word}) {
push @words, $sub_prefix;
}
# Restore letter
$letters->{$_}++;
}
return @words;
}
sub is_word {
my $sub_table = walk_dict(@_) or return undef;
return $sub_table->{word};
}
# Returns a sub-table entry on true and undef on failutre
sub starts_word {
my ($word) = @_;
my $sub_table = $Anagram::dict;
foreach (split //, $word) {
$sub_table = $sub_table->{$_};
return undef unless ref($sub_table) eq 'HASH';
}
return $sub_table;
}
1;
-Ted
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.