Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
I was looking for an Anagram module on CPAN: Something that could read in a dictionary and use that to create simple anagrams of words with an anagram($word) function. I couldn't find anything, so I wrote my own module. I wanted to submit it to CPAN, but I've never done that before. My questions are two-fold:

#1: What should I do before submitting the module? I'd like to set up something that works with MakeMaker (someone suggested h2xs). What about general documentation? Clearly there is no end to the amount of support you could give a module, but what's a reasonable lower bound for supporting this hack?

#2: I've attached the module code and a short script to run it below. What is stylistically expected from a CPAN module? How could this code get cleaned up? Are there any algorithmic changes that would make the code run faster (or support more than 1 destination word?) (If you're curious, the code parses a dictionary into a standard tree format and then when anagraming the words, walks through the tree on a prefix-by-prefix basis. A few of my variable names suck. Let me know if you have better names than $sub_table and $sub_prefix.)

#!/usr/bin/perl -w use strict; use Anagram; while (<>) { chomp; my $word = $_; s/\W//g; my $base = lc; my @words = grep { $base ne $_ } Anagram::anagram($base); next unless @words; print "\nAnagrams of $_:\n"; print " $_\n" foreach @words; }
use strict; use Carp; package Anagram; @Anagram::EXPORT = qw(anagram is_word starts_word); BEGIN { $Anagram::version = "1.0"; $Anagram::dict = {}; open DICT, "</usr/dict/words" or croak("Could not open dictionary: + $!"); foreach (<DICT>) { s/\W//g; my @letters = split //, lc; # Insert word into sorted dictionary tree my $sub_table = $Anagram::dict; foreach (@letters) { # Create sub-table if it's missing and then recurse $sub_table->{$_} = {} if ref($sub_table->{$_}) ne 'HASH'; $sub_table = $sub_table->{$_}; } # Flag this entry as a valid word $sub_table->{word} = 1; } close DICT; } sub version { my $version = shift; warn "Version $version is later than $Anagram::version." if defined($version) && ($version > $Anagram::version); return $Anagram::version; } sub anagram { my $word = shift; # Create table of letters. This is better than a pure table # because we avoid extra work for duplicated letters-- we can # anagram "aaaaa" in one cycle instead of 5! = 120 cycles. my $letters = {}; foreach (split //, $word) { $letters->{$_}++; } return permute_recurse($letters, ""); } sub permute_recurse { my ($letters, $prefix) = @_; my @words = (); foreach (keys %$letters) { # Test if any words start with this new prefix my $sub_prefix = $prefix.$_; my $sub_table = starts_word($sub_prefix); next unless $sub_table; # Use this letter if ($letters->{$_} <= 1) { delete $letters->{$_}; } else { $letters->{$_}--; } # Test for recurse case if (scalar keys %$letters) { push @words, permute_recurse($letters, $sub_prefix); } # Test for base case elsif ($sub_table->{word}) { push @words, $sub_prefix; } # Restore letter $letters->{$_}++; } return @words; } sub is_word { my $sub_table = walk_dict(@_) or return undef; return $sub_table->{word}; } # Returns a sub-table entry on true and undef on failutre sub starts_word { my ($word) = @_; my $sub_table = $Anagram::dict; foreach (split //, $word) { $sub_table = $sub_table->{$_}; return undef unless ref($sub_table) eq 'HASH'; } return $sub_table; } 1;


In reply to Module Submission for CPAN by tedv

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (2)
As of 2023-06-07 19:29 GMT
Find Nodes?
    Voting Booth?
    How often do you go to conferences?

    Results (29 votes). Check out past polls.