http://www.perlmonks.org?node_id=84311

Well, I have a first release of my rhyme generator done. The dictionary module can be found here. I would welcome any comments on style, efficiency, room for improvement, etc.
package Lingua::EN::Rhyme; use strict; use Lingua::EN::Rhyme::Dictionary qw/%dict %reversedict/; require Exporter; our @ISA =qw/Exporter/; our @EXPORT=qw/variants pronounce syllables accent endrhyme beginrhyme + visualrhyme/; our @EXPORT_OK=qw/endsylrhyme beginsylrhyme/; our %EXPORT_TAGS= ( default => [qw(variants pronounce syllables accent + endrhyme beginrhyme visualrhyme)]); our $VERSION=0.01; sub pronounce { my $word=$_[0]; $word.="($_[1])" if defined ($_[1]) and $_[1]>1; return defined($dict{uc($word)})? $dict{uc($word)} : "** unknown wor +d **"; } sub syllables { my $pron=pronounce(uc($_[0]),$_[1]); $_=$pron; my @match=/([012])/g; my $rv=@match; return ($rv) if !($pron=~/unknown word/); } sub accent { my $pron=pronounce(uc($_[0]),$_[1]); $_=$pron; my @match=/([012])/g; return wantarray()?@match:join('',@match); } sub sylpron { $_=$_[0]; my @match=/([012])/g; my $rv=@match; return $rv; } sub beginrhyme { my ($word,$variant,$syl) = @_; my $pron=pronounce(uc($word),$variant); $pron=$word if ($word =~ / /); my $syllab=sylpron($pron); $syl=$syllab if (!defined($syl))||($syl>$syllab); my @result=(); for (my $i=$syl; $i>0; $i--) { @result = beginsylrhyme ($word,$variant,$i); if (@result) { return wantarray()? @result: $result[int(rand(@result))]; } } return wantarray()?():""; } sub variants { my $word=uc(shift); my $answer=2; return 0 if !defined($dict{$word}); while (defined($dict{"${word}($answer)"})) { $answer++ } $answer--; return $answer; } sub visualrhyme { my $word=uc(shift); my $letters=shift; $letters=length($word)-1 if (!defined($letters)); my @results=(); for (my $i=$letters; $i>0; $i--) { foreach (keys %dict) { push (@results,$_) if length($_)>$i and (substr($word,-$i) eq s +ubstr($_,-$i)) and $word ne $_ and $_ !~ /\(\d\)/; } if (@results) { return wantarray()?@results:$results[int(rand(@results))]; } } return wantarray()?():""; } sub endrhyme { my ($word,$variant,$syl) = @_; my $pron=pronounce(uc($word),$variant); $pron=$word if ($word =~ / /); my $syllab=sylpron($pron); $syl=$syllab if (!defined($syl))||($syl>$syllab); my @result=(); for (my $i=$syl; $i>0; $i--) { @result = endsylrhyme($word,$variant,$i) ; if (@result) { return wantarray()? @result : $result[int(rand(@result))]; } } return wantarray()? (): ""; } sub endsylrhyme { my ($word,$variant,$syl) = @_; $word=uc($word); my $pron = pronounce ($word,$variant); $pron=$word if ($word=~/ /); if ($pron=~ /^\*\*/) { return wantarray()? ():""; } $pron =~ /\b(\w+\d)\b/; $pron = substr($pron,$-[0]); my @resultarray=(); my $syllab=sylpron($pron); while (defined($syl) and $syl<$syllab and $syl>0) { $pron =~ /\b(\w+\d)\b/; #skip a vowel $pron = substr($pron,$+[0]) if defined($+[0]); $pron =~ /\b(\w+\d)\b/; #strip consonants in front of it $pron = substr($pron,$-[0]) if defined($-[0]); $syllab = sylpron($pron); } foreach (keys %reversedict) { push(@resultarray,$reversedict{$_}) if /$pron$/ and $reversedict{$ +_}!~/^$word(\(\d\))?$/; } if (@resultarray) { return wantarray()? @resultarray : $resultarray[int(rand(@resultar +ray))]; } else { return wantarray()? ():""; } } sub beginsylrhyme { my ($word,$variant,$syl) = @_; $word=uc($word); my $pron = pronounce ($word,$variant); $pron=$word if ($word=~/ /); if ($pron=~ /^\*\*/) { return wantarray()? ():""; } $pron = reverse $pron; $pron =~ /\b(\d\w+)\b/; $pron = substr($pron,$-[0]); my @resultarray=(); my $syllab=syllables($word,$variant); while (defined($syl) and $syl<$syllab and $syl>0) { $pron =~ /\b(\d\w+)\b/; #skip a vowel $pron = substr($pron,$+[0]) if defined($+[0]); $pron =~ /\b(\d\w+)\b/; #strip consonants in front of it $pron = substr($pron,$-[0]) if defined($-[0]); $syllab = sylpron($pron); } $pron=reverse $pron; foreach (keys %reversedict) { push(@resultarray,$reversedict{$_}) if /^$pron/ and $reversedict{$ +_}!~/^$word(\(\d\))?$/; } if (@resultarray) { return wantarray()? @resultarray : $resultarray[int(rand(@resultar +ray))]; } else { return wantarray()? ():""; } } =head1 NAME Lingua::EN::Rhyme - Finds rhymes for English words. =head1 SYNOPSIS use Lingua::EN::Rhyme; my $rhyme=endrhyme('orange'); my @rhymelist=endrhyme('orange'); $rhyme=beginrhyme('project',2,1); #Pronunciation 2, one syllable o +nly my $accentuation=accent('abortionist'); my $pronunciation=pronounce('project',2); =head1 DESCRIPTION To the joy of would-be poets everywhere, this module seeks to ease the + load of finding the perfect rhyme. The dictionary used is the freely distributable CMU Pronouncing dictionary, and is contained in the modu +le Lingua::EN::Rhyme::Dictionary. =head2 Default Export C<endrhyme> - You must specify a word, and optionally the number of th +e variant desired and the maximum number of syllables to match. Given these para +meters, a list of the "best" matches will be created. If called in array conte +xt, this array is returned, while in scalar context a random entry from th +e list is given. You may optionally provide a phonetic transcription followin +g the CMU style instead of a word. In this case, the value of the variant wo +uld be ignored. C<beginrhyme> - Usage is the same as endrhyme, but matches the beginni +ng of the words. Here "silver" would be a rhyme for "sylvan", for instance. C<visualrhyme> - Looks for words having the same ending letters as the + word provided. You may optionally provide a maximum number of letters +to match. Here you do not specify a variant, because we are basing this on spell +ing, not pronunciation. C<pronounce> - Returns the pronunciation of the word. You may optional +ly provide the number of the variant. C<variants> - Returns the number of variants in the dictionary for the + word provided. C<accent> - Returns either an array or a string containing the accentu +ation values of the word (and optionally variant) values provided. Here, 0 m +eans unaccented, 1 is primary stress, and 2 is secondary stress. C<syllables> - Returns the number of syllables in the word (and option +ally variant). =head2 Optional Exports The following two routines are used internally by Lingua::EN::Rhyme, b +ut may be exported for use in the calling program if desired. C<endsylrhyme> - Here you must specify word, variant, and number of sy +llables. Returns word(s) that rhyme in EXACTLY the number of syllables requeste +d. C<beginsylrhyme> - The same, for beginning rhymes. =head1 HISTORY Revision 0.01 2001/05/28 Mark Polo Initial revision =head1 COPYRIGHT Copyright 2001 by Mark Polo =head1 LICENSE Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the "Softwar +e"), to deal in the Software without restriction, including without limitat +ion the rights to use, copy, modify, merge, publish, distribute, sublicens +e, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be include +d in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRES +S OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILIT +Y, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHAL +L THE AUTHOR BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. =cut 1;

Replies are listed 'Best First'.
Re: Rhyme generator
by Brovnik (Hermit) on May 31, 2001 at 03:29 UTC
    I just had some fun playing with this.

    A couple of comments :

    1. The code to read in the dictionary is a bit inefficient. The folloing in Dictionary.pm speeds things up by removing surplus split/join code :
      our %dict; our %reversedict; sub read_dict { while (<DATA>) { chomp; if (/^([^\s]+)\s\s(.*)$/) { my ($key,$value) = ($1,$2); $dict{$key} = $value; $reversedict{$value} = $key; } } return; }
      This speeds up the initial read from 10+ seconds to 6.5 seconds on my machine.
    2. I also played with using Storable to speed up subsequent accesses.
      my $dict = "/tmp/Lingua::EN::Rhyme::Dictionary::DICT"; my $reversedict = "/tmp/Lingua::EN::Rhyme::Dictionary::REVDICT"; my $loaded = retrieve_dict(); sub store_dict { read_dict(); store(\%dict,$dict); store(\%reversedict,$reversedict); } sub retrieve_dict { my $ref = retrieve("DICT"); if ($ref) { %dict = %$ref; $ref = retrieve("REVDICT"); %reversedict = %$ref; } else { # Read in normal way and store read_dict(); store_dict(); } return 1; }
      This uses Storable to store and retrieve the dictionaries. It saves some time, but not a lot and leves the extra files around.

      I also started playing with storing this in a mysql DB, but haven't got that clean yet (it really needs a rewrite of all of Rhyme.pl.

    3. Hope that is of help.
      --
      Brovnik