Beefy Boxes and Bandwidth Generously Provided by pair Networks vroom
more useful options
 
PerlMonks  

Adventures in multilingualism

by revdiablo (Prior)
on May 04, 2005 at 06:42 UTC ( #453821=perlmeditation: print w/ replies, xml ) Need Help??

A few weeks ago, someone I know was trying to solve a somewhat interesting problem (interesting to me, at least). They wanted to know all the words that could be formed from all the combinations of the last 4 digits of a phone number. We were assuming each digit has 3 letters (no Q or Z). So, for each combination of 2..9 choose 4 digits (0 and 1 don't have any letters on them), there are combinations of 0..2 choose 4 letters. These assumptions do not hold perfectly, as some phones have Q and Z on them, and the situation is likely much more complex outside of the US, but they will work for the sake of the problem.

The naive solution is to nest as many for loops as required. This is a pain, and hardly elegant. I wanted to come up with something better. I searched around the Monastery, looked at a few combinations generators (it seems everyone has their own favorite method), and came up with something that fits nicely in my brainspace.

The most elegant algorithms count up in base N (where N is the length of the list), and map that representation back to the original list. Since counting in base N wasn't the natural way for me to think about the problem, I decided to count in decimal and then convert it to the base N. TMTOWTDI, and all that. :-)

I had been writing all this code in Perl, but the Other Guy was using Python. I thought, "hey, I'm flexible. I'll just port it from Perl to Python." It was a surprisingly simple port. Perhaps the code doesn't look perfectly Pythonic, but it works as advertised, and made me happy.

Then I started playing with Pugs, and thought, "hey, this is fun. I'll port it from Perl to Perl 6." Of course, in doing this I uncovered a few bugs in Pugs, and ended up writing and committing failing test cases for those bugs (which were apparently fixed 2 days later, as the tests stopped failing and the code started working as advertised).

At the end of this adventure, I decided to share it with the Monks. So here it is, and for those who are dying to see, here's all three versions of the code. First, the original Perl 5 implementation:

#!/usr/bin/perl use strict; use warnings; my %digit_letters = ( 2 => [qw(a b c)], 3 => [qw(d e f)], 4 => [qw(g h i)], 5 => [qw(j k l)], 6 => [qw(m n o)], 7 => [qw(p r s)], 8 => [qw(t u v)], 9 => [qw(w x y)], ); my @letter_combinations; my $letterchooser = choose([0 .. 2], 4); while (my $letters = $letterchooser->()) { push @letter_combinations, $letters; } my $digitchooser = choose([2 .. 9], 4); while (my $digits = $digitchooser->()) { for my $letters (@letter_combinations) { my @digits = split //, $digits; my @letters = split //, $letters; my @word = map { $digit_letters{$digits[$_]}[$letters[$_]] } 0 .. $#digits; print "$digits: ", @word, "\n"; } } sub basen { my ($base, $num) = @_; my $q = int($num / $base); my $r = $num % $base; return $r if $q == 0; return basen($base, $q), $r; } sub choose { my ($list, $number) = @_; my $listcount = @$list; my $combcount = @$list**$number; my $curr = 0; sub { return if $curr >= $combcount; my @choice = basen($listcount, $curr++); unshift @choice, 0 while @choice < $number; return join "", map $list->[$_], @choice; } }

Next, Python (the output is slightly different, but I didn't feel like wrangling it into shape):

#!/usr/bin/python def basen (base, num): q = num / base r = num % base if q == 0: return [r] else: return basen(base, q) + [r] def choose (list, number): iterations = len(list)**number for i in range(0, iterations): choice = basen(len(list), i) while len(choice) < number: choice.insert(0, 0) yield [ list[x] for x in choice ] digit_letters = { 2: [ 'a', 'b', 'c' ], 3: [ 'd', 'e', 'f' ], 4: [ 'g', 'h', 'i' ], 5: [ 'j', 'k', 'l' ], 6: [ 'm', 'n', 'o' ], 7: [ 'p', 'r', 's' ], 8: [ 't', 'u', 'v' ], 9: [ 'w', 'x', 'y' ] } letter_choices = [] for letters in choose([0,1,2], 4): letter_choices.append(letters) for digits in choose(range(2,10), 4): for letters in letter_choices: word = [] for i in range(0, len(digits)): digit = digits[i] letter_i = letters[i] letter = digit_letters[digit][letter_i] word.append(letter) print digits, ":", ''.join(word)

And finally, the Perl 6 version:

#!/usr/bin/pugs my %digit_letters = ( 2 => [qw(a b c)], 3 => [qw(d e f)], 4 => [qw(g h i)], 5 => [qw(j k l)], 6 => [qw(m n o)], 7 => [qw(p r s)], 8 => [qw(t u v)], 9 => [qw(w x y)], ); my @letterchoices; my $letters; my $letterchooser = choose([0 .. 2], 4); while $letters = $letterchooser() { push @letterchoices, $letters; } my $digits; my $digitchooser = choose([2 .. 9], 4); while $digits = $digitchooser() { my $letters; for @letterchoices -> $letters { my @digits = split '', $digits; my @letters = split '', $letters; my @word; for zip(@digits, @letters) -> $digit, $letter { push @word, %digit_letters{$digit}[$letter]; } say "$digits: ", @word; } } sub basen ($base, $num) { my $q = int($num / $base); my $r = $num % $base; return $r if $q == 0; return basen($base, $q), $r; } sub choose ($list, $number) { my $iterations = $list.elems ** $number; my $current = 0; return sub { return if $current >= $iterations; my @choice = basen($list.elems, $current++); unshift @choice, 0 while @choice.elems < $number; return @choice.map({$list[$_]}).join(""); }; }

There are a few things I got from this experience. One thing that really stood out was how similar the solutions were. Perhaps that was a consequence of the fact that I was porting the code from one to the next, rather than starting all over, but it shows that the facilities used here are available in all three languages.

Also, I noticed that the Perl 6 version still looks very much like Perl. It just has a few changes, and I think they're for the better. Perhaps the code can be further P6-ified, but I don't think it will change the feel dramatically. I think the new language is going to turn out nicely.

Finally, I noticed that -- yes, I am master of the obvious -- Python does some things that Perl doesn't. The big standout here was the built in iterator support. Just using yield instead of return is a heck of a lot easier than using an anonymous sub that wrangles closures to provide the same functionality. It's not that I would want to get rid of the closures (which seems to be the Python solution), but I'd like to have the easier technique too. Perl is supposed to be the language that steals good ideas from all around, isn't it?

As always, code criticisms/critiques welcome. I'm especially interested in any Perl6ish improvements to that version. I tend to limit myself to things that work in Pugs, but I know there are others here who don't have that limitation, and I'd love to hear from them. Anything else welcome, too! Please reply early and often. :-)

Comment on Adventures in multilingualism
Select or Download Code
Re: Adventures in multilingualism
by tilly (Archbishop) on May 04, 2005 at 08:07 UTC
    Isn't Perl 6 supposed to have support for continuations, including a built-in yield operator?

      Perl6 is supposed to have lots of features; but it's supposed to be simple (unlike Java) in that if you don't want to use these advanced features (like for example continuations, coroutines with yield, exceptions, quantum-superpositions, multimethods, strong typing, operator overloading, macros, calling parrot or native code, or even just objects), you don't have to use any of them. You should still be able to write one-liners or (basic|fortran|pascal)-like programs in it. (Well, this is the theory, I don't know how much perl6 will actually be like this, I'm still a bit affraid it will not be.)

        ambrus,
        I don't believe tilly was criticizing revdiablo for not using the features. He was addressing revdiablo's comments that came after the code about Perl stealing good ideas from other languages. I myself have found that porting p5 code to p6 is quite easy and it looks much like it did before. That doesn't mean the same thing written from scratch in p6 would look similar once you take advantage of the NWTDI (new ways to do it).

        Cheers - L~R

      tilly,
      revdiablo did indicate that he is limiting his code to features Pugs currently supports. He may not be aware that coroutines will be supported in p6 because [AES]17 hasn't been written yet. See Perl6 Timeline By Apocalypse for more details.

      Cheers - L~R

Re: Adventures in multilingualism
by Anonymous Monk on May 04, 2005 at 08:51 UTC
    I'd just write a frontend for grep, and let it do the hard work:
    $ARGV[0]=~s{([2-9])} {'['.substr("abcdefghijklmnoprstuvwxy",3*($1-2),3).']'}eg; exec 'grep', "^$ARGV[0]\$", '/usr/share/dict/words';
    That's what I call 'multilingualism' ;-)
Re: Adventures in multilingualism
by Roy Johnson (Monsignor) on May 04, 2005 at 12:02 UTC
    I felt compelled to offer a version using glob to generate the combinations. It might not port so well to Python (but I don't really know what Python has).
    #!perl use strict; use warnings; my %digit_letters = ( 2 => '{a,b,c}', 3 => '{d,e,f}', 4 => '{g,h,i}', 5 => '{j,k,l}', 6 => '{m,n,o}', 7 => '{p,r,s}', 8 => '{t,u,v}', 9 => '{w,x,y}', ); my $digit_glob = '{' . join(',', 2..9) . '}'; my $limit = 20; for my $num (glob($digit_glob x 4)) { (my $word_glob = $num) =~ s/(.)/$digit_letters{$1}/g; for my $word (glob $word_glob) { print "$num: $word\n"; } }

    Caution: Contents may have been coded under pressure.
Re: Adventures in multilingualism
by inman (Curate) on May 04, 2005 at 17:11 UTC
    Brute forcing it a little but you can count from 2222 to 9999, reject the numbers that contain 1 or 0 and print out the combinations for the rest. The combinations function avoid hardcoded loops by recursing through the data. E.g. you can add q and z in.
Re: Adventures in multilingualism
by bluto (Curate) on May 04, 2005 at 17:48 UTC
    A few weeks ago, someone I know was trying to solve a somewhat interesting problem (interesting to me, at least). They wanted to know all the words that could be formed from all the combinations of the last 4 digits of a phone number.

    When I first read this, I thought you meant actual words, in a dictionary, but it sounds like you mean combinations. Just an aside, if someone wants to limit this to actual words, it's pretty trivial with something like...

    perl -lne 'if (/^[a-p,r-y]{4}$/i) {($a = lc $_) =~ tr/[a-p,r-y]/222333 +444555666777888999/; print "$_: $a"}' < /usr/share/dict/words

    If the dictionary has more than one entry for the same word (i.e. capitalized and not), then you'll see it more than once, so putting things in a hash might help.

      When I first read this, I thought you meant actual words, in a dictionary, but it sounds like you mean combinations.

      Well, generating words was the original goal. Combinations are a step on the way to that goal. Once the combinations are generated, grepping for real words (e.g. using /usr/share/dict/words) is easy. But you have flipped the algorithm around, and used the words to generate the numbers, which is actually quite nice. I hadn't thought about it that way. Many thanks for the reply!

        This will work for well known words, the problem with this method is that there are many four letter combinations that are not necessarily English words but will be contractions of words or acronyms: MUFC => 'Manchester United Football Club', PLNE => 'plane'. There is also L33T which bends the rules by using numbers. Just call 0800 123-SK8R for your local half pipe...
Re: Adventures in multilingualism
by jackdied (Monk) on May 05, 2005 at 21:19 UTC
    What you want is a cartesian product of 2..9, and then a cartesian product of each of the letters allowed for each of the numbers. I've been out of the perl game for a while but I can comment on how I'd do it in python. The following will print 81 combinations of letters (3*3*3*3) for each possible four digit group of numbers.
    # digit_letters is defined as in the original def go(): import probstat # module for combinatorics valid_numbers = range(2,10) # 2..9 inclusive for (number_set) in probstat.Cartesian([valid_numbers]*4): answers = [] letter_sets = probstat.Cartesian([digit_letters[number] for number + in number_set]) for (letter_set) in letter_sets: answers.append(''.join(letter_set)) # combine the four letters print number_set, ", ".join(answers)
    I tried to break it down into shorter lines than I normally use, but all it is doing in pseudo code is
    for (four_numbers) in Carteisian([2..9]*[2..9]*[2..9]*[2..9]): print four_numbers for (four_letters) in Cartesian(the letters for four_numbers): print four_letters
Re: Adventures in multilingualism
by Adrade (Pilgrim) on May 10, 2005 at 01:08 UTC
    I thought I'd have some fun with this one:
    use Math::BaseCnv;@n=split(//,<STDIN>);@l=(a..p,r..z);for(1..3**$#n){ my@q=split(//,cnv($t++,10,3));for(0..$#n-1){ print$l[($n[$_]-2)*3+$q[$#n-$_]]}print"\n";}
    Just type in whatever numbers you want letters for after you run it. I cheated though - I used the base converter to do the hard work :-) Gotta make sure you have that mod installed.

    Not really all that much of a contrib to the convers, but I thought I'd share my afternoon fun - as you said: tmtowtdi.

    -Adam

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://453821]
Approved by johnnywang
Front-paged by johnnywang
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (4)
As of 2014-04-20 03:20 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (485 votes), past polls