in reply to (Golf) Cryptographer's Tool #1

My original best was 63 characters:
sub canonical { my(@a,$i,%c,$c)=pop=~/./g;$c.=$c{$_}||=$a[$i++]for pop=~/./g;$c }
(which is strict compliant, but only because at that point it wasn't any shorter being non-strict compliant. :)

However, here's a 46 character solution building off of btrott's solutions:

sub canonical { ($_,$a,%h)=@_;s!.!$h{$&}||=($a=~/./g,$&)!ge;$_ }
Can be called multiple times, of course, as in the example.
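
For anyone who wants to see what the golfed substitution is doing, here is an ungolfed sketch of the same mapping. The name canonical_readable and its variable names are mine, not from the thread, and it assumes the alphabet has at least as many characters as the word has distinct characters.

```perl
use strict;
use warnings;

# Hypothetical, ungolfed equivalent of the mapping: each distinct
# character of the word, in order of first appearance, is replaced by
# the next unused character of the alphabet.
sub canonical_readable {
    my ($word, $alphabet) = @_;
    my @letters = split //, $alphabet;   # alphabet as a list of characters
    my %map;                             # word character => alphabet character
    my $next = 0;                        # index of the next unused alphabet character
    my $out  = '';
    for my $ch (split //, $word) {
        # exists (not truthiness) so that a mapping to "0" is kept
        $map{$ch} = $letters[$next++] unless exists $map{$ch};
        $out .= $map{$ch};
    }
    return $out;
}

print canonical_readable('hello', 'abcdefghij'), "\n";   # abccd
```

The 'hello' input is my own example, not the one from the original challenge. Because %map is a lexical, every call starts with a fresh mapping, which is what the ,%h in the golfed argument list buys.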


Update: Well, tilly's been mumbling something about following the spec :) , so here's a solution that accepts any ASCII character (including zero and linefeed) in the word or the alphabet:
sub canonical { ($_,$a,%h)=@_;s!.!{$h{$&}=~s+^\z+$a=~/./gs,$&+ge}$h{$&}!gse;$_ }
At 62, it actually beats my original solution by 1 character!

Re: Re: (Golf) Cryptographer's Tool #1
by MeowChow (Vicar) on Jun 20, 2001 at 03:27 UTC
    Following the spec at 61:
    sub c { ($_,$a,%h)=@_;join'',map{substr$a,($h{$_}||=keys%h)-1,1}/./gs }
    ... and also in spiritus strictus at 62:
    sub c { my%h;join'',map{substr$_[1],($h{$_}||=keys%h)-1,1}$_[0]=~/./gs }
                   s aamecha.s a..a\u$&owag.print
      Oo, nice! And with inspiration once again from btrott's substitution approach:
      sub canonical { ($_,$a,%h)=@_;s/./substr"a$a",$h{$&}||=keys%h,1/gse;$_ }
      Back down to 54!
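
A readable sketch of the index idea in the solutions above, as I read them: remember each character's 1-based rank of first appearance and use that rank to index into the alphabet. Since the stored value is never 0, hash truthiness stops being a problem whatever the alphabet contains. The name canonical_by_rank is hypothetical, and the explicit exists test stands in for the golfed ||=keys%h.

```perl
use strict;
use warnings;

# Hypothetical ungolfed version of the "rank" approach.
sub canonical_by_rank {
    my ($word, $alphabet) = @_;
    my %rank;        # word character => 1-based order of first appearance
    my $out = '';
    for my $ch (split //, $word) {
        # A new character gets the next rank: count of known characters + 1
        $rank{$ch} = scalar(keys %rank) + 1 unless exists $rank{$ch};
        # Ranks are 1-based, substr offsets are 0-based
        $out .= substr($alphabet, $rank{$ch} - 1, 1);
    }
    return $out;
}

print canonical_by_rank('aab', '012'), "\n";   # 001
```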
Re: Re: (Golf) Cryptographer's Tool #1
by srawls (Friar) on Jun 20, 2001 at 02:17 UTC
    That doesn't work when 0 is in the 'alphabet'. Here's a modified solution that handles 0 properly and is only 8 chars longer than yours. It adds the null character to each string, which shouldn't print on most machines (it does on my DOS box, not on my Unix one).
    sub c { ($_,$a)=@_;s!.!$h{$&}||=($a=~/./g,"$&\0")!ge;y/\0//;$_ }

    Update: Fixed code as per chipmunk's comment. It's 54 chars now.

    The 15 year old, freshman programmer,
    Stephen Rawls
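
To make the 0 problem concrete, here is a small sketch (the helper names map_with_or and map_with_exists are made up for illustration). The string "0" is false in Perl, so a ||= cache treats a character that was mapped to "0" as if it had never been seen and hands it a fresh alphabet character.

```perl
use strict;
use warnings;

# Hypothetical helper: caches with ||=, like the 46 character solution.
sub map_with_or {
    my ($word, $alphabet) = @_;
    my @alpha = split //, $alphabet;
    my %h;
    my $i   = 0;
    my $out = '';
    for my $ch (split //, $word) {
        $h{$ch} ||= $alpha[$i++];    # "0" is false, so it is reassigned
        $out .= $h{$ch};
    }
    return $out;
}

# Hypothetical helper: same loop, but tests whether the key exists.
sub map_with_exists {
    my ($word, $alphabet) = @_;
    my @alpha = split //, $alphabet;
    my %h;
    my $i   = 0;
    my $out = '';
    for my $ch (split //, $word) {
        $h{$ch} = $alpha[$i++] unless exists $h{$ch};
        $out .= $h{$ch};
    }
    return $out;
}

print map_with_or('aab', '012'), "\n";       # 012 (second 'a' remapped)
print map_with_exists('aab', '012'), "\n";   # 001
```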

      Adding extra characters to the result is not a valid solution. They may be invisible when you print them out, but they're still there.
      nice, srawls... here it is down to 51:
      sub c{
      ($_,$a)=@_;s!.!chr($h{$&}||=($a=~/./g,ord$&))!ge;$_
      #23456789_123456789_123456789_123456789_123456789_1
      }

      It doesn't work with null either, but as I read it, that's fine.

      update: oops, here's one that resets %h and is reusable, as per the given example, at 54:
      sub c{
      ($_,$a,%h)=@_;s!.!chr($h{$&}||=($a=~/./g,ord$&))!ge;$_
      #23456789_123456789_123456789_123456789_123456789_1234
      }
      The fix fails if a null character is present in the alphabet.
                     s aamecha.s a..a\u$&owag.print
        Yes it does, but this is straight from the specification:

        Given: a word, and an 'alphabet' string (and to be exact about this latter part, each character in the word and the alphabet can be represented in 7 bits, eg the printable ASCII set).

        The 15 year old, freshman programmer,
        Stephen Rawls
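
A quick sketch of why the chr/ord variant copes with "0" but not with a null in the alphabet: what ends up in the hash is the character's ordinal, and only the null character has an ordinal of 0, which ||= treats as "not set yet". The helper name stored_value_is_true is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: would the ordinal stored for this alphabet
# character count as "already mapped" under ||= ?
sub stored_value_is_true {
    my ($ch) = @_;
    return ord($ch) ? 1 : 0;
}

printf "%-6s ord=%3d cached=%d\n", $_->[0], ord($_->[1]), stored_value_is_true($_->[1])
    for ['"a"', 'a'], ['"0"', '0'], ['"\0"', "\0"];
```

Only the last row reports cached=0, so only a null character in the alphabet gets handed out again on the next lookup.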