RE: (Ovid) RE: Perl Golf

by japhy (Canon)
on Jul 24, 2000

in reply to (Ovid) RE: Perl Golf
in thread Pig Latin

Oh, d'oh, I'm silly. I meant to add 'aeiou' to the character class, I really did, since that was the whole reason I introduced it. :) And I'm sorry I hadn't checked vroom's specs.

By the way, since Pig Latin does not produce a 1-to-1 mapping of normal strings to PL-strings, you can't reasonably reverse this process. Example: flea and leaf both go to eaflay.
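
To make the collision concrete, here is a minimal sketch. The helper name naive_pig_latin is hypothetical (it is not from the thread), and it assumes the rule under discussion: move the whole leading consonant cluster to the end and append "ay", with no separator.

```perl
use strict;
use warnings;

# Hypothetical helper: move the entire leading consonant cluster to the
# end of the word and append "ay", without any separator.
sub naive_pig_latin {
    my ($word) = @_;
    my ($head, $tail) = $word =~ /^([^\W0-9_aeiou]*)(\w+)$/i
        or return $word;    # leave anything that is not a plain word alone
    return $tail . $head . 'ay';
}

for my $word (qw(flea leaf)) {
    printf "%-4s -> %s\n", $word, naive_pig_latin($word);
}
# flea -> eaflay
# leaf -> eaflay
```

Both inputs produce the same output, so nothing in "eaflay" tells a decoder where the moved consonants end.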


RE: (3)Perl Golf (Pig Latin dialect)
on Jul 24, 2000
    "does not produce a 1-to-1 mapping"

    As children, my brothers and I used a dialect of Pig Latin that did provide 1-to-1 mapping. If I remember right, flea translated to lea-fay, yet leaf translated to eaf-lay.

    Off the top of my head, TH was the only consonant combination that didn't break like that. How might such discernment be added to y'all's way-clever regexes?

      Pig Latin words are not supposed to begin with consonants. flea becomes ea-flay and leaf becomes eaf-lay. If you use hyphens, you can create that distinction. If you don't want to use a visible character, though, insert a NUL. Oh, and since when does about become aboutway? I never heard of the 'insert a w' rule.
      The preceding code allows you to decode the string. While it does affect length(), a simple (?) work-around would be:
          package PigLatin;

          use overload (
              # Stringify without the NUL separators
              q{""} => sub { (my $str = ${ $_[0] }) =~ tr/\0//d; return $str; },
              fallback => 1,
          );

          sub to {
              my ($class, $word) = @_;
              $word =~ s/\b(qu|[^\W0-9_aeiou]+)?(\w+)/$1?"$2\0$1ay":"$2\0ay"/egi;
              bless \$word, $class;
          }

          sub from {
              # The stored string still has the \0 in it; undo each word
              # separately so that multi-word strings decode correctly.
              (my $word = ${ $_[-1] }) =~ s/(\w+)\0(\w*)ay/$2$1/g;
              return $word;
          }

          1;
      This module would be used like so:
          use PigLatin;
          $plain = "Practical Extraction and Report Language";
          $funny = to PigLatin $plain;
          $reg   = from PigLatin $funny;
      length($funny) would be 50, but length($$funny) is 55, due to the 5 added NULs.
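
      The two lengths can be checked without the module. This is only a sketch: the helper names mark_pig_latin and strip_nuls are hypothetical, and the substitution simply reuses the pattern from the to() method above.

```perl
use strict;
use warnings;

# Hypothetical helper: same substitution as to(), returning a plain string
# with a NUL between the kept part and the moved consonants.
sub mark_pig_latin {
    my ($text) = @_;
    $text =~ s/\b(qu|[^\W0-9_aeiou]+)?(\w+)/$1 ? "$2\0$1ay" : "$2\0ay"/egi;
    return $text;
}

# Hypothetical helper: what the overloaded stringification shows.
sub strip_nuls {
    my ($text) = @_;
    $text =~ tr/\0//d;
    return $text;
}

my $plain  = "Practical Extraction and Report Language";
my $marked = mark_pig_latin($plain);

printf "visible: %d, stored: %d\n",
    length strip_nuls($marked), length $marked;
# visible: 50, stored: 55
```

      The 40 original characters gain "ay" on each of the five words (50), and the stored form carries one extra NUL per word (55).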

      Okay, let me see if I have this right:
      • Generally, we should have a 1-to-1 mapping (flea and leaf should be distinct).
      • Words beginning with "th" should have this combination moved to the end. I also assumed that words starting with "wh", "ch" and the like would exhibit this behavior.
      • I also assumed that "qu" would be moved to the end (though I didn't try anything fancy with words like "qaid" and "qintar".)
      I think the following will do the trick:
          my $test = "Wherefore art thou, Ovid, you moronic chowderheaded shell of a ghost? Give me my fleas and leaf.";
          $test =~ s/\b(qu|[cgpstw]h|[^\W0-9_aeiou])?([a-z]+)/$1?"$2$1ay":"$2way"/ieg;
          print $test;

      Output (split on two lines for legibility):

          ereforeWhay artway outhay, Ovidway, ouyay oronicmay owderheadedchay
          ellshay ofway away ostghay? iveGay emay ymay leasfay andway eaflay.
      This one is rather tricky because it relies on how Perl's regex engine works. Because Perl uses the traditional NFA engine, it takes the first successful match in the alternation and runs with it. Therefore, I need to test for possibilities like the "th" in "thistle" before I test for the "t" in "testy". Otherwise, the regex engine would grab the first "t" in words beginning with "th" and try to complete the match.
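
      The effect of the ordering can be seen by running the same alternatives in both orders. This is a sketch under stated assumptions: the variable names and the translate helper are hypothetical, and only the first pattern is the order used in the code above.

```perl
use strict;
use warnings;

# Same alternatives in two orders; only $digraph_first matches the post.
my $digraph_first = qr/\b(qu|[cgpstw]h|[^\W0-9_aeiou])?([a-z]+)/i;
my $single_first  = qr/\b([^\W0-9_aeiou]|qu|[cgpstw]h)?([a-z]+)/i;

# Hypothetical helper: apply the Pig Latin substitution with a given pattern.
sub translate {
    my ($re, $text) = @_;
    $text =~ s/$re/defined $1 ? "$2$1ay" : "$2way"/eg;
    return $text;
}

for my $word (qw(thistle testy)) {
    printf "%-8s digraph first: %-10s single first: %s\n",
        $word, translate($digraph_first, $word), translate($single_first, $word);
}
# thistle  digraph first: istlethay  single first: histletay
# testy    digraph first: estytay    single first: estytay
```

      With the single-consonant alternative first, the engine accepts "t" and never tries "th", because the rest of the pattern succeeds and there is no reason to backtrack.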


