(Ovid) RE: Perl Golf

by Ovid (Cardinal)
on Jul 23, 2000 at 23:49 UTC (#24000)

in reply to Perl Golf (was RE: (Ovid) RE: Pig Latin)
in thread Pig Latin

japhy, I was originally constructing a longer, more optimized script to do the Pig Latin conversion. Then I went back and reread vroom's specs. First, I didn't use the /i modifier because he said we were to assume the data was lowercase and he wanted the shortest possible code.

The reason I am using three backreferences is that the data saved to $2 is tricky. Your equivalent (ignoring the "qu" problem) is [^\W0-9_]. This lets you match all alphabetic characters but does not distinguish vowels from consonants. However, you appeared to notice this when you mentioned [b-df-hj-np-tv-z]. Therefore, I suspect that you intended the following and (assuming you did intend this) I offer you kudos for a clever regex:

I also noticed that, in this case, using the /i modifier ignores vroom's "lowercase" spec but does result in a shorter regex.
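
For anyone puzzling over the double negative in [^\W0-9_], here is a minimal sketch of what the two character classes discussed above accept. The helper names is_letters and is_consonants are made up for illustration, and the input is assumed to be lowercase ASCII, per vroom's spec.

```perl
use strict;
use warnings;

# [^\W0-9_] reads as "not a non-word character, not a digit, not an
# underscore", which leaves only letters (vowels included).
sub is_letters {
    my ($s) = @_;
    return $s =~ /^[^\W0-9_]+$/ ? 1 : 0;
}

# Adding aeiou to the negated class also rejects lowercase vowels,
# so only consonants are left.
sub is_consonants {
    my ($s) = @_;
    return $s =~ /^[^\W0-9_aeiou]+$/ ? 1 : 0;
}

for my $s (qw(str aei x9 a_b)) {
    printf "%-4s letters=%d consonants=%d\n",
        $s, is_letters($s), is_consonants($s);
}
```

Only "str" passes both checks; "aei" passes the first but not the second, and the strings containing a digit or an underscore fail both.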


RE: (Ovid) RE: Perl Golf
by japhy (Canon) on Jul 24, 2000 at 00:00 UTC
    Oh, d'oh, I'm silly. I meant to add 'aeiou' to the character class, I really did, since that was the whole reason I introduced it. :) And I'm sorry I hadn't checked vroom's specs.

    By the way, since Pig Latin does not produce a 1-to-1 mapping of normal strings to PL-strings, you can't reasonably reverse this process. Example: flea and leaf both go to eaflay.
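
    A minimal sketch of that collision, assuming the rule is "move the whole leading consonant cluster to the end and append ay" (the sub name pig is hypothetical, and this is not necessarily the exact regex either of us posted):

```perl
use strict;
use warnings;

# Hypothetical helper: move the leading consonant cluster (if any)
# to the end of the word and append "ay".
sub pig {
    my ($word) = @_;
    $word =~ s/^([^\W0-9_aeiou]*)(\w+)$/$2$1ay/;
    return $word;
}

print pig("flea"), "\n";   # eaflay
print pig("leaf"), "\n";   # eaflay
```

    Both words land on the same string, so nothing in the output says where the moved consonants stop.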

      "does not produce a 1-to-1 mapping"

      As children, my brothers and I used a dialect of Pig Latin that did provide 1-to-1 mapping. If I remember right, flea translated to lea-fay, yet leaf translated to eaf-lay.

      Off the top of my head, TH was the only consonant combination that didn't break like that. How might such discernment be added to y'all's way-clever regexes?

        Pig Latin words are not supposed to begin with consonants. flea becomes ea-flay and leaf becomes eaf-lay. If you use hyphens, you can create that distinction. If you don't want to use a visible character, though, insert a NUL. Oh, and since when does about become aboutway? I never heard of the 'insert a w' rule.
        The preceding code allows you to decode the string. While it does affect length(), a simple (?) work-around would be:
            package PigLatin;

            use overload (
                # Stringify without the NUL separators.
                q{""} => sub { (my $str = ${ $_[0] }) =~ tr/\0//d; return $str; },
                fallback => 1,
            );

            sub to {
                my ($class, $word) = @_;
                $word =~ s/\b(qu|[^\W0-9_aeiou]+)?(\w+)/$1?"$2\0$1ay":"$2\0ay"/egi;
                bless \$word, $class;
            }

            sub from {
                my $text = ${ $_[-1] };   # $text still has the \0 in it
                # Undo each word on its own: drop "ay" and swap the halves back.
                $text =~ s/(\w+)\0(\w*)ay/$2$1/g;
                return $text;
            }

            1;
        This module would be used like so:
            use PigLatin;

            $plain = "Practical Extraction and Report Language";
            $funny = PigLatin->to($plain);
            $reg   = PigLatin->from($funny);
        length($funny) would be 50, but length($$funny) is 55, due to the 5 added NULs.
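
        The hyphen variant mentioned above can be sketched the same way. This is only an illustration: the sub names to_hyphen and from_hyphen are invented, and it assumes the plain text contains no hyphens of its own, since the decoder treats every hyphen as the separator.

```perl
use strict;
use warnings;

# Encode: leading consonant cluster moves behind a hyphen, then "ay".
sub to_hyphen {
    my ($text) = @_;
    $text =~ s/\b([^\W0-9_aeiou]*)(\w+)/$2-$1ay/gi;
    return $text;
}

# Decode: the hyphen marks exactly where the moved consonants begin.
sub from_hyphen {
    my ($text) = @_;
    $text =~ s/\b(\w+)-(\w*)ay\b/$2$1/g;
    return $text;
}

my $encoded = to_hyphen("flea leaf about");
print "$encoded\n";                 # ea-flay eaf-lay about-ay
print from_hyphen($encoded), "\n";  # flea leaf about
```

        Unlike the NUL approach, this needs no overloading because the separator is meant to be visible, at the cost of changing length() for good.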

        Okay, let me see if I have this right:
        • Generally, we should have a 1-to-1 mapping (flea and leaf should be distinct).
        • Words beginning with "th" should have this combination moved to the end. I also assumed that words starting with "wh", "ch" and the like would exhibit this behavior.
        • I also assumed that "qu" would be moved to the end (though I didn't try anything fancy with words like "qaid" and "qintar".)
        I think the following will do the trick:
            my $test = "Wherefore art thou, Ovid, you moronic chowderheaded shell of a ghost? Give me my fleas and leaf.";
            $test =~ s/\b(qu|[cgpstw]h|[^\W0-9_aeiou])?([a-z]+)/$1?"$2$1ay":"$2way"/ieg;
            print $test;

        Output (split on two lines for legibility):

            ereforeWhay artway outhay, Ovidway, ouyay oronicmay owderheadedchay ellshay
            ofway away ostghay? iveGay emay ymay leasfay andway eaflay.
        This one is rather tricky because it relies on how Perl's regex engine works. Because Perl uses the traditional NFA engine, it takes the first successful match in the alternation and runs with it. Therefore, I need to test for possibilities like the "th" in "thistle" before I test for the "t" in "testy". Otherwise, the regex engine would grab the first "t" in words beginning with "th" and try to complete the match.
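
        To see the ordering effect in isolation, here is a small sketch that runs the same substitution with the alternatives in both orders. The sub name pig_with and the two pattern variables are hypothetical names introduced just for this comparison.

```perl
use strict;
use warnings;

# Apply the Pig Latin substitution with a caller-supplied prefix pattern.
sub pig_with {
    my ($prefix, $text) = @_;
    $text =~ s/\b($prefix)?([a-z]+)/$1 ? "$2$1ay" : "$2way"/ieg;
    return $text;
}

# Two-letter combinations tried before the single consonant...
my $digraph_first = qr/qu|[cgpstw]h|[^\W0-9_aeiou]/i;
# ...versus the single consonant tried first.
my $single_first  = qr/qu|[^\W0-9_aeiou]|[cgpstw]h/i;

print pig_with($digraph_first, "thou thistle testy"), "\n";
# outhay istlethay estytay
print pig_with($single_first, "thou thistle testy"), "\n";
# houtay histletay estytay
```

        With the single consonant listed first, the "t" alone satisfies the alternation and the rest of the pattern still matches, so the engine never goes back to try "th".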

