Re: Weird "soundex" algorithm

by broquaint (Abbot)
on Aug 28, 2003 at 15:17 UTC

in reply to Weird "soundex" algorithm

    sub weirdex {
        local $_ = shift;
        tr/a-zA-Z//dc;
        tr/a-zA-Z//s;
        m/[aeiou]/g and substr($_, pos) =~ s/[aieuo]//g;
        $_;
    }

    print "$_ - ", weirdex($_), $/
        for "giulienk", "larry wall", "etheroskedasticity";

    __output__
    giulienk - gilnk
    larry wall - larywl
    etheroskedasticity - ethrskdstcty

tr//, m// and s///, pos and substr are your friends :)
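
For anyone following along, here is roughly the same pipeline unrolled one step at a time. It is only an illustrative sketch: the name weirdex_steps and the list of intermediate values are my own additions, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: the same steps as weirdex, but returning every
# intermediate value so the effect of each operation is visible.
sub weirdex_steps {
    my $s = shift;
    my @steps;

    # c + d: delete every character that is NOT in a-zA-Z
    (my $letters = $s) =~ tr/a-zA-Z//cd;
    push @steps, $letters;

    # s: squeeze runs of the same letter down to a single letter
    (my $squeezed = $letters) =~ tr/a-zA-Z//s;
    push @steps, $squeezed;

    my $out = $squeezed;
    # m//g in scalar context matches once and records pos()
    if ($out =~ m/[aeiou]/g) {
        # substr as an lvalue: strip vowels only after the first one
        substr($out, pos($out)) =~ s/[aeiou]//g;
    }
    push @steps, $out;

    return @steps;
}

print join(" -> ", "larry wall", weirdex_steps("larry wall")), "\n";
# larry wall -> larrywall -> larywal -> larywl
```

The last value matches the "larywl" shown in the output above.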


Re: Re: Weird "soundex" algorithm
by giulienk (Curate) on Aug 28, 2003 at 15:32 UTC
    Thanks for your reply and jmcnamara's: I had totally forgotten about the power of the tr/// modifiers, especially c and s, which I had never used before. :)


Re: Re: Weird "soundex" algorithm
by diotalevi (Canon) on Aug 28, 2003 at 19:10 UTC

    Nice. I was unhappy with your vowel stripping though. You use a looping match for offsets into $_ but really, you only have to find the *first* vowel and then no looping is required.

    # m/[aeiou]/g and substr($_, pos) =~ s/[aieuo]//g;
    /[aeiou]/ and substr( $_, $+[0] ) =~ tr/aeiou//d;

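
    A quick way to check that the two vowel-stripping lines agree is to run them side by side. This is a sketch only: strip_with_pos and strip_with_offset are names made up for the comparison, and both assume the input has already been reduced to lowercase letters.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the original line: m//g plus pos()
sub strip_with_pos {
    my $s = shift;
    $s =~ m/[aeiou]/g and substr($s, pos($s)) =~ s/[aeiou]//g;
    return $s;
}

# Hypothetical wrapper around the suggested line: plain match plus $+[0]
sub strip_with_offset {
    my $s = shift;
    $s =~ /[aeiou]/ and substr($s, $+[0]) =~ tr/aeiou//d;
    return $s;
}

# "xyz" has no vowel, so both subs should return it unchanged
for my $word (qw(giulienk larywal etheroskedasticity xyz)) {
    printf "%-20s %-14s %s\n",
        $word, strip_with_pos($word), strip_with_offset($word);
}
```

    Both columns should come out identical; the variants differ only in how they find the offset and in using tr///d rather than s///g for the deletion.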
      You use a looping match for offsets into $_ but really ...
      Er, what looping? The /g match finds the first vowel and saves the position where that match ended; pos hands that offset to substr, and the substitution then operates on the rest of the string. I didn't want to use $+[0] because of the overhead it invokes.


        Er... what overhead? You mean the cost of an array access? Because that's all it is. @+ and @- don't invoke the $`, $& and $' penalties. Those arrays are just offsets into the string: $-[0] is the offset of the beginning of the match and $+[0] is the offset of the end of the match. Using them doesn't prompt perl to do all the copying that capturing, $`, $& and $' do. It's just not the same thing.

        Granted, I did miss that /g in scalar context matches only once, so in this usage no loop is ever run. I find myself avoiding pos() after learning that it doesn't survive a local() on the variable in question. That's just style though.
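
        To make the offsets concrete, here is a small sketch that reports $-[0], $+[0] and pos() for the first vowel in a word. The sub name first_vowel_offsets is invented for this example; the variables themselves are the standard ones described in perlvar.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (start of match, end of match, pos)
# for the first vowel, or an empty list if there is none.
sub first_vowel_offsets {
    my $s = shift;
    return unless $s =~ /[aeiou]/g;   # scalar /g: one match, sets pos()
    # @- and @+ hold offsets of the last successful match
    return ($-[0], $+[0], pos($s));
}

my ($start, $end, $pos) = first_vowel_offsets("giulienk");
printf "match starts at %d, ends at %d, pos() is %d\n", $start, $end, $pos;
# match starts at 1, ends at 2, pos() is 2
```

        After a successful scalar /g match, $+[0] and pos() report the same offset, which is why either can be handed to substr.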

