PerlMonks  

Getting impossible things right (behaviour of keys)

by PetaMem (Priest)
on Oct 24, 2001 at 13:05 UTC (#121041=perlquestion)
PetaMem has asked for the wisdom of the Perl Monks concerning the following question:

Masters, we have a customer database where contacts are stored with all their relevant data, but in the nominative (1st case) singular only. Now it happens that these are Czech customers, and if someone's name is Jim Beam, you don't just say Dear Mr. Beam, but Vazeny Mr. Beame instead. There is a bunch of rules for how to inflect names into this form, called the vocative. The most interesting experience was that Czech people kept telling me it is IMPOSSIBLE to do this automatically. It took me 10 minutes of thought and 5 minutes of coding to come up with something like this:
    #!/usr/bin/perl -w
    use strict;

    my $name = shift;
    &cz_vocativ_mask($name);

    sub cz_vocativ_mask {
        my $name = shift;

        # Important! The longest suffixes must be the last ones in this list.
        my %sufdata = (
            a   => 'o',
            s   => 'si',
            k   => 'ku',
            ar  => 'ari',
            ic  => 'ici',
            ec  => 'ce',
            ek  => 'ku',
            er  => 'ere',
            # vec => 'evce',
            vec => 'veci',    # Michal Svec is saying he likes this better
        );

        foreach my $suffix (keys %sufdata) {
            if ($name =~ /^(.*)$suffix$/) {
                print "Suffix $suffix -> $1$sufdata{$suffix}\n";
                return;
            }
        }
        print " no rule apply -> $name\n";
    }
Now this works great for me, and with some more rules we're at ~100%. The biggest catch is that it works for me only... As written in the source code, the longest suffixes MUST be the last ones in the list, because the longest suffixes MUST be examined first at runtime.

The problem is that on my machine the above piece of code does indeed examine the longest suffixes first, but the people who deployed this code in the DB system say the order of examination is completely random.

So the behaviour of keys seems different, but we can't see any difference in the setup (same Perl version, same OS). The only difference is that the code is part of a larger program (though still as a separate routine) and runs on a machine with less memory, where it doesn't seem to work "right".

Any suggestions how to FORCE the examination of longest keys first?

Any help greatly appreciated

Ciao

Re: Getting impossible things right (behaviour of keys)
by blakem (Monsignor) on Oct 24, 2001 at 13:08 UTC
    Use a custom sort in your foreach:
    foreach my $suffix (keys %sufdata)
    becomes:
    foreach my $suffix (sort {length($b) <=> length($a)} keys %sufdata)
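
For anyone who wants to try the sorted loop end to end, here is a minimal sketch. The suffix table is copied from the original post; the sub name `cz_vocative` and the choice to return the result rather than print it are my own assumptions for illustration, not the poster's code. Sorting by length means the result no longer depends on whatever order `keys` happens to return.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Suffix table copied from the original post.
my %sufdata = (
    a  => 'o',   s  => 'si',  k  => 'ku',
    ar => 'ari', ic => 'ici', ec => 'ce',
    ek => 'ku',  er => 'ere', vec => 'veci',
);

# Hypothetical helper: returns the inflected name instead of printing it.
sub cz_vocative {
    my ($name) = @_;

    # Longest suffixes first, so 'vec' is tried before 'ec'.
    for my $suffix (sort { length($b) <=> length($a) } keys %sufdata) {
        if ($name =~ /^(.*)\Q$suffix\E$/) {
            return $1 . $sufdata{$suffix};
        }
    }
    return $name;    # no rule applies
}

print cz_vocative($_), "\n" for qw(Svec Novak Beam);
```

With the posted table, Svec hits the 'vec' rule, Novak hits 'k', and Beam falls through unchanged because the table has no rule for a trailing 'm'.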

    -Blake

      Or even:
      foreach my $suffix (sort { length($b) <=> length($a) or $a cmp $b } keys %sufdata)
      to sort by length first (longest first) and then alphabetically among suffixes of equal length
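
A quick way to see what the two-level sort buys you is a small sketch like the one below. The list is just the keys of the posted table written out by hand (the array names are made up for the example). Because ties in length are broken alphabetically, the order is identical on every run and every machine.

```perl
use strict;
use warnings;

# Keys of the posted suffix table, in arbitrary order.
my @suffixes = qw(a s k ar ic ec ek er vec);

# Length descending, then alphabetical within the same length.
my @ordered = sort { length($b) <=> length($a) or $a cmp $b } @suffixes;

print "@ordered\n";    # vec ar ec ek er ic a k s
```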
        Or you could exploit the fact that the regex engine returns the leftmost match. Since the pattern is anchored at the end of the string, the leftmost match is also the longest suffix, so you can do something like:
        my $pattern = join('|', (sort {$a cmp $b} keys %sufdata));
        if ($name =~ s/($pattern)$/$sufdata{$1}/o) {
            print "Suffix $1 -> $name\n";
            return;
        }
        print " no rule apply -> $name\n";
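
A self-contained sketch of the alternation idea, under a few assumptions of mine: the sub name `cz_vocative_alt` is hypothetical, the pattern is compiled once with `qr//` instead of relying on `/o`, and the keys go through `quotemeta` in case a future suffix contains a regex metacharacter. The length sort is not needed for correctness here, since the end anchor already forces the longest suffix to win; it only makes the generated pattern the same on every run.

```perl
use strict;
use warnings;

# Suffix table copied from the original post.
my %sufdata = (
    a  => 'o',   s  => 'si',  k  => 'ku',
    ar => 'ari', ic => 'ici', ec => 'ce',
    ek => 'ku',  er => 'ere', vec => 'veci',
);

# Build the alternation once and anchor it at the end of the string.
my $alt = join '|',
          map  { quotemeta($_) }
          sort { length($b) <=> length($a) or $a cmp $b } keys %sufdata;
my $re  = qr/($alt)$/;

# Hypothetical helper: returns the inflected name, or the input if no rule matches.
sub cz_vocative_alt {
    my ($name) = @_;
    (my $flexed = $name) =~ s/$re/$sufdata{$1}/;
    return $flexed;
}

print cz_vocative_alt($_), "\n" for qw(Svec Novak Beam);
```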

        -Blake

Re: Getting impossible things right (behaviour of keys)
by MZSanford (Curate) on Oct 24, 2001 at 13:50 UTC
    This sounds great! I only wish Czech was one of the languages I had to work with. I have to say, you appear to have the newest addition to the Lingua namespace. Things like this are where Perl being fast and programmers being clever come together. Congrats!
    i had a memory leak once, and it ruined my favorite shirt.
