PerlMonks  

Re^4: Creating new character classes for foreign languages

by Polyglot (Monk)
on May 17, 2009 at 10:32 UTC (#764497)


in reply to Re^3: Creating new character classes for foreign languages
in thread Creating new character classes for foreign languages

The double 'r' ends up sounding like "un", so I suppose that, technically, the first 'r' becomes the vowel 'u' while the second converts to an 'n'. However, the two are treated as a single unit, much as 'll' and 'ch' have their own places in Spanish alphabetical order, as if they were single letters.

Now, I've seen that the subroutines in the package file follow a specific syntax...what does a rule look like in a package file?

Also, I had a little trouble when putting my new package to use, in that the "shortcut method" in the final routine here failed, and I ended up hard-coding the code points for those characters.

sub InThaiHCons {    # High-class consonants
    return <<'END';
0E02
0E03
0E09
0E10
0E16
0E1C
0E1D
0E28
0E29
0E2A
0E2B
END
}

sub InThaiMCons {    # Middle-class consonants
    return <<'END';
0E01
0E08
0E0E
0E0F
0E14
0E15
0E1A
0E1B
0E2D
END
}

################################ Low-class consonants

=for NON-WORKING EXAMPLE

sub InThaiLCons {    # THIS DIDN'T WORK
    return <<'END';
+Thai::InThaiCons
-Thai::InThaiHcons
-Thai::InThaiMCons
END
}

=cut

sub InThaiLCons {    # THIS DOES WORK
    return <<'END';
0E04
0E07
0E0A
0E0D
0E11
0E13
0E17
0E19
0E1E
0E27
0E2C
0E2E
END
}

Why?

Thanks so much for your help!

Blessings,

~Polyglot~


[OT] Re^5: Creating new character classes for foreign languages
by salva (Abbot) on May 17, 2009 at 11:53 UTC
    However, they are considered as a single unit, much like the 'll' or 'ch' have their own places in alphabetical order for Spanish, as if they were single letters

    This was changed fifteen years ago (probably to make programmers happier). Now, officially, "ch" is sorted between "cg" and "ci" even if it is still considered a single letter and the same applies to "ll" (see http://en.wikipedia.org/wiki/Spanish_language#Writing_system).

Re^5: Creating new character classes for foreign languages
by graff (Chancellor) on May 17, 2009 at 15:47 UTC
    Regarding the thing that didn't work, did you leave something out from the code that you posted? The "non-working" definition for "InThaiLCons" makes a reference to a sub called "Thai::InThaiCons", but there is no such subroutine in the code you posted. Clearly, referring to a non-existent subroutine will lead to failure.

    Based on what you've posted, it looks like you can simply define your three subsets of consonants explicitly (including your definition of "InThaiLCons" that does work), and then create an overall "InThaiCons" sub by adding together the three subsets:

    sub InThaiCons {
        return <<END;
    +Thai::InThaiHCons
    +Thai::InThaiMCons
    +Thai::InThaiLCons
    END
    }

      Graff,

      You have sure been patient with me. Thank you. I did have another subroutine called InThaiCons which I had forgotten to show (naming all Thai consonants: 0E01 - 0E2E). However, your thought led me to scrutinize that part once again, and I discovered the error.

      s/InThaiHcons/InThaiHCons/

      Ouch...what a dull head can't do to obstruct progress, a slow finger can. I wonder how often I am delayed hours over a single character like this?

      Thank you again!

      Blessings,

      ~Polyglot~
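
For later readers, here is a minimal sketch of the "shortcut" definition once the capitalization is fixed. It is not the poster's actual package file: the body of InThaiCons is reconstructed from the description above (the range 0E01 to 0E2E), and is_low_class is a hypothetical helper added only to exercise the property.

```perl
use strict;
use warnings;

package Thai;

# Assumed from the thread: all Thai consonants as one range.
# A range is two hex code points separated by horizontal whitespace.
sub InThaiCons {
    return "0E01\t0E2E\n";
}

sub InThaiHCons {    # High-class consonants, as posted
    return <<'END';
0E02
0E03
0E09
0E10
0E16
0E1C
0E1D
0E28
0E29
0E2A
0E2B
END
}

sub InThaiMCons {    # Middle-class consonants, as posted
    return <<'END';
0E01
0E08
0E0E
0E0F
0E14
0E15
0E1A
0E1B
0E2D
END
}

# Low class = all consonants minus high class minus middle class.
# The names must match the sub names exactly, including case.
sub InThaiLCons {
    return <<'END';
+Thai::InThaiCons
-Thai::InThaiHCons
-Thai::InThaiMCons
END
}

package main;

# Hypothetical helper: true if the single character is in InThaiLCons.
# Outside package Thai the property name must be package-qualified.
sub is_low_class {
    my ($char) = @_;
    return $char =~ /^\p{Thai::InThaiLCons}$/ ? 1 : 0;
}

for my $cp (0x0E04, 0x0E02, 0x0E01) {
    printf "U+%04X low-class? %s\n", $cp, is_low_class(chr $cp) ? "yes" : "no";
}
```

One caveat: the two definitions are not equivalent as posted. The subtraction yields every code point in 0E01 to 0E2E that is not listed as high or middle class, which includes code points such as 0E21 that the hard-coded list leaves out.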
