PerlMonks  

Re^4: Creating new character classes for foreign languages

by Polyglot (Monk)
on May 17, 2009 at 16:48 UTC (#764530)


in reply to Re^3: Creating new character classes for foreign languages
in thread Creating new character classes for foreign languages

Yes, I have looked at that Lingua::TH module. It fails to build on my system, and I have a hard enough time troubleshooting my own code, let alone someone else's. Its .pm file is only 2.2k, which makes for a very slim algorithm given how complex Thai is to split. I'm actually leaning toward a lexical approach, and am working on building a Thai word list.
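As a rough illustration of what a lexical approach could look like, here is a minimal sketch of greedy longest-match segmentation against a word list. This is one common dictionary technique, not necessarily the method planned above; the three-word %lexicon and the segment() sub are hypothetical, and a real lexicon would be loaded from a file.

```perl
use strict;
use warnings;
use utf8;    # this source file contains Thai string literals

# Hypothetical mini word list (assumption); a real one would be far larger.
my %lexicon = map { $_ => 1 } ('มา', 'กิน', 'ข้าว');

# Length, in characters, of the longest word in the lexicon.
my $max_len = 0;
for my $word (keys %lexicon) {
    $max_len = length($word) if length($word) > $max_len;
}

# Split $text into tokens, always taking the longest lexicon word that
# starts at the current position. Characters that begin no known word
# are emitted one at a time as fallback tokens.
sub segment {
    my ($text) = @_;
    my @tokens;
    my $pos = 0;
    my $end = length($text);
    while ($pos < $end) {
        my $remaining = $end - $pos;
        my $limit     = $remaining < $max_len ? $remaining : $max_len;
        my $match;
        for (my $len = $limit; $len >= 1; $len--) {
            my $candidate = substr($text, $pos, $len);
            if ($lexicon{$candidate}) {
                $match = $candidate;
                last;
            }
        }
        $match = substr($text, $pos, 1) unless defined $match;
        push @tokens, $match;
        $pos += length($match);
    }
    return @tokens;
}

binmode STDOUT, ':encoding(UTF-8)';
print join('|', segment('มากินข้าว')), "\n";
```

Greedy matching is simple but can pick the wrong split when words overlap, so a fuller implementation would probably need backtracking or some scoring of alternative splits.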

In fact, I got "wrong number of arguments" errors when running 'perl Makefile.PL', and had to comment out about five lines in Makefile.PL before it would run, only to see a warning that the library file it referred to was not present. So I'm thinking that it was designed to accompany some additional file, possibly a word lexicon.

This is one of the reasons I'm embarking on this journey now. There is virtually nothing on CPAN for the Thai language, or for Lao either. (I did some reading on CPAN today, having never submitted anything there before, and learned that a module's NAMESPACE is supposed to be community-directed, but I know of no Thai community among Perl monks.)

My needs go beyond splitting syllables. I plan to create a program which will translate Thai to Lao. There are some specific vowels and consonants that must be transposed in the exchange. Syllable splitting is a beginning, but only a part of the process. These tools I am packaging would be useful for many other purposes as well.
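To make the character-transposition step concrete, here is a minimal sketch that maps a few Thai letters to Lao counterparts through a lookup table. The %lao_for table and the thai_to_lao() sub are hypothetical names, and the three entries are only an illustrative subset; actual Thai-to-Lao conversion involves spelling differences that a one-to-one table will not capture.

```perl
use strict;
use warnings;

# Illustrative subset only (assumption): three Thai characters and the
# Lao characters that correspond to them.
my %lao_for = (
    "\x{0E01}" => "\x{0E81}",    # THAI CHARACTER KO KAI  -> LAO LETTER KO
    "\x{0E21}" => "\x{0EA1}",    # THAI CHARACTER MO MA   -> LAO LETTER MO
    "\x{0E32}" => "\x{0EB2}",    # THAI CHARACTER SARA AA -> LAO VOWEL SIGN AA
);

# Replace every character that has a table entry; leave the rest untouched.
sub thai_to_lao {
    my ($text) = @_;
    $text =~ s/(.)/exists $lao_for{$1} ? $lao_for{$1} : $1/ges;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print thai_to_lao("\x{0E21}\x{0E32}"), "\n";    # Thai MO MA + SARA AA
```

Keeping the mapping in a hash rather than a hard-coded tr/// list makes it easy to extend the table, or to load it from a data file, as more correspondences are worked out.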

Blessings,

~Polyglot~


Re^5: Creating new character classes for foreign languages
by graff (Chancellor) on May 17, 2009 at 17:45 UTC
    So I'm thinking that it was designed to accompany some additional file, possibly a word lexicon.

    Yes, that module is clearly intended to serve only as a wrapper around a separate compiled software library (not written in perl), provided here: http://thaiwordseg.sourceforge.net/.

    You have to install that library first (which will probably involve a simple sequence like ./configure; make; make install), and then try installing the perl module, which should include some tests that confirm whether the library was found and works as intended.

Re^5: Creating new character classes for foreign languages
by jgamble (Pilgrim) on May 17, 2009 at 17:45 UTC

    I have nothing to add here, but I want to say that this is a fascinating thread and I want to thank you for starting it.
