PerlMonks  

Re^4: Creating new character classes for foreign languages

by Polyglot (Pilgrim)
on May 17, 2009 at 16:48 UTC (#764530)


in reply to Re^3: Creating new character classes for foreign languages
in thread Creating new character classes for foreign languages

Yes, I have looked at that Lingua::TH module. It fails to build on my system, and I have a hard enough time troubleshooting my own code, much less someone else's. Its .pm file is only 2.2k, which makes for a very slim algorithm given how complex a problem splitting Thai is. I'm actually leaning toward a lexical approach, and am working on building a Thai word list.

In fact, running 'perl Makefile.PL' gave me "wrong number of arguments" errors, and I had to comment out about five lines in Makefile.PL before it would run...only to see a warning that the library file it referred to was not present. So I'm thinking that it was designed to accompany some additional file, possibly a word lexicon.

This is one of the reasons I'm embarking on this journey now. There is virtually nothing on CPAN for the Thai language, or for Lao either. (I did some reading on CPAN today, having never submitted anything there before, and learned that a module's namespace is supposed to be community-directed...but I know of no Thai community among the Perl monks.)

My needs go beyond splitting syllables. I plan to create a program which will translate Thai to Lao. There are some specific vowels and consonants that must be transposed in the exchange. Syllable splitting is a beginning, but only a part of the process. These tools I am packaging would be useful for many other purposes as well.
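
As a rough illustration of the character transposition step, here is a minimal table-driven sketch in Python. The names `THAI_TO_LAO` and `thai_to_lao` are hypothetical, and the five entries are only a sample of letters that have a direct Lao counterpart. This is an assumption-laden sketch, not a complete or linguistically vetted mapping; a real converter would also need context-dependent rules.

```python
# Sample Thai -> Lao character table (ASSUMPTION: illustrative subset only).
THAI_TO_LAO = {
    "\u0e01": "\u0e81",  # THAI KO KAI    -> LAO KO
    "\u0e02": "\u0e82",  # THAI KHO KHAI  -> LAO KHO SUNG
    "\u0e04": "\u0e84",  # THAI KHO KHWAI -> LAO KHO TAM
    "\u0e07": "\u0e87",  # THAI NGO NGU   -> LAO NGO
    "\u0e32": "\u0eb2",  # THAI SARA AA   -> LAO VOWEL SIGN AA
}

# Build the translation table once; str.translate reuses it on every call.
_TABLE = str.maketrans(THAI_TO_LAO)


def thai_to_lao(text: str) -> str:
    """Replace each Thai character listed in the table; leave all others as-is."""
    return text.translate(_TABLE)
```

Characters missing from the table pass through unchanged, which makes gaps in the mapping easy to spot in the output while the table is still being built up.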

Blessings,

~Polyglot~


Re^5: Creating new character classes for foreign languages
by graff (Chancellor) on May 17, 2009 at 17:45 UTC
    "So I'm thinking that it was designed to accompany some additional file, possibly a word lexicon."

    Yes, that module is clearly intended to serve only as a wrapper around a separate compiled software library (not written in Perl), provided here: http://thaiwordseg.sourceforge.net/.

    You have to install that library first (which will probably involve a simple sequence like ./configure; make; make install), and then try installing the Perl module, which should include some tests that confirm whether the library was found and works as intended.

Re^5: Creating new character classes for foreign languages
by jgamble (Pilgrim) on May 17, 2009 at 17:45 UTC

    I have nothing to add here, but I want to say that this is a fascinating thread and I want to thank you for starting it.
