PerlMonks  

Re: How to Identify a language

by florg (Friar)
on Sep 19, 2006 at 06:11 UTC ([id://573640])


in reply to How to Identify a language

If you just want a program that classifies text, you might also be interested in TextCat.

It's a Perl script that uses "N-Gram-Based Text Categorization", and it has worked for me in the past. I did not need to classify Asian languages myself, but it is supposed to support CJK.

A list of languages and an article discussing the approach can be found on the TextCat page as well.
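
For anyone curious what an N-gram approach to language guessing can look like, here is a minimal sketch. It is written in Python rather than Perl, and it is not TextCat's code. It assumes the common rank-order scheme: build a ranked list of the most frequent character n-grams per language, do the same for the unknown text, and pick the language whose ranking is "least out of place". The function names, the toy training sentences, and the parameters (`max_n`, `top_k`) are all made up for illustration; real profiles would be built from much larger samples.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Return the top_k most frequent character n-grams (n = 1..max_n),
    most frequent first. Words are padded with '_' to mark boundaries."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count; break ties alphabetically so the
    # ranking is deterministic.
    ranked = sorted(counts, key=lambda gram: (-counts[gram], gram))
    return ranked[:top_k]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences between two profiles. An n-gram missing
    from the language profile costs a fixed maximum penalty."""
    rank_in_lang = {gram: rank for rank, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    total = 0
    for rank, gram in enumerate(doc_profile):
        if gram in rank_in_lang:
            total += abs(rank - rank_in_lang[gram])
        else:
            total += max_penalty
    return total


def classify(text, lang_profiles):
    """Return the name of the language profile closest to the text."""
    doc_profile = ngram_profile(text)
    return min(lang_profiles,
               key=lambda lang: out_of_place(doc_profile, lang_profiles[lang]))


# Toy training data (hypothetical, far too small for real use).
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat "
               "sat on the mat with this and that",
    "german": "der schnelle braune fuchs springt auf den faulen hund und "
              "die katze sitzt auf der matte",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in SAMPLES.items()}

if __name__ == "__main__":
    print(classify("the dog and the cat", PROFILES))
```

The distance is deliberately crude: it only compares rankings, never raw frequencies, so short inputs and tiny training samples will give unreliable answers.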
