Re: detecting the language of a word?

by hardburn (Abbot)
on Dec 06, 2002 at 17:13 UTC (#218102)

in reply to detecting the language of a word?

All I can say is: Good Luck. There are probably enough words that are spelled exactly the same in different languages, but have vastly different meanings and pronunciations, that you'll have a noticeably high rate of error. If you're trying to get the language of an entire document (assuming the language wasn't explicitly set in a META tag or something), you might be able to take lots of words from the text and home in on a single language. Identifying the language of a single word is probably a lot harder.
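To make the whole-document idea concrete, here is a rough sketch in Python of letting each recognised word vote for a language. The word lists and the name guess_language are made up for illustration and are far too small for real use; a single word would get at most one vote, which is why that case is so much shakier.

```python
from collections import Counter

# Hypothetical, deliberately tiny lists of very common words per language.
# These are illustrative assumptions, not a real linguistic resource.
COMMON_WORDS = {
    "english": {"the", "and", "of", "to", "is", "in"},
    "german": {"der", "die", "und", "das", "ist", "in"},
    "french": {"le", "la", "et", "les", "est", "des"},
}


def guess_language(text):
    """Return the language whose common words occur most often, or None."""
    votes = Counter()
    for word in text.lower().split():
        # Drop punctuation stuck to the edges of the word.
        word = word.strip(".,;:!?\"'()")
        for language, words in COMMON_WORDS.items():
            if word in words:
                votes[language] += 1
    if not votes:
        return None  # no known word seen, so there is nothing to go on
    return votes.most_common(1)[0][0]
```

Note that "in" appears in both the English and the German list, so on its own it says nothing; only the accumulated votes over many words separate the two.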

You might be able to home in on a language based on the characters being used. Certain languages (particularly the Scandinavian languages) tend to use specific characters that few other languages have. Many Asian languages are also written in scripts whose glyphs look completely different from one another.
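The character-based idea could be sketched like this in Python. It uses the first word of each character's Unicode name as a crude stand-in for its script, which is an assumption of this sketch and not the real Unicode Script property; the function name scripts_in is also made up. It narrows the candidates but cannot separate languages that share a script.

```python
import unicodedata


def scripts_in(word):
    """Return the set of rough script labels for the letters in word.

    The label is just the first word of the Unicode character name
    (e.g. "LATIN", "CYRILLIC", "HANGUL"), a heuristic assumption.
    """
    scripts = set()
    for char in word:
        if not char.isalpha():
            continue  # skip digits, punctuation and whitespace
        name = unicodedata.name(char, "")
        if name:
            scripts.add(name.split()[0])
    return scripts
```

A word made of Hangul letters is almost certainly Korean, but a result of "LATIN" still leaves dozens of languages open, so this is best treated as a first filter before something like word counting.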
