http://www.perlmonks.org?node_id=1057333


in reply to Re^2: Perl & Unicode: state of the art?
in thread Perl & Unicode: state of the art?

> can the language be determined?

You know the answer: only with statistical certainty, and how much certainty depends on the length of the text and on how closely related the candidate languages are.

Hand and finger (en) <=> Hand und Finger (de)
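To make the "statistical certainty" point concrete, here is a minimal sketch that guesses between English and German by counting function words. The word lists and the name guess_language are made up for illustration; real detectors use much larger models (character n-grams and the like). It only shows that a single short word like "and" vs. "und" carries all the evidence in the example above, and that with even less text there is nothing to decide on.

```python
# Tiny hand-picked function-word lists (an assumption for this sketch;
# real language detectors use far larger statistical models).
STOPWORDS = {
    "en": {"and", "the", "of", "is", "to"},
    "de": {"und", "der", "die", "das", "ist"},
}

def guess_language(text):
    """Return (language or None, scores) by counting known function words."""
    words = text.lower().split()
    scores = {
        lang: sum(w in vocab for w in words)
        for lang, vocab in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # A tie (including all zeros) means the text gives no evidence either way.
    tied = [lang for lang, s in scores.items() if s == scores[best]]
    return (best if len(tied) == 1 else None), scores

print(guess_language("Hand and finger"))  # decided by the single word "and"
print(guess_language("Hand und Finger"))  # decided by the single word "und"
print(guess_language("Hand"))             # no evidence: language is None
```

The shorter the text and the closer the languages, the more often you land in the undecided case.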

Whether the same script always implies the same word delimiters can only be answered by someone who knows all 6000 or so languages of the world.

But Arabic words are probably already a problem, maybe less so if transliterated. Chinese even more so.
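A quick sketch of why Chinese is hard for delimiter-based splitting: a naive "run of word characters" regex works for space-separated text but returns an unsegmented Chinese sentence as one token. The helper name naive_words and the sample sentence (roughly "I like programming", which a reader would split into three words) are my own choices for illustration.

```python
import re

def naive_words(text):
    # \w+ matches maximal runs of Unicode letters, digits and underscores,
    # so it relies entirely on spaces or punctuation to separate words.
    return re.findall(r"\w+", text)

print(naive_words("Hand und Finger"))  # ['Hand', 'und', 'Finger']
print(naive_words("我喜欢编程"))        # ['我喜欢编程'] : one token, no boundaries found
```

Finding the real word boundaries there needs a dictionary or a statistical segmenter, not just knowledge of the script.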

see also Word_divider and Word#Word_boundaries

Cheers Rolf

( addicted to the Perl Programming Language)

Re^4: Perl & Unicode: state of the art?
by BrowserUk (Patriarch) on Oct 08, 2013 at 02:16 UTC
    You know the answer

    Nope. If I knew, I wouldn't be asking.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Welcome back to Babel, brothers..

      Languages are living things, and poetry is a valid form of a language.

      Processors are mechanical things: there is no way to cover all the cases.

      Perl is digital and my brain is analog.

      no hope, sorry
      there are no rules, there are no thumbs..