
Re^4: Database Search Format Engine

by Polyglot (Pilgrim)
on Jun 17, 2009 at 23:46 UTC (#772591)

in reply to Re^3: Database Search Format Engine
in thread Database Search Format Engine


This is not an issue of indexing. In fact, this should be compatible with most any search indexing system. MySQL supports the Asian languages well enough to satisfy me. The difficulty here is more of a Perl problem.

The issue is that of reformatting the search from a few search words into a MySQL "SELECT * FROM MyTable WHERE ..." type query.
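To make the reformatting step concrete, here is a minimal sketch of turning a list of search words into a parameterized query. The helper name build_query, the column name body, and the choice of LIKE with AND are all assumptions for illustration, not the code from my program; the table name is the one from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: build a SELECT with one LIKE placeholder per word.
# $table and $column are assumed to be trusted identifiers, not user input.
sub build_query {
    my ($table, $column, @words) = @_;
    return unless @words;

    # One "column LIKE ?" condition per search word, all required (AND).
    my $where = join ' AND ', ("$column LIKE ?") x @words;

    # Bind values carry the wildcards; % and _ inside a word are not escaped here.
    my @bind = map { '%' . $_ . '%' } @words;

    return ("SELECT * FROM $table WHERE $where", @bind);
}

my ($sql, @bind) = build_query('MyTable', 'body', 'perl', 'regex');
print "$sql\n";                    # SELECT * FROM MyTable WHERE body LIKE ? AND body LIKE ?
print join(', ', @bind), "\n";     # %perl%, %regex%
```

The statement and bind list could then go to DBI's prepare and execute. Keeping the words out of the SQL string means quoting is left to the driver.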

The core of the Perl issue seems to be word boundaries. The Asian languages run all words together, so that a sentence appears as if it were one word (i.e., there is no white space to delimit words). The \w, \b, \d, etc. are supposed to be compatible with any language, but in actual practice they have shortcomings when it comes to word boundaries in multi-byte character text. I have had to replace \w in my code with \p{...} type expressions.

KinoSearch lists its language compatibilities under "Features" as:

* Full support for 12 Indo-European languages.

My first efforts at making this program work on Chinese also failed miserably. I was disappointed that the Perl regex would not work as it was supposed to according to the documentation I had found. (I had used \w in the beginning.)

So, for KinoSearch to have the same flaw would not surprise me at all. Most programmers do not purposely avoid the common regex tokens just so that they can be certain their code will be compatible with any language.

Who knows...maybe I'm not reinventing this wheel after all?


~ Polyglot ~
