Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re^4: Database Search Format Engine

by creamygoodness (Curate)
on Jun 18, 2009 at 04:21 UTC ( #772631=note: print w/replies, xml ) Need Help??


in reply to Re^3: Database Search Format Engine
in thread Database Search Format Engine

The stable branch of KinoSearch (0.165) doesn't handle UTF-8 properly. You need the dev branch for that (0.20_01 and above). For Asian languages, you absolutely need UTF-8, or support for native encodings like Shift-JIS.

Tokenizing is also quite a challenge for Asian languages, particularly Japanese, and KinoSearch doesn't have a dedicated CJK tokenizer class or anything like that. It's on the todo list, but not very high -- I'm more concerned with making sure that the framework will allow others to write high-performance KSx extensions than with writing everything myself.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://772631]
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (5)
As of 2017-06-26 20:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    How many monitors do you use while coding?















    Results (590 votes). Check out past polls.