PerlMonks  

Re^4: Database Search Format Engine

by creamygoodness (Curate)
on Jun 18, 2009 at 04:21 UTC ( [id://772631]=note )


in reply to Re^3: Database Search Format Engine
in thread Database Search Format Engine

The stable branch of KinoSearch (0.165) doesn't handle UTF-8 properly. You need the dev branch for that (0.20_01 and above). For Asian languages, you absolutely need UTF-8, or support for native encodings like Shift-JIS.
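To illustrate the encoding point: if your source documents arrive as Shift-JIS, one option is to transcode them to UTF-8 before they reach the indexer. The following is only a minimal sketch in Python, not KinoSearch code; the function name `shift_jis_to_utf8` is hypothetical and the input is assumed to be valid Shift-JIS.

```python
def shift_jis_to_utf8(raw: bytes) -> bytes:
    """Transcode Shift-JIS bytes to UTF-8 bytes.

    Hypothetical helper for illustration only. Assumes `raw` is valid
    Shift-JIS; invalid input raises UnicodeDecodeError.
    """
    text = raw.decode("shift_jis")   # bytes -> Unicode string
    return text.encode("utf-8")      # Unicode string -> UTF-8 bytes


# Build a Shift-JIS sample, then transcode it.
sample = "日本語".encode("shift_jis")
utf8 = shift_jis_to_utf8(sample)
```

Decoding to Unicode first and only then encoding keeps the two steps separate, so a bad byte sequence fails loudly at the decode step instead of ending up as garbage in the index.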

Tokenizing is also quite a challenge for Asian languages, particularly Japanese, and KinoSearch doesn't have a dedicated CJK tokenizer class or anything like that. It's on the todo list, but not very high -- I'm more concerned with making sure that the framework will allow others to write high-performance KSx extensions than with writing everything myself.
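For a rough idea of what a stopgap CJK tokenizer can look like, here is a sketch of character-bigram tokenization, a common fallback when no dictionary-based segmenter is available. It is written in Python for illustration and is not a KinoSearch or KSx class; the name `bigram_tokenize` and the character ranges chosen (Hiragana, Katakana, CJK Unified Ideographs, plus ASCII word characters) are my own assumptions.

```python
import re

# Assumed ranges: Hiragana + Katakana (U+3040-U+30FF) and
# CJK Unified Ideographs (U+4E00-U+9FFF). Real text may need more blocks.
CJK = r"\u3040-\u30ff\u4e00-\u9fff"
CJK_RUN = re.compile(f"[{CJK}]+")
# A token candidate is either a run of CJK characters or an ASCII word.
CHUNK = re.compile(f"[{CJK}]+|[A-Za-z0-9_]+")


def bigram_tokenize(text):
    """Split text into lowercase ASCII words and overlapping CJK bigrams."""
    tokens = []
    for match in CHUNK.finditer(text):
        chunk = match.group(0)
        if CJK_RUN.fullmatch(chunk):
            if len(chunk) == 1:
                # A lone CJK character is emitted as-is.
                tokens.append(chunk)
            else:
                # Overlapping two-character windows across the run.
                tokens.extend(chunk[i:i + 2] for i in range(len(chunk) - 1))
        else:
            tokens.append(chunk.lower())
    return tokens
```

Bigrams avoid the need to find word boundaries at all, at the cost of a larger index and some false matches; a proper morphological analyzer would give better results for Japanese.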
