Re^4: Writing a database lookup tool

by elef (Friar)
on Jan 18, 2013 at 11:54 UTC (#1014040)

in reply to Re^3: Writing a database lookup tool
in thread Writing a database lookup tool

"FYI - current versions include FTS (Full text search) capabilities."
Thank you for that info. Full text search looks like it was designed for precisely the type of queries I'd be using. Based on the descriptions I found, it would add a lot of functionality and a lot of speed. Now, the main question is: do I get FTS with the Perl database modules? The DBI and DBD::SQLite CPAN pages don't mention FTS, but it seems to be a pretty old feature, so it should have trickled down, right?

Edit: I dug around a bit more and found out that FTS is supported by DBD::SQLite.
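
For anyone following along, here is a minimal sketch of what an FTS lookup through DBD::SQLite might look like. It assumes the SQLite bundled with your DBD::SQLite build has FTS enabled; the table name `records`, the columns `title` and `body`, the sample rows and the `search` helper are all made up for illustration and are not from the original tool.

```perl
use strict;
use warnings;
use DBI;

# In-memory database for illustration; a real tool would pass a file path.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Hypothetical schema: an FTS4 virtual table indexes every column as text.
# (fts3 should also work here if fts4 is not available in your build.)
$dbh->do('CREATE VIRTUAL TABLE records USING fts4(title, body)');

# Hypothetical sample data.
my $insert = $dbh->prepare('INSERT INTO records (title, body) VALUES (?, ?)');
$insert->execute(@$_) for (
    [ 'First entry',  'a compact database lookup tool' ],
    [ 'Second entry', 'notes about full text search'   ],
    [ 'Third entry',  'nothing relevant here'          ],
);

# MATCH consults the full-text index rather than scanning each row,
# which is what a LIKE '%word%' query would have to do.
sub search {
    my ($query) = @_;
    return $dbh->selectall_arrayref(
        'SELECT title, body FROM records WHERE records MATCH ?',
        undef, $query);
}

for my $row (@{ search('lookup') }) {
    print "$row->[0]: $row->[1]\n";
}
```

Using a bound placeholder for the search term keeps user input out of the SQL string; the term itself is still interpreted with FTS query syntax.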

"However, I got the impression you wanted a stand-alone (not web-server based) solution."
Yes. The database and the lookup tool would be on the "client" machine, and ideally the whole thing should be reasonably compact and self-contained: a single download that isn't too large (hopefully under 50 MB without the data) and an installation that isn't too complex.

Replies are listed 'Best First'.
Re^5: Writing a database lookup tool
by elef (Friar) on Jan 20, 2013 at 09:02 UTC
    Update: I started playing with DBD::SQLite. I imported 800,000 records into a db, and quickly realized that by default, most of my searches are indeed carried out as sequential searches, which makes them hopelessly slow. With FTS enabled, average lookup times fell from 4 seconds to 0.04 seconds. If this scales in a roughly linear fashion and I get 0.5 sec lookups on 10 million records, I'll be very happy with the speed.
    Thanks again.
