
Re^5: Hash lookups, Database lookups, and Scalability

by mpeppler (Vicar)
on Oct 31, 2004 at 16:19 UTC (#404144=note)

in reply to Re^4: Hash lookups, Database lookups, and Scalability
in thread Hash lookups, Database lookups, and Scalability

Index and query behavior is usually closely tied to the way a particular database engine works - and I don't know SQLite at all, so I can't really help you with specifics.

However, your table schema is exceedingly simple, so you really only have two choices:

create unique index left_ix on words(left)
create unique index left_ix on words(left, right)
(and their opposites, with "right" as the leading column).

The first form is more "correct" - you really only want the key in the index. The second form may give you slightly better performance, at the expense of allowing duplicate "left" words into the table as long as they point at a different "right" word, and slightly more work during inserts (index maintenance is a little more complicated).
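To make the difference concrete, here is a minimal sketch using Python's sqlite3 module against an in-memory database. The two-column words table, the sample rows, and the try_inserts helper are assumptions made up for illustration, not your actual schema. The column names are double-quoted because LEFT and RIGHT are also SQL join keywords.

```python
import sqlite3

# Hypothetical sample data: the same "left" word three times,
# twice with the same "right" word and once with a different one.
ROWS = [("cat", "chat"), ("cat", "gato"), ("cat", "chat")]

SINGLE = 'create unique index left_ix on words("left")'
COMPOSITE = 'create unique index left_ix on words("left", "right")'


def try_inserts(index_sql, rows):
    """Build a fresh words table with the given index and report,
    for each row, whether the insert was accepted."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: just the two columns named in the index statements.
    conn.execute('create table words ("left" text, "right" text)')
    conn.execute(index_sql)
    accepted = []
    for row in rows:
        try:
            conn.execute(
                'insert into words ("left", "right") values (?, ?)', row
            )
            accepted.append(True)
        except sqlite3.IntegrityError:
            # The unique index rejected this row.
            accepted.append(False)
    conn.close()
    return accepted


if __name__ == "__main__":
    print("index on (left):       ", try_inserts(SINGLE, ROWS))
    print("index on (left, right):", try_inserts(COMPOSITE, ROWS))
```

With the first form only the first "cat" row gets in. With the second form the row pointing at a different "right" word is accepted as well, and only the exact duplicate pair is rejected.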

Personally I'd use the first form (index on "left", and a separate index on "right").
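A rough way to check that both lookup directions actually use their index is to ask the engine for its query plan. The sketch below does that through Python's sqlite3 module. The table layout, the sample row, the right_ix name, and both helper functions are assumptions for illustration, and the exact plan wording differs between SQLite versions, so it only looks for the index name in the plan text.

```python
import sqlite3


def build_words_db():
    """Create an in-memory words table with one unique index per column."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema; the column names are quoted because LEFT and RIGHT
    # are also SQL join keywords.
    conn.execute('create table words ("left" text, "right" text)')
    conn.execute('create unique index left_ix on words("left")')
    # Hypothetical name for the "opposite" index.
    conn.execute('create unique index right_ix on words("right")')
    conn.execute('insert into words values (?, ?)', ("cat", "chat"))
    return conn


def plan_details(conn, sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("explain query plan " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


if __name__ == "__main__":
    conn = build_words_db()
    by_left = 'select "right" from words where "left" = ?'
    by_right = 'select "left" from words where "right" = ?'
    print(plan_details(conn, by_left, ("cat",)))
    print(plan_details(conn, by_right, ("chat",)))
```

If the plan for a lookup does not mention the index you expect, that is the engine-specific behavior worth investigating before changing the schema.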

