
Re^3: Hash lookups, Database lookups, and Scalability

by tilly (Archbishop)
on Nov 01, 2004 at 15:58 UTC (#404331=note)

in reply to Re^2: Hash lookups, Database lookups, and Scalability
in thread Hash lookups, Database lookups, and Scalability

"I'd expect most database lookups to scale like O(log(n)) for a lookup"
Why? Don't oversimplify the situation. A database search could range from an indexed search to a full table scan. As for indexed searches, many RDBMS lookups are indeed hash lookups, and an index is the way you tell the database what hashes to create. The numbers the OP gave are not purely for lookups; they are a mixture of everything, including I/O and network communication, so they cannot be used as they are to measure the performance of the database search algorithm.
My statement was entirely based on theory.

The description given was selecting a single value from a table with indexes on both columns. That means that the lookup is happening on an index. For the big-O estimate you have to look at what happens as the dataset gets large. Network communication is a constant factor and I/O is part of the search time.
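A rough way to see why the fixed overhead does not affect the big-O estimate is the toy cost model below. Both constants and the function name `modeled_query_ms` are made up for illustration; they are not measurements from the OP's system.

```python
import math

# Both constants are assumptions chosen only for illustration.
NETWORK_OVERHEAD_MS = 5.0   # fixed per-query round trip, independent of table size
COST_PER_LEVEL_MS = 0.1     # cost of each step down a hierarchical index

def modeled_query_ms(n_rows):
    """Toy model: constant overhead plus a term that grows like log(n)."""
    return NETWORK_OVERHEAD_MS + COST_PER_LEVEL_MS * math.log2(n_rows)

# Going from about a thousand rows to about a million adds ten index steps.
# The network term is the same in both cases, so it cancels out of the difference.
growth = modeled_query_ms(2**20) - modeled_query_ms(2**10)
```

In this model the constant term can dominate the measured time for small tables, which is one reason raw timings alone say little about how the lookup scales.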

My statement about O(log(n)) therefore presumes that you are using an index with a large dataset. The question then becomes what kind of index. There are many kinds of indexes out there. Yes, you can use a hash and get O(1). But default indexes tend to be a hierarchical data structure that is O(log(n)), because such structures cooperate better with caching to avoid I/O, leading to a much better constant. (That is why the search algorithm should be chosen with I/O in mind.)
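The difference between the two kinds of index can be sketched in a few lines. This is only an illustration, not how any particular database implements its indexes: a Python dict stands in for a hash index, binary search over a sorted list stands in for a hierarchical index, and the fanout of 100 keys per node is an assumed figure, not a property of any real system.

```python
import bisect

def index_levels(n_keys, fanout):
    """Levels a tree index needs to cover n_keys when each node holds `fanout` entries."""
    levels = 1
    capacity = fanout
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels

def sorted_lookup(keys, values, key):
    """O(log n) lookup: binary search over a sorted key list."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return None

# Toy data: even numbers as keys, ten times the key as the value.
keys = list(range(0, 200_000, 2))
values = [k * 10 for k in keys]

# O(1) expected lookup: a hash table over the same data.
hash_index = dict(zip(keys, values))

hit_hash = hash_index.get(42)
hit_tree = sorted_lookup(keys, values, 42)
miss_tree = sorted_lookup(keys, values, 43)

# With a wide fanout the log(n) term stays tiny even for huge tables.
levels_million = index_levels(1_000_000, 100)
levels_billion = index_levels(1_000_000_000, 100)
levels_binary = index_levels(1_000_000, 2)
```

With the assumed fanout of 100, a thousandfold increase in rows adds only a couple of levels, and the top levels are small enough to stay cached. That is the sense in which the O(log(n)) structure can end up with the better constant once I/O is counted.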

As for your coding comments, I have no disagreement with them and have said similar things myself on many occasions.
