Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^2: Fetch Mysql huge database

by sundialsvc4 (Abbot)
on Sep 29, 2011 at 17:32 UTC ( #928626=note: print w/ replies, xml ) Need Help??


in reply to Re: Fetch Mysql huge database
in thread Fetch Mysql huge database

Indexes would be quite mandatory, to the point that some DBMSes will temporarily create indexes in order to do such a join, or refuse to do the join.

Even if you do wind up creating indexes only to immediately drop them, or if you set up an index that is “non-maintained” and it must be refreshed now and then, c’est la guerre.   Just a cost of doing business.

(As of course you surely know... but just for the benefit of any onlookers who maybe don’t...) Left and right so-called “outer” joins are those which cause all of the rows in the table on the left or the right side of the relationship to appear even if they have no matching row on the other side.   Testing for NULL on the right side means that it is a non-matching row.

One characteristic of the original strategy is that it tries to use “memory” ... and when you have millions of anything, you absolutely cannot use “memory” to do it.   If you don’t literally run out of the stuff (after hours of painful grinding in which your disk-drive might sue for workmen’s comp), you might spend days waiting for a solution, as thrashing sends everything on your computer system to the nether regions.   Whereas the inner-join approach will be virtually instantaneous.   It won’t even take much memory, because all the SQL server’s doing is marching through two compatible index trees.

Incidentally... “38 million rows” really isn’t that outrageously large.   Oh, but it’s large enough to put a bit of a test on your strategy and to call you out rather quickly if you didn’t pick a good one, though.   So, if you try something and you observe that the system is really getting bolluxed up, it’s time to stop and reconsider how you’re trying to solve this problem.   “There’s more than one way to do it™” and you can be sure that there must be a much-better way.   “Okay, time to sit down and eat a sandwich here... I’m obviously lost in the woods... that is to say, ‘momentarily displaced.’ ”


Comment on Re^2: Fetch Mysql huge database
Re^3: Fetch Mysql huge database
by aaron_baugher (Deacon) on Sep 29, 2011 at 20:37 UTC

    Yes. On a large table, indexes are essential for any field (or group of fields) which you're going to use to filter or order your results. Create an index on the UserName field here, and it'll drastically speed things up.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://928626]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (14)
As of 2014-10-31 16:43 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (221 votes), past polls