Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re^3: Processing ~1 Trillion records

by MidLifeXis (Prior)
on Oct 25, 2012 at 12:29 UTC ( #1000845=note: print w/ replies, xml ) Need Help??


in reply to Re^2: Processing ~1 Trillion records
in thread Processing ~1 Trillion records

Profile.

Timings I would think would be interesting would be the time taken in the $sth_chr->execute statement, the translate call (is it something that could be done more efficiently in the select call?), the time spent in the if(!defined($x)){$x = 0} statements vs having the DB pre-populate that for you, the time spent in the sorting of the keys, and anything else that shows up in a profiling run.

This (as stated in other posts on this thread) is speculation, but I wonder if the sort could be sped up (if it is a bottleneck) by breaking the keys unsorted out, partitioning the data, sorting each individually, and merging the results. The other possibility, if it is the output that is the bottleneck and not the sort, is that the additional I/O would increase the time. Without profiling data it is impossible to know where to focus.

--MidLifeXis


Comment on Re^3: Processing ~1 Trillion records
Select or Download Code
Re^4: Processing ~1 Trillion records
by sundialsvc4 (Abbot) on Oct 25, 2012 at 16:06 UTC

    Indeed, if you can extract a list of what the so-called “markers” and “lines” are (which of course you can), then the problem decomposes itself quite-nicely into partitions.   All that you really need are indexes on the columns in question.

    It might also be advantageous to define a view that corresponds to this query, so that you can examine and then optimize-the-hades-out-of that view.   In some DBMSes, defining a view serves to introduce the query as one that it now knows to be frequently-used, and it certainly puts that “6 days long” requirement squarely into the DBA’s management field-of-view.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1000845]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (12)
As of 2014-10-30 13:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (208 votes), past polls