Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling

Re^3: Processing ~1 Trillion records

by MidLifeXis (Monsignor)
on Oct 25, 2012 at 12:29 UTC ( #1000845=note: print w/ replies, xml ) Need Help??

in reply to Re^2: Processing ~1 Trillion records
in thread Processing ~1 Trillion records


Timings I would think would be interesting would be the time taken in the $sth_chr->execute statement, the translate call (is it something that could be done more efficiently in the select call?), the time spent in the if(!defined($x)){$x = 0} statements vs having the DB pre-populate that for you, the time spent in the sorting of the keys, and anything else that shows up in a profiling run.

This (as stated in other posts on this thread) is speculation, but I wonder if the sort could be sped up (if it is a bottleneck) by breaking the keys unsorted out, partitioning the data, sorting each individually, and merging the results. The other possibility, if it is the output that is the bottleneck and not the sort, is that the additional I/O would increase the time. Without profiling data it is impossible to know where to focus.


Comment on Re^3: Processing ~1 Trillion records
Select or Download Code
Replies are listed 'Best First'.
Re^4: Processing ~1 Trillion records
by sundialsvc4 (Abbot) on Oct 25, 2012 at 16:06 UTC

    Indeed, if you can extract a list of what the so-called “markers” and “lines” are (which of course you can), then the problem decomposes itself quite-nicely into partitions.   All that you really need are indexes on the columns in question.

    It might also be advantageous to define a view that corresponds to this query, so that you can examine and then optimize-the-hades-out-of that view.   In some DBMSes, defining a view serves to introduce the query as one that it now knows to be frequently-used, and it certainly puts that “6 days long” requirement squarely into the DBA’s management field-of-view.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1000845]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (5)
As of 2015-11-28 00:35 GMT
Find Nodes?
    Voting Booth?

    What would be the most significant thing to happen if a rope (or wire) tied the Earth and the Moon together?

    Results (735 votes), past polls