http://www.perlmonks.org?node_id=1000846


in reply to Re^2: Processing ~1 Trillion records
in thread Processing ~1 Trillion records

If you only need to provide default values for your output, you can do that directly in SQL. Oracle has the NVL() function for that.
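
As a rough illustration of letting the database fill in the default, here is a minimal sketch. It uses Python against an in-memory SQLite database purely so it runs without an Oracle instance; the table `records` and the column `status` are made up for the example. SQLite has no NVL(), so the query uses COALESCE(), which Oracle also accepts; on Oracle, NVL(status, 'unknown') should behave the same way here.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real one (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table: "records" and "status" are illustrative names only.
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO records (id, status) VALUES (?, ?)",
    [(1, "shipped"), (2, None), (3, "pending")],
)

# COALESCE returns its first non-NULL argument, so the default is
# applied inside the query instead of in the client program.
rows = conn.execute(
    "SELECT id, COALESCE(status, 'unknown') FROM records ORDER BY id"
).fetchall()

print(rows)
```

The point is only that the client code never has to test for missing values; it receives rows that are already complete.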

If that is the only thing your program is doing, then the bottleneck is most likely Oracle, which is not really suited for quickly producing reporting outputs. There are more specialized databases for that, like Sybase IQ.

Update: Thinking more about this, if your database is not idle during the 16 days, Oracle's transaction handling will also slow things down. Oracle tries to present a consistent state of the database to each connection, so it has to keep rollback (undo) information for the whole 16 days while your query runs. This additional overhead may be slowing your query down. I would look at setting up a separate instance, or even a dedicated reporting database with the appropriate indices for your queries, instead of using the production database. Consider loading the reporting database by restoring a backup of the production database. This also serves as a way to find out whether your backups can be restored at all.

Re^4: Processing ~1 Trillion records
by jhourcle (Prior) on Oct 25, 2012 at 16:12 UTC

    And if you have enough modifications going on during that time, you'll fill your rollback segment, and the transaction will abort abnormally. (at least, that's how it used to be ... dunno if the latest version still does)

    When I've had to run long transactions on an Oracle database where I didn't have enough storage for a second copy, I had to play with the pragma commands to set the isolation level ... unfortunately, it was almost a decade ago, and I don't remember what the command was. serializable is what keeps coming to mind, but I don't think that's right.