Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Re^2: RDB updates: rebuild from scratch vs. incremental

by jhourcle (Prior)
on Aug 30, 2005 at 15:27 UTC ( [id://487808]=note: print w/replies, xml ) Need Help??


in reply to Re: RDB updates: rebuild from scratch vs. incremental
in thread RDB updates: rebuild from scratch vs. incremental

During the recreation of your database, your database would be unavailable for your users.

Depending on how you rebuild your data, that may not be true.

I reload a number of tables every morning, and the process takes about 25 minutes for me to acquire the data, and do the pre-processing for the application.

For each table, as I am processing it, I first remove tables named backup_$name. I next load the data into temp_$name, and perform whatever processing needs to be done, such as sanitizing the data, generating derived values, and indexing. Once all of the tables are loaded, I loop through the list of tables and rename $name to backup_$name and rename temp_$name to $name.

Although there's added load on the database every morning (both the one I'm exporting from, an the one that I'm importing to), both databases remain available through the process (I've yet to have anyone complain about issues when the tables are getting renamed, but there is that possibility ... it's still significantly less than the time that it takes to recreate the database).

In my situation, I have the luxury of disk space -- it's possible that someone with a larger database may not have that. (I keep the backups not just for the ability to roll back, but also to make sure that other folks on the machine don't try to use the space that I'm going to need for the reprocessing)

I'll go back to one my overused mantras -- There is no one best solution for every scenario. There are ways of doing this with either incremental updates, or a complete rebuild. Without knowing more about the nature of the data, I don't think it'd be good to recommend either one, because this data may be completely different from everything that I've ever worked with.

  • Comment on Re^2: RDB updates: rebuild from scratch vs. incremental

Replies are listed 'Best First'.
Re^3: RDB updates: rebuild from scratch vs. incremental
by jfroebe (Parson) on Aug 30, 2005 at 17:03 UTC

    I agree :) It would also depend on the chosen DBMS on how much time the tables would be inaccessable to the users. In your case, it is probably less than a second. On low utilized servers, this is probably more than acceptable.. in other scenerios, the loss of a half second could amount to be thousands of dollars (trading systems for example).

    Definitely, the possible solutions will vary :)

    Jason L. Froebe

    Team Sybase member

    No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil, Stargate SG-1

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://487808]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (3)
As of 2025-07-10 16:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.