in reply to DBI: How to update 150K records efficiently
I'd summarize my thoughts this way (most echo what's already been said):
- Sometimes the abstractions provided by a class are good and handy; sometimes “desperate times call for desperate measures” and this is a good example of that. What you have done looks fine.
- Using a prepared query is definitely helpful.
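To illustrate the prepare-once, execute-many pattern, here is a minimal sketch in Python's sqlite3 (the table and column names are invented for the example); the same idea applies with DBI's prepare()/execute():

```python
import sqlite3

# Hypothetical table, in-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO items (id, status) VALUES (?, ?)",
                 [(1, "old"), (2, "old"), (3, "old")])

# The UPDATE statement is prepared once; only the bound values change
# on each execution, so the server never re-parses the SQL.
updates = [("done", 1), ("done", 2)]
conn.executemany("UPDATE items SET status = ? WHERE id = ?", updates)
conn.commit()
```

In DBI the equivalent is one `$dbh->prepare(...)` outside the loop and `$sth->execute(...)` inside it.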
- It is wise to code any such update routine defensively, even at the expense of speed. You are, after all, about to update 3/4 of the entire table!
- If your database manager supports it, transactions can be very helpful: start a transaction, do a few hundred updates, then commit, then start a new transaction. (Don't forget to commit the last one.) Study “transaction isolation levels.” Consider how much concurrent table activity you can... or should... tolerate.
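The batched-commit loop above can be sketched like this (again in Python's sqlite3, with a made-up table; the structure carries over directly to DBI's `begin_work`/`commit`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
cur = conn.cursor()
cur.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, status TEXT)")
cur.executemany("INSERT INTO items (id, status) VALUES (?, 'old')",
                [(i,) for i in range(1000)])

BATCH = 300  # commit every few hundred updates
cur.execute("BEGIN")
for n, item_id in enumerate(range(1000), start=1):
    cur.execute("UPDATE items SET status = 'new' WHERE id = ?", (item_id,))
    if n % BATCH == 0:
        cur.execute("COMMIT")   # close this batch...
        cur.execute("BEGIN")    # ...and open the next one
cur.execute("COMMIT")  # don't forget the last, partial batch
```

Keeping each transaction small bounds the amount of rollback work the server must be prepared to do, at the cost of a partially-updated table being visible between batches.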
Since you are, indeed, updating such a large percentage of the table, and such a large table, what if you did things very differently?
- Upload the information in quantity to a different table by whatever means.
- Then run a transaction on the server which orders it to empty the first table, insert into it from the new table, then commit.
Bang! Now all the updates happen, when they happen, entirely on the server. Or maybe you can actually DROP the old table and rename the temporary one so that it becomes permanent. “Desperate times...”
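The load-then-swap approach can be sketched as follows (Python sqlite3, invented table names `live` and `staging`; any bulk-load mechanism your database offers would fill the staging table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE live (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE staging (id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO live VALUES (1, 'old'), (2, 'old');
""")

# 1. Bulk-load the new data into the staging table by whatever means.
conn.executemany("INSERT INTO staging VALUES (?, ?)",
                 [(1, "new"), (2, "new"), (3, "new")])

# 2. One server-side transaction: empty the live table and refill it
#    from staging. The context manager commits on success.
with conn:
    conn.execute("DELETE FROM live")
    conn.execute("INSERT INTO live SELECT * FROM staging")
```

All the row-by-row work happens inside the server; the client only ships the data once and issues two statements.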
No matter how you choose to do it, this is going to be a “10,000 pound elephant” operation. It's going to be a whole lot slower than most of the things you're doing, and when it sits down or goes anywhere, you're going to feel the floorboards shake.