
Re^2: DBI: How to update 150K records efficiently

by graff (Chancellor)
on Apr 01, 2008 at 04:54 UTC (#677693)


in reply to Re: DBI: How to update 150K records efficiently
in thread DBI: How to update 150K records efficiently

If I'm not mistaken, LOAD DATA INFILE (or Oracle's "sqlldr") is only suitable for inserting rows into a table -- it doesn't do updates. (But maybe I'm mistaken?)

In any case, since the OP is clearly talking about UPDATE, I think the "REPLACE" keyword would definitely not be good unless the input to the process were in fact a complete set of replacement records for the table. And even then it might still be a really bad idea if the table has autoincrement primary keys that are referenced by foreign keys elsewhere, because reloading the whole table is apt to assign a completely different set of autoincrement keys. Again, if I am wrong, I'd love to know...


Re^3: DBI: How to update 150K records efficiently
by jhourcle (Prior) on Apr 01, 2008 at 15:32 UTC

    Although their primary use is insertion, both can be used for updates ... MySQL directly, Oracle in a more roundabout way.

    The 'REPLACE' keyword in MySQL's 'LOAD DATA' command works record by record, based on the primary key of the table. If you're using autoincrement for the primary key, this isn't going to work for you. In the example given, however, there's a WHERE clause using the field 'vc', which I assumed was the primary key, or at least a unique index.
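    To make the autoincrement concern concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely so it runs without a server; SQLite's REPLACE INTO is only a stand-in for MySQL's REPLACE handling, and the table "item" and its columns are made up for illustration. It assumes 'vc' is a unique key and 'id' is an autoincrement primary key.

```python
import sqlite3


def replace_demo():
    """Show that REPLACE on a unique key deletes the old row and inserts a new one."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: autoincrement surrogate key plus a unique natural key.
    conn.execute(
        "CREATE TABLE item ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " vc TEXT UNIQUE,"
        " val TEXT)"
    )
    conn.executemany(
        "INSERT INTO item (vc, val) VALUES (?, ?)",
        [("a", "old"), ("b", "old")],
    )
    id_before = conn.execute("SELECT id FROM item WHERE vc = 'a'").fetchone()[0]

    # The conflict on the unique column 'vc' removes the existing row,
    # then a fresh row is inserted and receives a newly generated id.
    conn.execute("REPLACE INTO item (vc, val) VALUES (?, ?)", ("a", "new"))

    id_after, val = conn.execute(
        "SELECT id, val FROM item WHERE vc = 'a'"
    ).fetchone()
    conn.close()
    return id_before, id_after, val


if __name__ == "__main__":
    print(replace_demo())
```

    The value is replaced, but the row comes back with a different id, so anything elsewhere that pointed at the old id no longer matches.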

    You're right that sqlldr doesn't handle this case. (Its REPLACE keyword will remove the entire table's contents, as will TRUNCATE. The only other option for tables with existing data is 'APPEND'.) However, you can use sqlldr to get the data into a temp table, and then use a PL/SQL command to replace the records as needed (or insert the rows from the temp table that don't already have a corresponding record). You'd have to benchmark it to see what's the best method for your specific situation.
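    The temp-table route described above could look roughly like the following. This is a sketch under stated assumptions, not Oracle code: it uses Python's sqlite3 so it is runnable as-is, executemany stands in for the sqlldr bulk-load step, and the names "target", "staging", 'vc' and 'val' are hypothetical. The two SQL statements are plain correlated subqueries, so something similar should be expressible in most databases.

```python
import sqlite3


def bulk_update_via_staging(conn, rows):
    """Load rows into a staging table, then update/insert into 'target' by key 'vc'."""
    # Stand-in for the bulk loader: fill a temporary staging table.
    conn.execute("CREATE TEMP TABLE staging (vc TEXT PRIMARY KEY, val TEXT)")
    conn.executemany("INSERT INTO staging (vc, val) VALUES (?, ?)", rows)

    # Update rows that already exist in the target; their ids are left alone.
    conn.execute(
        "UPDATE target"
        " SET val = (SELECT s.val FROM staging s WHERE s.vc = target.vc)"
        " WHERE EXISTS (SELECT 1 FROM staging s WHERE s.vc = target.vc)"
    )

    # Insert staged rows that have no corresponding target row yet.
    conn.execute(
        "INSERT INTO target (vc, val)"
        " SELECT s.vc, s.val FROM staging s"
        " WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.vc = s.vc)"
    )

    conn.execute("DROP TABLE staging")
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE target ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " vc TEXT UNIQUE,"
        " val TEXT)"
    )
    conn.executemany(
        "INSERT INTO target (vc, val) VALUES (?, ?)",
        [("a", "old"), ("b", "old")],
    )
    bulk_update_via_staging(conn, [("a", "new"), ("c", "new")])
    print(conn.execute("SELECT id, vc, val FROM target ORDER BY id").fetchall())
```

    Because existing rows are updated in place rather than deleted and re-inserted, their autoincrement ids survive, which is the property the REPLACE approach lacks.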

    If you're going to be designing a table that requires regular bulk updates, I'd highly recommend finding a suitable natural key (even if it's a composite key) and not using a sequence or autoincrement.
