http://www.perlmonks.org?node_id=677693


in reply to Re: DBI: How to update 150K records efficiently
in thread DBI: How to update 150K records efficiently

If I'm not mistaken, the use of LOAD DATA INFILE (or oracle "sqlldr") is only suitable for doing insertions to a table -- it doesn't do updates. (But maybe I'm mistaken?)

In any case, since the OP is clearly talking about UPDATE, I think the "REPLACE" keyword would definitely not be good, unless the input to the process was in fact a complete set of replacement records for the table. And even then it might still be a really bad idea, if the table involves autoincrement primary keys that are involved in foreign key relations elsewhere. (Because reloading the whole table is apt to assign a completely different set of autoincrement keys.) Again, if I am wrong, I'd love to know...

  • Comment on Re^2: DBI: How to update 150K records efficiently

Replies are listed 'Best First'.
Re^3: DBI: How to update 150K records efficiently
by jhourcle (Prior) on Apr 01, 2008 at 15:32 UTC

    Although the primary use is insertion, they can be used for updates ... mysql directly, oracle in a more round-about way.

    The 'REPLACE' keyword in mysql's 'LOAD DATA' comment is record-by-record, based on the primary key of the table. If you're using autoincrement for for primary key, this isn't going to work for you. In the example given, however, there's a WHERE clause using the field 'vc', which I assumed was the primary key, or at least a unique index.

    You're right that sqlldr doesn't handle this case. (its REPLACE keyword will remove the entire table, as will TRUNCATE. The only other option for tables w/ existing data is 'APPEND'.) However, you can use sqlldr to get the data into a temp table, and then use a pl/sql command to replace the records as needed (or insert into the temp tables those that don't have a corresponding record already). You'd have to benchmark it to see what's the best method for your specific situation.

    If you're going to be designing a table that requires regular bulk updates, I'd highly recommend finding a suitable key (even if it's a composite key) and not use a sequence or autoincrement