http://www.perlmonks.org?node_id=1198017


in reply to Re: copying mysql table data to oracle table
in thread copying mysql table data to oracle table

While the logic is sound, it is not a good suggestion, and for more than one reason.

First: you say to start a transaction and run three queries. That is a bad idea for two reasons of its own. One, another user may insert the record between the check and the insert. Two, a user may delete the record between the check and the update. In both cases the transaction will fail and the record will be lost, unless you take a lock with either LOCK TABLE or SELECT ... FOR UPDATE. Both are rather inefficient.

Second: checking whether a record exists so that you can run a separate insert or update, rather than doing the check in a where clause, is absolute silliness. Indeed, almost any insert or update that may cause a collision should be using a where clause anyway. An insert...where not exists or an update...where exists is a single atomic statement, and both the where clause and the insert or update clause can use the cache for record checking. If the check is done in a different statement, it is likely still cached but may not be, because that statement is over and done with and its data may have been unloaded by the database.
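
To make the second point concrete, here is a minimal sketch of the check folded into the statement itself. The table and column names (target, staging, id, name) are hypothetical and not from the thread; the sketch assumes the MySQL rows have already been copied into a staging table on the Oracle side and that id is unique in both tables.

```sql
-- Hypothetical tables: target(id, name) is the Oracle table,
-- staging(id, name) holds the rows copied over from MySQL.

-- Insert only the rows that are not in the target yet.
INSERT INTO target (id, name)
SELECT s.id, s.name
FROM   staging s
WHERE  NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id);

-- Update only the rows that already exist in the target.
-- Assumes staging.id is unique, so the subquery returns one row.
UPDATE target t
SET    t.name = (SELECT s.name FROM staging s WHERE s.id = t.id)
WHERE  EXISTS (SELECT 1 FROM staging s WHERE s.id = t.id);
```

Each statement does its own existence check, so there is no gap between a separate check and the write for another session to slip into.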

Third: we're talking Oracle here, and Oracle has a MERGE ("upsert") statement, which does exactly what the OP wants to do.
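
A minimal sketch of such a MERGE, assuming a hypothetical staging(id, name) table holding the MySQL rows and a hypothetical target(id, name) table in Oracle, matched on id:

```sql
-- Hypothetical tables: staging(id, name) is the source,
-- target(id, name) is the Oracle table being loaded.
MERGE INTO target t
USING staging s
ON (t.id = s.id)
WHEN MATCHED THEN
  -- The row already exists: refresh its non-key columns.
  UPDATE SET t.name = s.name
WHEN NOT MATCHED THEN
  -- The row is new: insert it.
  INSERT (id, name) VALUES (s.id, s.name);
```

The column named in the ON clause cannot appear in the UPDATE SET list, which is why only name is updated here.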

Fourth: your suggestion includes non-SQL checking, which means either Perl or PL/SQL. That means invoking an engine (PL/SQL or Perl) outside of SQL, which is not the most efficient method. Since the job requires SQL anyway (well, unless data loading is used, but I do not think that is one of the options here) and can be done completely in SQL, using anything other than SQL (even PL/SQL!) is redundant and slower.

Fifth: committing the transaction for each record is not only redundant, it slows down the entire operation and removes the ability to roll back more than one record.

Sixth: even if this approach were used, a bulk insert would be more efficient.
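
As a rough illustration of a bulk load with a single commit, here is a sketch using hypothetical staging(id, name) and target(id, name) tables. It assumes none of the staging rows exist in the target yet. The APPEND hint is optional and is only one way to do a bulk insert in Oracle.

```sql
-- Hypothetical tables: staging(id, name) holds the rows from MySQL,
-- target(id, name) is the Oracle table being loaded.

-- One statement loads every row; the APPEND hint is optional and
-- asks Oracle for a direct-path insert.
INSERT /*+ APPEND */ INTO target (id, name)
SELECT s.id, s.name
FROM   staging s;

-- One commit for the whole load. Until this point a single
-- ROLLBACK would undo all of the inserted rows.
COMMIT;
```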

Anyway, the logic in your post is how a programmer might think. But this is a database, where we work with sets. Think set-based logic, not record-based logic. Record-based processing is just plain inefficient.