PerlMonks  

Re: PERL DB Optimization

by Tuppence (Pilgrim)
on Dec 18, 2003 at 23:43 UTC


in reply to PERL DB Optimization

Before I offer my thoughts on your question, I must first comment on one issue I see in your code: it does not use bind params. Apologies if you already know this, but...

SQL queries should follow this form:

my $sth = $dbh->prepare('update data_set2 set data = ? where key2 = ?');
$sth->execute($val1, $key);

This gains you several benefits: your data is escaped properly, statement handles can be cached (i.e. the same statement handle serves multiple updates, which cuts time because the DB server does not have to re-parse the SQL), and you don't get people complaining at you to use bind params ;)
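To make the "same statement handle for multiple updates" point concrete, here is a minimal sketch. It assumes DBD::SQLite is installed and uses an in-memory database as a stand-in for your real server; the %new_data hash and its values are made up for illustration, and only the table and column names come from the example above.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; swap the DSN for your own database.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Throwaway table so the sketch runs on its own.
$dbh->do('CREATE TABLE data_set2 (key2 TEXT PRIMARY KEY, data TEXT)');
my $seed = $dbh->prepare('INSERT INTO data_set2 (key2, data) VALUES (?, ?)');
$seed->execute($_, 'old') for qw(a b);

# Hypothetical new values; the first one contains a quote that would
# break SQL built by string interpolation.
my %new_data = (a => "O'Reilly", b => 'plain');

# Prepare once, outside the loop ...
my $sth = $dbh->prepare('update data_set2 set data = ? where key2 = ?');

# ... and execute once per row with different values.
my $updated = 0;
for my $key (sort keys %new_data) {
    $updated += $sth->execute($new_data{$key}, $key);
}
print "rows updated: $updated\n";
```

The value with the embedded quote goes in untouched because it never becomes part of the SQL text.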

Now, on to your question.

I have done several different styles of what you are trying to do, and the highest performance I have been able to get out of the process came from putting more intelligence in the SQL and less in the Perl. If you can make the database do more of the work, the Perl has less data to handle and will therefore run faster. If your database does not support subselects and joins, this may be harder than it otherwise would be.

For instance, your example looks like two problems:
  1. creating new records for those that do not yet exist
  2. updating records that already exist

The first can be handled by getting a list from the database of only those records that don't exist, i.e.

SELECT id_field, field_to_update
FROM table_1
WHERE id_field NOT IN (SELECT join_id_field FROM table_2)

and the second can be handled with slightly more complicated SQL, like this:

SELECT src.id_field, src.field_to_update
FROM table_1 src, table_2 dest
WHERE src.id_field = dest.join_id_field
  AND src.field_to_update != dest.field_to_be_updated

This will get you a list of the records that need to be updated.
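Putting the two queries to work from Perl might look roughly like this. It is only a sketch: it assumes DBD::SQLite with an in-memory database, the sample rows are invented, and the column types are guesses. The table and column names are the placeholder ones from the queries above.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; the schema below is illustrative.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE table_1 (id_field INTEGER PRIMARY KEY, field_to_update TEXT)');
$dbh->do('CREATE TABLE table_2 (join_id_field INTEGER PRIMARY KEY, field_to_be_updated TEXT)');

# Made-up rows: 1 is unchanged, 2 differs, 3 is missing from table_2.
my $seed1 = $dbh->prepare('INSERT INTO table_1 VALUES (?, ?)');
$seed1->execute(@$_) for [1, 'same'], [2, 'new value'], [3, 'brand new'];
my $seed2 = $dbh->prepare('INSERT INTO table_2 VALUES (?, ?)');
$seed2->execute(@$_) for [1, 'same'], [2, 'old value'];

# Let the database work out which rows need attention.
my $missing = $dbh->selectall_arrayref(q{
    SELECT id_field, field_to_update
    FROM table_1
    WHERE id_field NOT IN (SELECT join_id_field FROM table_2)
});
my $changed = $dbh->selectall_arrayref(q{
    SELECT src.id_field, src.field_to_update
    FROM table_1 src, table_2 dest
    WHERE src.id_field = dest.join_id_field
      AND src.field_to_update != dest.field_to_be_updated
});

# Perl only touches the rows that actually need writing.
my $ins = $dbh->prepare('INSERT INTO table_2 (join_id_field, field_to_be_updated) VALUES (?, ?)');
my $upd = $dbh->prepare('UPDATE table_2 SET field_to_be_updated = ? WHERE join_id_field = ?');

$dbh->begin_work;
$ins->execute(@$_) for @$missing;               # (id, value)
$upd->execute($_->[1], $_->[0]) for @$changed;  # (value, id)
$dbh->commit;

printf "inserted %d, updated %d\n", scalar @$missing, scalar @$changed;
```

One thing to watch: if join_id_field can be NULL, NOT IN returns no rows at all, and != skips any pair where either side is NULL, so you may need NOT EXISTS or explicit IS NULL checks for nullable columns.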

To take this a step further, if your database supports it, you can even do the updates purely on the database side, although that query is much more difficult and I do not have the time or inclination to figure it out for an example problem :)
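For what it is worth, one possible shape for the database-side version is an INSERT ... SELECT followed by an UPDATE with a correlated subquery. This is a sketch under assumptions, not a tested recipe for your database: it runs against DBD::SQLite in memory with invented rows, and other servers may want different syntax (or offer something better, such as a multi-table UPDATE).

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; schema and rows are illustrative.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE table_1 (id_field INTEGER PRIMARY KEY, field_to_update TEXT)');
$dbh->do('CREATE TABLE table_2 (join_id_field INTEGER PRIMARY KEY, field_to_be_updated TEXT)');
my $seed1 = $dbh->prepare('INSERT INTO table_1 VALUES (?, ?)');
$seed1->execute(@$_) for [1, 'same'], [2, 'new value'], [3, 'brand new'];
my $seed2 = $dbh->prepare('INSERT INTO table_2 VALUES (?, ?)');
$seed2->execute(@$_) for [1, 'same'], [2, 'old value'];

$dbh->begin_work;

# Problem 1: create the records that do not exist yet, in one statement.
my $inserted = $dbh->do(q{
    INSERT INTO table_2 (join_id_field, field_to_be_updated)
    SELECT id_field, field_to_update
    FROM table_1
    WHERE id_field NOT IN (SELECT join_id_field FROM table_2)
});

# Problem 2: update only the records whose value differs.
my $updated = $dbh->do(q{
    UPDATE table_2
    SET field_to_be_updated = (
        SELECT src.field_to_update
        FROM table_1 src
        WHERE src.id_field = table_2.join_id_field
    )
    WHERE EXISTS (
        SELECT 1
        FROM table_1 src
        WHERE src.id_field = table_2.join_id_field
          AND src.field_to_update != table_2.field_to_be_updated
    )
});

$dbh->commit;
print "inserted: $inserted, updated: $updated\n";
```

No row data crosses the wire here; Perl just sends two statements and reads back the affected-row counts.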

Hope that helps
