PerlMonks  

Batch INSERT performance

by Herkum (Parson)
on May 05, 2009 at 19:42 UTC [id://762036]

Herkum has asked for the wisdom of the Perl Monks concerning the following question:

I am saving a bunch of objects to a database, and right now I am saving each object individually with its own INSERT statement. So 1,000 objects means 1,000 DBI calls.

I was thinking I could get a performance boost if I combined them into one large SQL statement and did a batch insert like the one below:

INSERT INTO table (description) VALUES (?),(?),(?) etc...

However, before I go down this road, I was wondering whether I would really get any performance benefit. The statement handles are already prepared, so that is not an issue, but if I insert 1,000 objects, that means I am binding about 40,000 variables to this one INSERT statement. Would that really bring any performance improvement, or would I be better served by interpolating the values into the SQL?

Anyone have advice on the subject?
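A minimal pure-Perl sketch of building such a multi-row statement. The table and column names are hypothetical, and the `$dbh->do` call at the end assumes a live handle, so it is shown only as a comment:

```perl
use strict;
use warnings;

# Build one multi-row INSERT plus a flat bind list for a batch of rows.
# $rows is an arrayref of arrayrefs, one inner arrayref per object.
sub build_batch_insert {
    my ($table, $cols, $rows) = @_;
    my $tuple = '(' . join(',', ('?') x @$cols) . ')';
    my $sql   = sprintf 'INSERT INTO %s (%s) VALUES %s',
        $table, join(',', @$cols), join(',', ($tuple) x @$rows);
    my @bind  = map { @$_ } @$rows;    # flatten rows into one bind list
    return ($sql, \@bind);
}

# With a real DBI handle you would then do something like:
#   my ($sql, $bind) = build_batch_insert('table', ['description'], \@rows);
#   $dbh->do($sql, undef, @$bind);
```

Note that the generated SQL depends on the batch size, so a prepared handle is only reusable across batches of the same size; a common trick is one prepared statement for full batches plus single-row inserts for the remainder.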

Replies are listed 'Best First'.
Re: Batch INSERT performance
by ig (Vicar) on May 05, 2009 at 20:28 UTC

      I believe that the information is what I wanted to know, THX!

Re: Batch INSERT performance
by samtregar (Abbot) on May 05, 2009 at 20:10 UTC
    What database are you using? This is faster with MySQL, often by a significant margin. Even faster is LOAD DATA, but that requires some moderately annoying data formatting and error handling is harder (error ignoring is easy though!).

    -sam

      The database is MySQL.

      LOAD DATA seems to require files, and I have no intention of writing files just to do the data import. I admit I did not know about the LOAD DATA option, though; I might use it for something else.

      However, would placeholders be problematic for a large data import?

        Herkum:

        You might not want to be so quick to dismiss the idea. I'm not knowledgeable about MySQL, but for MS SQL, Sybase and Oracle, the fastest way you can load data into a table is to bulk load it from a flat file. And we're not talking about a five percent increase in speed: it's normally an order of magnitude faster.

        ...roboticus

        Placeholders should work fine. I've used placeholders for statements with tens of thousands of parameters.

        -sam
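A minimal sketch of the flat-file route described above, assuming MySQL's LOAD DATA LOCAL INFILE and a hypothetical `items(description)` table. The `$dbh->do` call needs a live connection with `local_infile` enabled on both client and server, so it is left commented out:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the rows as one line each; real data with tabs or newlines in it
# would need escaping per MySQL's LOAD DATA rules.
my @rows = ('first item', 'second item');
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "$_\n" for @rows;
close $fh or die "close: $!";

# The statement MySQL would run (in real code, quote the path with
# $dbh->quote($path) rather than sprintf):
my $sql = sprintf "LOAD DATA LOCAL INFILE '%s' INTO TABLE items (description)",
    $path;
# $dbh->do($sql);
```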

Re: Batch INSERT performance
by Jenda (Abbot) on May 05, 2009 at 23:47 UTC

    With most databases you'd get a nice performance boost from something as simple as turning AutoCommit off and committing only once every N rows. (The N depends on a lot of things: the row size, the database server, the table size, ... you'd have to benchmark it.) I doubt that interpolating the values into the SQL is the right thing to do, but it depends on the database.

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.
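The commit-every-N-rows suggestion above can be sketched as a small helper. The callbacks stand in for `$sth->execute` and `$dbh->commit`, and the batch size is whatever your benchmarking picks:

```perl
use strict;
use warnings;

# Run $insert->() for each row, calling $commit->() every $batch rows
# and once more at the end for any partial batch.
sub insert_in_batches {
    my ($rows, $batch, $insert, $commit) = @_;
    my $since_commit = 0;
    for my $row (@$rows) {
        $insert->($row);
        if (++$since_commit >= $batch) {
            $commit->();
            $since_commit = 0;
        }
    }
    $commit->() if $since_commit;    # flush the final partial batch
}

# In real code, with a live handle:
#   $dbh->{AutoCommit} = 0;
#   insert_in_batches(\@rows, 500,
#       sub { $sth->execute(@{ $_[0] }) },
#       sub { $dbh->commit });
```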
