Proper Syntax for DBI multiple row insert

by Rodster001 (Pilgrim)
on Dec 10, 2015 at 23:43 UTC [id://1149972]

Rodster001 has asked for the wisdom of the Perl Monks concerning the following question:

I found this code snippet but it doesn't quite work:
my @records = ( ['value1', 'value2'], ...) ;
my $values = join ", ", ("( ?, ? )" x @records);
my $query = "INSERT INTO $table (field1, field2) VALUES $values";
my $sth = $dbh->prepare($query);
$sth->execute(map { @$_ } @records);
I modified it a bit to get it to work (below). The above code doesn't set the $values variable correctly; that join doesn't do what it appears to.
my @list;
foreach (@records) {
    push(@list, "( ?, ?, ? )");
}
my $values = join(',', @list);
That works, although it's not as elegant. I've never seen a list created quite like this before, and I am curious:
join ", ", ("( ?, ? )" x @records);
Can someone correct that bit to show what the original author was trying to do?

UPDATE

I think I confused this post by leaving the MySQL stuff in there (and choosing a poor title). This is actually a Perl question, but I didn't state it clearly. What I should have asked is how to properly use the repetition operator inside a join to produce a certain string. The "x" operator is used incorrectly in my example (it needs a parenthesized list on its left to repeat a list, and my closing paren is in the wrong place). Mr. Muskrat (below) answered my question, and also showed the broken string alongside the fixed one.

Replies are listed 'Best First'.
Re: Proper Syntax for DBI multiple row insert
by Mr. Muskrat (Canon) on Dec 11, 2015 at 16:03 UTC

    The x operator does not work that way. Consider this:

    #!/bin/env perl
    use strict;
    use warnings;
    use Data::Dumper;

    my @records = ( ['value1', 'value2'], ['value3', 'value4'], ['value5', 'value6'] );

    my @broken = ('( ?, ? )' x @records);
    print 'broken: ', Dumper \@broken;

    my @fixed = ('( ?, ? )') x @records;
    print 'fixed: ', Dumper \@fixed;

    __DATA__
    broken: $VAR1 = [
              '( ?, ? )( ?, ? )( ?, ? )'
            ];
    fixed: $VAR1 = [
              '( ?, ? )',
              '( ?, ? )',
              '( ?, ? )'
            ];

    This example leads us to the corrected line of your original code:

    my $values = join ", ", ("( ?, ? )") x @records;
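    For completeness, here is that corrected line dropped back into the snippet from the original post (an untested sketch; it assumes $dbh and $table already exist and uses two example records):

    my @records = ( ['value1', 'value2'], ['value3', 'value4'] );

    # one "( ?, ? )" group per record, joined with commas
    my $values = join ", ", ("( ?, ? )") x @records;
    my $sth    = $dbh->prepare("INSERT INTO $table (field1, field2) VALUES $values");

    # flatten the array-of-arrayrefs into a single list of bind values
    $sth->execute(map { @$_ } @records);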

      Great, thank you. That's what I was looking for.
Re: Proper Syntax for DBI multiple row insert
by runrig (Abbot) on Dec 10, 2015 at 23:50 UTC
    First, this has nothing to do with DBI; it seems to be some special syntax for a MySQL multiple-row insert. The best thing to do would be to print the SQL statement before you execute it, to make sure you have the statement you think you do. I don't see offhand why the first example wouldn't work, and "doesn't work" is not a very good description of the problem.
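
    A quick way to do that check (a minimal, untested sketch using the variables from the original snippet) is to print the generated statement before executing it, or to turn on DBI's built-in tracing:

    print "SQL: $query\n";   # eyeball the generated statement
    $dbh->trace(1);          # or let DBI log each statement and its bind values
    $sth->execute(map { @$_ } @records);
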
Re: Proper Syntax for DBI multiple row insert
by jcb (Parson) on Dec 11, 2015 at 04:37 UTC

    While this is standard SQL, or at least widely enough supported that PostgreSQL also allows a single INSERT to add multiple rows, the snippet builds the SQL query dynamically. That is generally a bad habit, since it creates a risk of SQL injection if user-provided data is ever used to build the query text.

    Unless you have unusual requirements, a better way to achieve this is to prepare a statement that inserts a single row and then execute that statement once for each row, as the DBI manual (section "Outline Usage") suggests: (untested)

    my @records = ( ['value1', 'value2'], ...) ;
    my $sth = $dbh->prepare("INSERT INTO $table (field1, field2) VALUES (?,?)");
    foreach my $valuesref (@records) { $sth->execute(@$valuesref) }

    The other reason to do it this way is that the statement handle can be kept around and reused if records are to be inserted into the same table more than once. You can use $dbh->prepare_cached instead of $dbh->prepare, but see the caveats that are explained in the DBI manual.
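    A minimal, untested sketch of that reuse pattern (the insert_rows() helper name and the two example columns are only illustrative, not anything from this thread or the DBI manual):

    sub insert_rows {
        my ($dbh, $table, @rows) = @_;

        # prepare_cached returns the same statement handle for identical SQL,
        # so repeated calls for one table avoid re-preparing the statement
        my $sth = $dbh->prepare_cached(
            "INSERT INTO $table (field1, field2) VALUES (?, ?)"
        );
        $sth->execute(@$_) for @rows;
    }

    insert_rows($dbh, $table, ['value1', 'value2']);
    insert_rows($dbh, $table, ['value3', 'value4']);   # reuses the cached handle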

Re: Proper Syntax for DBI multiple row insert
by Apero (Scribe) on Dec 11, 2015 at 13:28 UTC

    I did a write-up on the significance of network latency when using DBI against a remote RDBMS that you will probably find useful; you can find it here. As I demonstrate in that discussion, the impact of per-row execute calls (even when they are all done inside a single transaction) is highly significant if your RTT to the database server is much larger than on a local LAN.

    In my lab setup (using the normal delays we get at $work between two cross-country data centers), I show a 5400% slowdown using a per-row approach. Based on this evidence, I disagree with the earlier suggestions to only ever insert one row at a time; if you are on a local network that might be fine, but as always, you should do your own testing and tuning. Often there's no need to complicate things early (also known as premature optimization).

    As long as you use prepared statements, the "built" query is safe. The fact that you're dynamically building the (?),(?) sequence makes no difference to the protection you get from placeholders, as long as the actual data is passed in at execute time.

    The one hint I will give you is that it's usually also a good idea to cap the number of rows you send; if you're inserting many thousands of rows, you may want to batch them in groups of no more than 500 or 1000, committing after each batch. This limits the impact of a network or other error partway into an insert of a million rows, for instance.
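
    A rough, untested sketch of that batching idea (the 500-row cap, the column names and the helper variables are only illustrative):

    use List::Util 'min';

    my $batch_size = 500;
    $dbh->{AutoCommit} = 0;          # commit explicitly, once per batch

    for (my $i = 0; $i < @records; $i += $batch_size) {
        my $last   = min($i + $batch_size - 1, $#records);
        my @batch  = @records[$i .. $last];
        my $values = join ", ", ("( ?, ? )") x @batch;
        my $sth    = $dbh->prepare("INSERT INTO $table (field1, field2) VALUES $values");
        $sth->execute(map { @$_ } @batch);
        $dbh->commit;                # a failure only costs the current batch
    }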

    I hope my writeup and findings can be useful for you. Good luck!

Re: Proper Syntax for DBI multiple row insert
by graff (Chancellor) on Dec 11, 2015 at 05:13 UTC
    I would second the advice in the previous reply: you're better off sticking with a simple prepared statement with suitable placeholders to insert one row per execution, and just execute the statement in a loop over the rows of data to be inserted.

    If you're worried about the relative overhead of many execute calls (one row at a time) versus one execute call (with many rows at once), bear in mind that if you turn off AutoCommit and commit explicitly after all the row executions have been done, there shouldn't be any serious difference in run time relative to doing many rows in one execute call.

    I could be wrong about that -- I've never tried a multi-row insert via DBI, so I haven't had occasion to benchmark it. But I do know that when doing lots of single-row executions (for inserts or updates), it can make a BIG difference (orders of magnitude) if you commit just once at the end -- or, if you're doing tens or hundreds of thousands of row inserts, commit every few thousand rows, just to keep the transaction size from getting out of hand.
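
    For reference, a minimal untested sketch of the single-row loop with periodic commits described above (the 5000-row commit interval and the column names are arbitrary):

    $dbh->{AutoCommit} = 0;
    my $sth = $dbh->prepare("INSERT INTO $table (field1, field2) VALUES (?, ?)");

    my $count = 0;
    for my $row (@records) {
        $sth->execute(@$row);
        $dbh->commit unless ++$count % 5000;   # keep each transaction a manageable size
    }
    $dbh->commit;                              # commit whatever is left in the last batch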
