http://www.perlmonks.org?node_id=907835


in reply to Re^2: DBI, place holders and CGI forms
in thread DBI, place holders and CGI forms

You generally use placeholders or DBI's quote but usually not both. I never use quote myself. placeholders are generally the way to go because you can often prepare the SQL once with the placeholders and execute as many times as you like - thus saving parsing and preparing of the SQL repeatedly.

When DBI says don't quote placeholders it means don't run the quote method on placeholders i.e., don't do:

$s = $h->prepare(/select * from table where a = ?/); $s->execute($h->quote("myvalueforcolumna"));

You use quote when you are building all the SQL yourself and not using placeholders so the above would be:

$s = $h->prepare(q/select * from table where a = / . $h->quote("myva +lueforcolumna");

Try and stick with placeholders. I don't know your schema but you can usually do something like:

select col1, col2, col3 from tab_name where (? is null or foo = ?) a +nd (? is null or bar = ?) and (? is null or baz = ?)

then pass each value for foo, bar and baz twice

Replies are listed 'Best First'.
Re^4: DBI, place holders and CGI forms (quote++)
by tye (Sage) on Jun 02, 2011 at 18:25 UTC
    I never use quote myself.

    Actually, quote() is the more general solution. I've run into quite a few different problems trying to use placeholders. And I've never noticed any performance problem with my failure to re-use prepared queries. I've actually had more problems due to people being too aggressive about trying to re-use prepared queries such as using prepare_cached() way too much.

    A significant fraction (perhaps a majority) of the environments where I've done significant DBI work were such that placeholders had no performance impact at all. For example, the default with DBD::mysql seems to still be to not do server-side prepares.

    At more than one recent employer I've experienced upgrading DBD::Pg such that prepares started happening server-side and then having to disable server-side prepares due to more than one resulting problem (the query planner got significantly stupider in a few important cases, it was incompatible with PgBouncer or other middle-layer tools, and other problems).

    We had some breakage at PerlMonks because "limit ?" worked fine on older versions of DBD::mysql but then stopped working after some ugprade.

    And I've had significant problems with placeholders when using DBD::ODBC. The first batch of problems were around certain data type checking such that "where somedate = '2011-01-01'" is no problem but "where somedate = ?" plus passing '2011-01-01' to execute() gave some "type mismatch" error.

    The scariest problem was using DBD::ODBC and having "where somedate between cast(? as datetime) and cast(? as datetime)" work without emitting any errors or warnings but then returning the wrong data. But, of course:

    join ' ', ... "where somedate between", $dbh->quote($start), "and", $dbh->quote($end),

    worked exactly as expected (whether "cast( ... as datetime)" was included or not).

    So, I like being able to use placeholders. But way too often that actually requires being able to disable server-side prepares which then just boils down to $dbh->quote() except that the DBD module is doing that call for you (and thus removes your claimed advantage).

    Actually, the majority of my experience using DBI is via some wrapper (quite a few different ones, some extremely thin, some only moderately thin) that takes my column names and column values and assembles at least some of the SQL for me.

    - tye        

      Preparing a statement once and executing many times is nearly always faster than preparing it many times. Look at my simple benchmark I posted at Re^5: DBI, place holders and CGI, 3s vs 16s. Now, I now this was a lot of iterations (10000) but if the SQL is a lot more complex it can make a much bigger difference. As I'm sure you are aware, the database engine is having to parse the SQL and work out an execution plan every time if you keep calling prepare (and yes, I know some databases cache plans). However, that ignores the extra round trips the DBD is doing as well and for some that can be very expensive. Most DBDs parse the SQL themselves to look for placeholders (some convert the SQL, e.g., ? becomes :1 or $1), send the SQL to the server, make another call(s) to find out details about the result set then call execute with the placeholders. So, I don't think recommending quote is good advice in general although I concede it is the "more general solution" most likely to work with all DBDs.

      I cannot really comment much on MySQL as I don't use it these days but I feel I can comment on DBD::ODBC (as I maintain it) and DBD::Oracle (as I help maintain it). Both are demonstrably faster reusing a prepared statement.

      Regarding placeholders in DBD::ODBC. The first thing you need to recognise is that few open source ODBC drivers implement even 70% of ODBC and the 70% they do implement often falls short in many different ways. The main ODBC API DBD::ODBC wants for placeholder support is SQLDescribeParam and few (if any) open source ODBC drivers support it at all. Most commercial drivers do support SQLDescribeParam although admittedly some don't do so well when the SQL is complex (more below). Without SQLDescribeParam support DBD::ODBC has to bind everything as a string to be interpreted as a string meaning you often end up having to add cast etc around your strings all over the place (especially dates/times).

      As a side point, few people seem to know ODBC has a defined syntax for dates/times using {d 'yyyy-mm-dd'}, {t 'hh:mm:ss.fff'} and {ts 'yyyy-mm-dd hh:mm:ss.fff'} and all drivers should at least support that.

      Why is SQLDescribeParam not supported that often? Because it is difficult to implement and usually requires reorganising your SQL. e.g., you do "insert into table (a) values(?)" and the driver basically reorganises this into "select a from table" so it can then find out about column a and hence the type, size etc of the placeholder. When the SQL is straight forward like this example it is fairly easy but even a simple query with an insert can be difficult to work out e.g., "select some_func(?) from table where some_column = 1", how does the driver reorganise that SQL to describe the placeholder when it does not refer to a column! Mostly, they don't and SQLDescribeParam will fail but in some rare cases I've seen the driver will actually reorganise the SQL incorrectly (but so it works) and then incorrectly describe the parameter.

      So I suspect your apparent bad experience with placeholders is down to bugs in the ODBC driver or database engine. They could be DBD::ODBC but as I seldom receive any bug report associated with an open source ODBC driver (other than freeTDS which seems to be used a lot) it is difficult to know.

      As for DBD::Oracle, I use placeholders all over the place in a large system and I've never had any problem with them that wasn't a bug and fixed. If you are repeatedly executing the same SQL then preparing it once is definitely faster and you can even see this in the enterprise manager stats.

      I don't use prepare_cached and have no need to as most of my Perl is running in daemons so I prepare once.

Re^4: DBI, place holders and CGI forms
by Anonymous Monk on Jun 03, 2011 at 07:57 UTC
    Thanks for this reply. I just tried this in sqlplus and I'm starting to understand your advice. Thank you again. I'll test this with DBI in a while. Is there any way via DBI or SQL plus I can easily benchmark two statements? Thank you.

      I've seen various ways in sqlplus to time queries. Here is a simple one:

      set autotrace on; set timing on; select count(*) from table; COUNT(*) ---------- 164078859 Elapsed: 00:01:02.02 Execution Plan ---------------------------------------------------------- Plan hash value: 2319838783 ---------------------------------------------------------------------- +------- | Id | Operation | Name | Rows | Cost (%CPU)| Time + | ---------------------------------------------------------------------- +------- | 0 | SELECT STATEMENT | | 1 | 87965 (2)| 00:17: +36 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| PRICE_IDX18 | 152M| 87965 (2)| 00 +:17:36 | ---------------------------------------------------------------------- +------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 321153 consistent gets 321106 physical reads 0 redo size 422 bytes sent via SQL*Net to client 420 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed

      Using Perl DBI you can quickly do something like:

      use DBI; use strict; use warnings; use Benchmark; my $h = DBI->connect('dbi:Oracle:host=xxx;sid=yyy','xxx','xxx'); my $s = $h->prepare(q/select * from action where action_id = ?/); timethese(10000, { 'placeholders' => sub {placeholders($s)}, 'quote' => sub {quote($h)}}); sub placeholders { $_[0]->execute(1); } sub quote { $s = $h->prepare("select * from action where action_id = " . $_[0] +->quote(1)); $s->execute; }
      $ perl bm.pl Benchmark: timing 10000 iterations of placeholders, quote... placeholders: 3 wallclock secs ( 0.43 usr + 0.20 sys = 0.63 CPU) @ +15873.02/s (n=10000) quote: 16 wallclock secs ( 5.51 usr + 1.32 sys = 6.83 CPU) @ 14 +64.13/s (n=10000)
        So the advice works great for selects, I'm now thinking about how can I do this with insert statements, the same idea, I don't know what columns have been submitted. Will the same idea work here?