PerlMonks  

Re^3: same query, different execution, different performance

by runrig (Abbot)
on Feb 14, 2012 at 17:07 UTC (#953726)


in reply to Re^2: same query, different execution, different performance
in thread same query, different execution, different performance

Don't even think about $dbh->quote(), use SUBSTR instead of LIKE whenever you need to test the start of a string against a LIKE-pattern

First, sometimes you should think about using quote(). Second, if you use SUBSTR(), the database again won't use the index on the column unless your database supports function-based indexes (I assume Postgres does, and that there's an index on Lower(a)) and you actually have a function-based index on that expression.


Re^4: same query, different execution, different performance (substr)
by tye (Cardinal) on Feb 14, 2012 at 17:58 UTC

    No, it just depends on the query optimizer. Some databases have query optimizers that know how to use an index when told "LIKE 'blah%'". Some databases have query optimizers that know how to use an index when told the equivalent thing using SUBSTR(). Some databases have optimizers that know how to do both. Some neither.

    - tye        

      Some databases have query optimizers that know how to use an index when told "LIKE 'blah%'".

      I think most query optimizers will know to use an index on "LIKE 'blah%'", but not when the query plan is determined at prepare time (for those sorts of databases), and the query optimizer is told "LIKE ?" and only later is given the argument "blah%".
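One way to see why the planner needs the literal: for a pattern with a fixed prefix, an index lookup amounts to a range scan from the prefix up to the prefix with its last character bumped by one. A planner that only sees "LIKE ?" cannot derive that range at prepare time. The following is a minimal Python sketch of the range computation, not something proposed in this thread. `prefix_range` is a hypothetical helper (not part of DBI or of any database), and it assumes plain code-point ordering, which many locale-aware collations do not follow.

```python
def prefix_range(prefix):
    """Return (lo, hi) with s.startswith(prefix) exactly when lo <= s < hi.

    Assumption: strings compare by code point (like a "C"/binary collation).
    """
    if not prefix:
        raise ValueError("an empty prefix matches everything; no range to build")
    last = prefix[-1]
    if ord(last) == 0x10FFFF:
        raise ValueError("cannot increment the last character")
    # Upper bound: same prefix with its final character incremented by one.
    return prefix, prefix[:-1] + chr(ord(last) + 1)


lo, hi = prefix_range("blah")

# Illustrative data only: compare the range test against a plain prefix test.
rows = ["blag", "blah", "blahblah", "blah~", "blai", "car"]
in_range = [s for s in rows if lo <= s < hi]
by_prefix = [s for s in rows if s.startswith("blah")]
print(lo, hi, in_range == by_prefix)
```

Both bounds can then be passed as ordinary placeholders (for example `WHERE col >= ? AND col < ?`), which a planner should generally be able to match against a regular index even at prepare time. Whether the range is equivalent to the LIKE depends on the column's collation.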

      Some databases have query optimizers that know how to use an index when told the equivalent thing using SUBSTR().

      A quick test with Oracle (update: and Sybase, and from what I recall, Informix) seems to show that it doesn't know to use an index with SUBSTR, and some quick googling on Postgres seems to imply that a function based index would be required there also. I'm not saying there's no database smart enough to use a regular index on a column for a substring search, I just haven't seen it yet.
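The difference between a regular index and a function-based one can be checked by asking the database for its plan. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, because it needs no server. SQLite is not one of the databases discussed here, and its planner's choices say nothing about Oracle, Sybase, Informix or Postgres. The table `t`, its columns and the index names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [("blah one", "x"), ("blah two", "y"), ("carz", "z"), ("other", "w")],
)
conn.execute("CREATE INDEX t_a_idx ON t (a)")  # regular index on the column

QUERY = "SELECT * FROM t WHERE substr(a, 1, 4) = 'blah'"


def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)


plan_plain = plan(QUERY)  # only the regular index exists at this point

# Expression ("function-based") index on exactly the expression in the WHERE.
conn.execute("CREATE INDEX t_a_substr_idx ON t (substr(a, 1, 4))")
plan_expr = plan(QUERY)

matches = sorted(row[1] for row in conn.execute(QUERY))
print(plan_plain)
print(plan_expr)
print(matches)
```

In SQLite the expression in the query has to be written exactly as in the CREATE INDEX statement for the index to be considered. Other databases have their own rules, so the only reliable check is the plan output of the database in question.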

        (Straying a bit from the OP, but this is fun, no? And anyway, I rather expect this may be interesting/useful for him too).

        An alternative (in postgresql) would be to use regex-index, which can be used when the submitted search-string or regex is anchored:

            select count(*) from azjunk6; -- 1 million rows random data:
              count
            ---------
             1000000
            (1 row)

            -- without index:
            select * from azjunk6 where txt ~ '^car[sz]';
                                                   txt
            ----------------------------------------------------------------------------------
             carsxbutsvamedynximrftmimgtzirtuorik lunamb qpjvwmixlxpmcu mm rzotjjnfxr syfrj
             carzfhndjznvpgcpwqb fp bqpljspqqpzfbbswefzs pjoocqztqkjxyvbr qalcfzmebezz ftmyi
             carziicmi zzzvt beqsupgdwkhdg luvvmhhay bj b r soaiyfftiqgq hs brdzafdztmtvfvrdn
             carziogaizohcqcphs ksucyeod q yvfallob pctvmwplm igzsqalyy dqsjpiikxwyyxesenbeq
             carzw rcfwlqcweao jzeyxkchgc g vyvujtbsbeiewj inuelmldsa mpjevzmo pcpwi kfajug
             carzxrk qyk palimcwokbw hbdcsmxehcsnrop prrokygyi ssngegzksrzvged cuoxr yozt ca
            (6 rows)

            Time: 1147.420 ms

            -- now make a text_pattern_ops index:
            create index azjunk6_text_pattern_ops_idx on azjunk6 (txt text_pattern_ops);
            Time: 7282.579 ms

            -- with index:
            select * from azjunk6 where txt ~ '^car[sz]';
                                                   txt
            ----------------------------------------------------------------------------------
             carsxbutsvamedynximrftmimgtzirtuorik lunamb qpjvwmixlxpmcu mm rzotjjnfxr syfrj
             carzfhndjznvpgcpwqb fp bqpljspqqpzfbbswefzs pjoocqztqkjxyvbr qalcfzmebezz ftmyi
             carziicmi zzzvt beqsupgdwkhdg luvvmhhay bj b r soaiyfftiqgq hs brdzafdztmtvfvrdn
             carziogaizohcqcphs ksucyeod q yvfallob pctvmwplm igzsqalyy dqsjpiikxwyyxesenbeq
             carzw rcfwlqcweao jzeyxkchgc g vyvujtbsbeiewj inuelmldsa mpjevzmo pcpwi kfajug
             carzxrk qyk palimcwokbw hbdcsmxehcsnrop prrokygyi ssngegzksrzvged cuoxr yozt ca
            (6 rows)

            Time: 12.524 ms  --> 100x faster

        (It can be handy to have both a 'normal' btree index *and* such a text_pattern_ops regex index.)

        See also: PostgreSQL index opclasses

        You can get another interesting index type from pg_trgm, a PostgreSQL extension. It gives you indexed trigrams rather than indexed regexen: PostgreSQL pg_trgm extension. (Disadvantage: large index size.)
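To make "trigram" concrete, here is a small Python sketch of how a string breaks into trigrams. It is an approximation of what pg_trgm is documented to do (lower-casing, ignoring non-alphanumeric characters, padding each word with two leading spaces and one trailing space), not the extension's actual code. The function names `trigrams` and `similarity` are chosen here for illustration.

```python
import re


def trigrams(text):
    """Approximate the trigram set pg_trgm would extract from text."""
    result = set()
    # Words are runs of alphanumerics; everything else acts as a separator.
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = "  " + word + " "  # two spaces in front, one behind
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("cat")))
print(round(similarity("word", "two words"), 2))
```

Each indexed value contributes roughly one trigram per character, which goes some way to explaining the large index size.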

        And FWIW: in 9.2devel, there is work ongoing to make it possible to combine the two: regexed trigram indexes...
