Re^6: Documentation of REGEXP support in DBD::SQLite?

by erix (Prior)
on Nov 16, 2024 at 03:44 UTC


in reply to Re^5: Documentation of REGEXP support in DBD::SQLite?
in thread Documentation of REGEXP support in DBD::SQLite?

Question: how are substrings of size 1 or 2 handled? Are they just ignored?
explain analyze select * from azjunk7n where txt ~ 'ba'; -- '~' means: consider regex index
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on azjunk7n  (cost=0.00..267879.16 rows=707066 width=85) (actual time=5.413..9163.252 rows=897633 loops=1)
   Filter: (txt ~ 'ba'::text)
   Rows Removed by Filter: 9102367
 Planning Time: 0.360 ms
 Execution Time: 9189.173 ms
(5 rows)

Time: 9190.029 ms (00:09.190)

Nine seconds. Because, of course, if there are too many hits (here: 897633), the system switches to SeqScan; after all, a sequential scan is the fastest way to access many rows. Faster would've been: where position('ba' in txt) > 0, which would SeqScan in 3 seconds, but position() doesn't allow regexes.
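
The numbers in the plan above make the point concrete. A minimal sketch of the arithmetic, using only figures copied from the EXPLAIN ANALYZE output (the variable names are illustrative):

```python
# Figures copied from the EXPLAIN ANALYZE output for: txt ~ 'ba'
rows_matched = 897_633      # actual rows returned by the Seq Scan
rows_removed = 9_102_367    # "Rows Removed by Filter"
rows_estimated = 707_066    # planner's row estimate for the same scan

# Every row is either returned or filtered out, so the sum is the table size.
total_rows = rows_matched + rows_removed

# Fraction of the table that matches the pattern.
selectivity = rows_matched / total_rows

# How close the planner's estimate was to the actual count.
estimate_ratio = rows_estimated / rows_matched

print(f"total rows:        {total_rows}")
print(f"selectivity:       {selectivity:.1%}")
print(f"estimate / actual: {estimate_ratio:.2f}")
```

The table holds 10,000,000 rows and roughly 9% of them match, so any index-based plan would still have to fetch nearly 900,000 rows.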

Re^7: Documentation of REGEXP support in DBD::SQLite?
by LanX (Saint) on Nov 16, 2024 at 07:52 UTC
    Thanks :)

    > Because, of course, if there are too many hits (here: 897633), the system switches to SeqScan

    This doesn't decisively answer the question of whether bigrams are covered by the trigram index... 🤔

    What if you search for a bigram which is, by design, only in a few rows?

    Or an AND combination of multiple bigrams?

    And to be sure, please use the LIKE '%ab%' form again.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      (For decisive answers, read the postgres code ;))

      > And to be sure, please use like '%ab%' again.

      I can't get bigrams to respond quickly, not even when there is only one matching value; with this data it will always Seq Scan (sometimes with parallel workers: just under 1 second).

      In postgres, 'LIKE' doesn't allow regex (although its simple pattern search can sometimes use the trigram or btree index). Postgres uses the tilde for regex search (~ case sensitive, ~* case insensitive).

        > I can't get bigrams to respond quickly

        Hence they are not indexed.

        > although its simple pattern search can sometimes use the trigram or btree index

        That's why I wanted LIKE, to be sure the trigram optimization can be chosen

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        see Wikisyntax for the Monastery
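
The bigram question in this thread can be modelled with a minimal sketch. It assumes that a trigram index only stores the 3-character windows of each padded word (pg_trgm documents the padding as two leading spaces and one trailing space), and that a '%ab%'-style fragment gets no padding because it may match in the middle of a word. The function names are hypothetical and this is not pg_trgm's actual extraction code.

```python
def word_trigrams(word: str) -> set:
    """Trigrams stored for a whole word in this simplified model.

    The word is lowercased and padded with two leading spaces and one
    trailing space before the 3-character windows are taken.
    """
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def fragment_trigrams(fragment: str) -> set:
    """Trigrams usable as index keys for an unanchored substring search.

    Assumption: no padding is added, since the fragment can start and
    end anywhere inside a word.
    """
    return {fragment[i:i + 3] for i in range(len(fragment) - 2)}


for fragment in ("b", "ba", "bar", "barn"):
    keys = fragment_trigrams(fragment)
    print(repr(fragment), "->", sorted(keys) if keys else "no trigrams")
```

Under this model a fragment of one or two characters yields no trigrams at all, so there is nothing to look up in the index. That is consistent with the Seq Scans reported above, though only the postgres source can settle it decisively.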
