http://www.perlmonks.org?node_id=368366


in reply to Re: Backslash and Underscore problem with DBI and PostgreSQL.
in thread Backslash and Underscore problem with DBI and PostgreSQL.

The code I show above is from a couple of years ago, when I first began using SQL. Everything I found in the PostgreSQL mailing lists and docs indicated that a case-insensitive search needed one of the pattern-matching operators, ~* (regex) or ~~* (ILIKE), and this is why you see the instructions above.

A plain '=' isn't case-insensitive, but I realize that I could, and probably should, be using lower(), as in SELECT uid FROM accounts WHERE lower(username) = lower(?).
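A minimal sketch of the lower() comparison follows. It uses Python's built-in sqlite3 module as a stand-in for DBI and PostgreSQL (an assumption, so that it runs without a server). The accounts table contents and the find_uid helper are hypothetical; the SQL is the statement quoted above.

```python
import sqlite3

def find_uid(conn, username):
    """Return the uid whose username matches case-insensitively, or None."""
    row = conn.execute(
        # The placeholder keeps backslashes and underscores literal:
        # no pattern matching is involved, only equality.
        "SELECT uid FROM accounts WHERE lower(username) = lower(?)",
        (username,),
    ).fetchone()
    return row[0] if row else None

# Hypothetical in-memory table standing in for the real accounts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (uid INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO accounts (uid, username) VALUES (?, ?)",
    [(1, "Alice"), (2, "back\\slash_user")],
)
```

Because the value is compared with '=' rather than a pattern operator, an underscore in the search term matches only an underscore.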

Is this the correct and fastest way to handle this? I realize my questions in this thread are geared more toward PostgreSQL than Perl, but they do involve the Perl DBI to some degree. I also know my question is a bit rudimentary to DBAs, but I'm not one, and your responses are always greatly appreciated. :)

Re^3: Backslash and Underscore problem with DBI and PostgreSQL.
by Zaxo (Archbishop) on Jun 21, 2004 at 04:50 UTC

    Yes, forcing to lowercase and testing equality is much better than pattern matching.

    A username is probably intended to be a unique key, so if you want case insensitive matching, you should take steps to make sure that uniqueness is enforced in a case insensitive way. A rule on insert and update should be a nice way to do that in pg.

    The trouble with pattern matching is that, as you found, it breaks uniqueness of a key.
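One way to enforce case-insensitive uniqueness is a unique index on lower(username). This is a swapped-in technique, not the rule on insert and update suggested above. The sketch uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption); PostgreSQL accepts a similar CREATE UNIQUE INDEX statement on an expression, but check its docs for your version. The index name and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (uid INTEGER PRIMARY KEY, username TEXT NOT NULL)"
)
# Unique index on the lowercased name: 'Zaxo' and 'ZAXO' now collide.
conn.execute(
    "CREATE UNIQUE INDEX accounts_username_lower ON accounts (lower(username))"
)

def try_insert(conn, username):
    """Insert a username; return False if a case-insensitive duplicate exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO accounts (username) VALUES (?)", (username,)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place the database itself rejects a second account that differs only in case, regardless of how the SELECT is written.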

    After Compline,
    Zaxo

      Yeah, uniqueness constraints are proper going in, but somehow this has escaped me in my SELECT statement all this time that I've been using the code. I've read the DBI book by O'Reilly a number of times as well as several PostgreSQL books - but it had never occurred to me that there was an alternative to using ~* and ~~* to achieve what I was doing.

      Thank you (Zaxo and blokhead) for taking the time to answer my relatively idiotic question. :)
Re^3: Backslash and Underscore problem with DBI and PostgreSQL.
by mpeppler (Vicar) on Jun 21, 2004 at 07:15 UTC
    Is this fast:
    select ... from .. where lower(username) = lower(?)
    This may or may not be fast - it depends on how the query engine runs, and whether the optimizer can use an index when you apply a function on a column. It may work fine with Postgres, but I know that Sybase will not generate a good query plan with such a query.

    BTW - If you use "LIKE ..." you can normally escape any potential wildcard characters yourself. Sybase uses square brackets to do the escaping (and can use alternate escape characters as well), so that you could write:

    SELECT ... FROM ... WHERE foo LIKE ? ESCAPE '\'
    and then pass "foo\_" as the search parameter and not get wildcard expansion on the underscore. Check the Postgres docs for similar functionality.
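The escaping described above can be sketched as a small helper that prefixes the LIKE wildcards with the escape character. Python's built-in sqlite3 module stands in for DBI and the real server here (an assumption); escape_like, find_exact, and the sample rows are hypothetical. How the backslash in the ESCAPE clause must be written in a SQL string literal can depend on the server and its settings, so check the PostgreSQL docs before copying the statement.

```python
import sqlite3

def escape_like(term, escape="\\"):
    """Escape the escape character itself, then the LIKE wildcards % and _."""
    term = term.replace(escape, escape + escape)
    return term.replace("%", escape + "%").replace("_", escape + "_")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (username TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?)", [("foo_",), ("foox",)])

def find_exact(conn, name):
    """LIKE search in which wildcards inside `name` are taken literally."""
    rows = conn.execute(
        # The SQL text sent to the engine ends with: ESCAPE '\'
        "SELECT username FROM accounts WHERE username LIKE ? ESCAPE '\\'",
        (escape_like(name),),
    ).fetchall()
    return [r[0] for r in rows]
```

Without the escaping step, the pattern foo_ would also match foox, because an unescaped underscore matches any single character.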

    Michael

      This may or may not be fast - it depends on how the query engine runs, and whether the optimizer can use an index when you apply a function on a column.

      Probably premature optimization unless there are a lot of users, but PostgreSQL supports indexes on expressions:
      http://www.postgresql.org/docs/7.4/interactive/indexes-expressional.html.

      As always, "explain" can be used to examine the optimizer's query plan.
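A rough sketch of checking whether an expression index is considered for the lower() query. It uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as stand-ins for PostgreSQL and its EXPLAIN (an assumption; the output format and the planner's choices differ between the two). The index name and the query_plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (uid INTEGER PRIMARY KEY, username TEXT)")
# Index on the expression used in the WHERE clause, not on the bare column.
conn.execute("CREATE INDEX accounts_lower_idx ON accounts (lower(username))")

def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail text.
    return [row[-1] for row in rows]

plan = query_plan(
    conn,
    "SELECT uid FROM accounts WHERE lower(username) = lower(?)",
    ("Zaxo",),
)
for step in plan:
    print(step)
```

If the printed plan mentions the index name, the planner intends to use the index; a full table scan suggests the indexed expression and the query expression do not match.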

      I didn't expect lower() to have much of a speed impact, and the query plan shows that the difference between doing a straight '=' or '~~*' and using lower() is only 44 ms versus 54 ms.

      My indexes are pretty tight and it's a well (and often) maintained database. With millions of records, it has to be. :)

      The reason I didn't use LIKE is that I'm trying to allow users to create usernames that contain any characters they want without having to manually escape dozens of special characters myself.