Re: SQLite: INSERT into a unique column and retrieve rowid

by ibm1620 (Hermit)
on May 04, 2024 at 16:14 UTC


in reply to SQLite: INSERT into a unique column and retrieve rowid

Using the above suggestions (thank you!), I rewrote the central subroutine as follows:
sub insert_unique( $table, $column, $value ) {
    state %sths;
    if ( !exists $sths{$table} ) {
        $sths{$table}{SELECT} = $dbh->prepare("SELECT id FROM $table WHERE $column = (?)");
        $sths{$table}{INSERT} = $dbh->prepare("INSERT INTO $table ($column) VALUES (?) RETURNING id");
    }
    my $rowid;
    my $sth = $sths{$table}{SELECT};
    $sth->execute($value);
    my $rows = $sth->fetchall_arrayref( {} );
    if ( $rows->@* ) {
        $rowid = $rows->[0]->{id};
    }
    else {
        $sth = $sths{$table}{INSERT};
        $sth->execute($value);
        $rows = $sth->fetchall_arrayref( {} );
        $rowid = $rows->[0]->{id};
    }
    return $rowid;
}
Which works.

It does seem to me a lot of code to accomplish a seemingly basic requirement. I'm implementing UNIQUEness, for one thing. But it works, and it's pretty fast adding 60,000 songs. (I didn't notice much difference whether or not the column was indexed.)
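For what it's worth, on a recent SQLite (3.24+ for upsert, 3.35+ for RETURNING) the SELECT-then-INSERT dance can collapse into a single statement, assuming the column carries a UNIQUE constraint. This is only a sketch, not a drop-in replacement, and the same identifier-interpolation caveats for $table and $column apply:

```perl
# Sketch, assuming SQLite >= 3.35 and a UNIQUE constraint on $column.
# The no-op DO UPDATE makes RETURNING yield the existing row's id on a
# conflict; with DO NOTHING, no row would come back for duplicates.
my $sth = $dbh->prepare(
    "INSERT INTO $table ($column) VALUES (?)
     ON CONFLICT($column) DO UPDATE SET $column = excluded.$column
     RETURNING id"
);
$sth->execute($value);
my ($rowid) = $sth->fetchrow_array;
```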

Re^2: SQLite: INSERT into a unique column and retrieve rowid
by hippo (Archbishop) on May 04, 2024 at 22:46 UTC
    It does seem to me a lot of code

    It could be streamlined for sure, particularly if you make the database enforce the uniqueness constraint (as in the OP) and just trap the exceptions. But really the worrying part is that $table and $column are SQL injections just waiting to happen.
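    To make the risk concrete, here is a hypothetical illustration (the table value is invented): once an identifier is interpolated into the SQL text, the placeholder machinery never sees it.

```perl
use strict;
use warnings;

# Hypothetical attacker-controlled identifier -- a bound placeholder cannot
# protect it, because it is interpolated into the SQL string before
# prepare() ever runs.
my $table  = 'songs WHERE 1=1; DROP TABLE songs; --';
my $column = 'title';
my $sql    = "SELECT id FROM $table WHERE $column = (?)";
print "$sql\n";    # the string now contains a second, destructive statement
```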

    Also, the fact that you've used a state hash to hold the prepared statement handles means that these variables only have relevance on the first invocation, which is an unusual situation. E.g. call it once on table foo and then again on table bar, and the second call will still use table foo. That's confusing at best. (edit: ignore this part)


    🦛

      Regarding SQL injection -- had to look it up. Is my use of variables $table and $column in those prepared SELECTs necessarily dangerous? I can see how they could be if an enduser were prompted for them, but that's not how I intend to populate them. It's for a specific application with specific table and column names that are hardcoded into the program. As I say, I'm very much a novice at SQL apps. Maybe there's more to this than I'm aware of ... ? Thanks.
        I can see how they could be if an enduser were prompted for them, but that's not how I intend to populate them.

        Proverbially, the road to Hell is paved with good intentions. The trouble is that as your code grows it becomes harder and harder to keep track of all the possible routes through it, and the chance, however slight, of external input reaching these variables never quite goes away.

        At the very least you should sanitise these variables as close to their point of use as possible, ie: within the subroutine. So, something like this:

        sub insert_unique( $table, $column, $value ) {
            state %sths;
            if ( !exists $sths{$table} ) {
                die "Bad table '$table'"   unless $table  =~ /^[a-z0-9_]+$/;
                die "Bad column '$column'" unless $column =~ /^[a-z0-9_]+$/;

        Better, use Carp::confess instead of die. Better still, use a hash of acceptable tables to further nail things down:

        sub insert_unique( $table, $column, $value ) {
            use Carp;
            state %sths;
            state %good_table = map { $_ => 1 } qw/foo bar baz/;
            if ( !exists $sths{$table} ) {
                confess "Bad table '$table'" unless $good_table{$table};

        By doing this, and the same for $column, you are limiting the ability of an external agent (or your own fat fingers) to operate on parts of the database to which there should not be access.

        Oblig xkcd


        🦛

      I don't understand your comment about the state hash. I copied this subroutine into my actual program, which calls insert_unique() repeatedly for six different tables, and it worked correctly.

        You are quite right. I had somehow missed that you were keying the hash on the table and testing for this one level down. Mea culpa.


        🦛

Re^2: SQLite: INSERT into a unique column and retrieve rowid
by Marshall (Canon) on May 06, 2024 at 14:08 UTC
    I don't know what "it's pretty fast adding 60,000 songs" means. I would add the line $dbh->begin_work; before the loop that puts the 60,000 rows into the DB, and the line $dbh->commit; after the loop is over. This will run all 60,000 inserts as a single transaction. There should be a very noticeable decrease in execution time.
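    A minimal sketch of that suggestion (the loop variable, list, and table/column names here are invented for illustration):

```perl
# Sketch: wrap the whole load in one transaction. By default DBI runs
# with AutoCommit on, so each INSERT is its own transaction and its own
# journal flush; begin_work suspends that until commit.
$dbh->begin_work;
for my $title (@titles) {
    insert_unique( 'songs', 'title', $title );
}
$dbh->commit;    # one flush for the whole batch instead of one per INSERT
```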
      I thought you were saying that the addition of an index to the table would dramatically increase the time it took to load the table. Without begin_work / commit, it took 23 seconds to load the table with an index, and 20 seconds without one. However, running it as a single transaction as you suggest, those numbers drop to 0.77 and 0.72.

      I haven't run any comparisons to see how much the index benefits retrieval, however...

        I meant that indexing a column(s) will cause inserts into the table to take longer because the index structure needs to be updated as extra work. The goal would be to decrease retrieval times. I don't think that indexing will matter much in your case, but some performance tests would tell the story. Your performance numbers sound about right to me. Your code was running one transaction per insert. Making one transaction for 60,000 inserts saves a lot of time!!
