http://www.perlmonks.org?node_id=135057

deprecated has asked for the wisdom of the Perl Monks concerning the following question:

<!- i use postgres. you should too. --> hi, monks.

i've been using dbi a whole lot lately. im writing usually upwards of 700 lines of code per week. anyone who's coded with dbi knows that its kind of hard to write code that isnt fraught with unmaintainable peril.

i guess the problems come in because of all the references and abstractions used.

here are a few examples of stuff I find difficult to code in dbi in a manner that is clearly readable by novice programmers (many of the people I work with) and also not so cluttered that intermediate programmers can read it without too much pain.

inserts with $sth

my $sth = $dbh -> prepare("insert into table1 (col1, col2, col2) value +s (?, ?, ?)");
my beef here is that the line winds up being so long. risacher and i have discussed style and spacing. i used to think that horizontal space wasnt such a premieum, but vertical space was. this was when i was using a 21" monitor. i'm now using a 15" laptop screen. the above is a very simple query, and still comes out at 84 characters. you can see how cramped lines this long can be in this screen sh ot.

This problem is even further complicated by using descriptive variable names. Let's look a little further at this:

# do we need an insert or an update? my $row_checker = $dbh -> prepare("select count(col1) from table1 wher +e col1 = ? and col2 = ? and col3 = ?"); my $inserter = $dbh -> prepare("insert into table1 (col1, col2, col2) +values (?, ?, ?)");my $upda ter = $dbh -> parepare("update table1 set col1 = ?, col2 = ?, col3 = ? + where col1 = ? and col2 = ? and col3 = ?");
as you can see, this gets ugly in a hurry. so i've been doing two things when this starts to bother me and set off my ugly-sensors.
my $row_checker = "select count(col1) from table1 where col1 = ? and c +ol2 = ? and col3 = ?"; $row_checker = $dbh -> prepare($row_checker); # or ... my $updater = $dbh -> prepare(qq{ update table1 set col1 = ?, col2 = ?, col3 = ? where col1 = ? and col2 = ? and col3 = ? });
i still don't really like either of these. the first one is more compact for smaller queries, but still doesnt help for larger queries. the second one is okay, but i dislike having the }); sitting all by itself on another line. but it is much better than the original, imho.

pulling out data from the database

so lets say you want to pull a single tuple out of the database. this is something most of us do quite frequently. even on a huge database like the one at work (terabyte scale), i still do want one tuple pretty frequently. here's what the dbi pod would have you do:

my $rowRef = $dbh -> selectrow_arrayref("select col1 from table1 where + col1 = col2 and col2 != '$ someval'"); my @row = @{ $rowRef }; #or ... my $tuple = shift @{ $rowRef }; #or ... my $tuple = @{ $rowRef }[0];
there are some problems with this for people who use warnings and/or strict. if your reference comes back empty (because your query failed), your whole script dies with the ever popular "cant use an undefined value as an arrayref at oops.pl line 100", which sucks. i dont want my program to crash because my query failed, i'd rather trap the error and report it to the user (and/or the developer)! instead, i've been using this:
my ($tuple) = map { @{ $_ } } $dbh -> selectall_arrayref("select col1 +from table1 where col1 = co l2 and col2 != '$someval'");
of course you can use trickery like $query or selectall with a multiline qq{}, but i like the above method because then I can do:
if ($tuple) { # succeeded...
and i dont have to play around with if ref $tuple ... which I find rather irritating .

now, that having been said, i generally prefer to not use $dbh when I can use a statement handle instead.

my $next_col1 = "select distinct(col1) from table1"; $next_col1 = $dbh -> prepare($next_col1); $next_col1 -> execute(); my @col1s = map { @{ $_ } } @{ $next_col1 -> fetchall_arrayref() }; foreach my $col1 (@col1s) { ...
in general, if youre going to be executing a query a lot of times, a $sth is a better bet than just using $dbh. its faster, and you let the DBD module take care of quoting and whatnot. it also gives you more meaningful errors if you call it with too many bind values or not enough.

i'm sure I have more syntactic tricks up my sleeve that i'm not remembering right now. the purpose of this post, however, was to see if anyone was doing anything with dbi's refs of refs of refs of (gack!)... and making it easier to read and/or write.

happy new year and stuff.
brother dep.

--
Laziness, Impatience, Hubris, and Generosity.

Replies are listed 'Best First'.
Re: dbi style questions (code, discussion)
by talexb (Chancellor) on Dec 29, 2001 at 08:02 UTC
    All I can suggest is comment, comment, comment.

    But let's look at some code. Your first snippet is

    my $sth = $dbh -> prepare("insert into table1 (col1, col2, col2) value +s (?, ?, ?)");
    This wraps a little ugly; I'd change it to
    my $sth = $dbh->prepare( "insert into table1 (col1, col2, col2) values (?, ?, ?)");
    That way, the entire SQL statement is on a line by itself. A little lower you have
    my $updater = $dbh -> prepare("update table1 set col1 = ?, col2 = ?, c +ol3 = ? where col1 = ? and col2 = ? and col3 = ?");
    I'd change that to
    my $updater = $dbh -> prepare( "update table1 set col1 = ?, col2 = ?, col3 = ? \ where col1 = ? and col2 = ? and col3 = ?");
    That makes the variables all line up (I know, it's not to everyone's taste.) Although I ponder at the logic of that statement .. if the row values are already set to those values, what good is setting them to those same values? Usually when I do an update, I get an id value for the row that I'm going to change, and do something like
    my $updater = $dbh->prepare( "update table1 set col1 = ?, col2 = ?, col3 = ? where id = $ID");
    Generally I believe you want to use placeholders like ? when you are going to be inserting or updating many rows. For one-offs it isn't really necessary.

    In general style terms, I try to stick with an 80 character limit on line lengths. I may have a big monitor now, but maybe later I'll have to look at my code on my client's 14" screen in character mode. Then I be lookin' stoopid.

    Make your code as readable as possible -- if you can't read your own code in three months time, what chance does anyone else have?

    --t. alex

    "Excellent. Release the hounds." -- Monty Burns.

(dkubb) Re: (1) dbi style questions (code, discussion)
by dkubb (Deacon) on Dec 29, 2001 at 11:34 UTC

    I find the messy-ness that you speak of to be a common occurence when other languages are intermixed inside perl. Compared to normal perl, SQL or even HTML breaks up the flow and can cause confusion with novice and experienced programmers alike.

    However, I do think it's possible to write clean DBI code, and simplify on some of the methods you use to fetch data from the database handle.

    I agree with you splitting up the SQL query onto multiple lines is a better idea. I also capitalize the SQL keywords/commands, and lower case the columns and table names:

    my $sth = $dbh->prepare(q{ INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?) });

    Style is very subjective, I'd say work what's best for you, and most importantly, be consistent. The novice programmers you work with will learn any style you choose, as long as you/they aren't changing your mind every few weeks =)

    Along with SQL formatting conventions, I follow a few simple naming conventions when working with DBI:

    • All statement handles are prefixed with $sth_, as in $sth_get_customers if I am using more than one, otherwise I just use $sth.
    • If I am using more than one database handle, then I prefix all of them with $dbh_, as in $dbh_accounting. Otherwise I just use $dbh.

    All of the above conventions won't really do much good if you have to jump through hoops to get the data out of the "fetch" commands. =) Luckily, there are simpler ways to pull data from the database.

    IMHO, the best way to pull a tuple from a data source is with DBI::selectrow_array:

    my $tuple = $dbh->selectrow_array(q{ SELECT col1 FROM table1 WHERE col1 = col2 AND col2 = ? LIMIT 1 }, {}, $someval, );

    I agree with your position that using statement handles is faster (especialy with some databases that can prepare a query plan, like Oracle) when you are querying over and over, when compared to using DBI's utility methods. But, there is one exception that alot of people may not know: you can prepare a statement and pass that off to a utility method, not losing any speed, and cutting a few lines of code out. Consider the following:

    my $sth_get_id = $dbh->prepare(q{ SELECT id FROM contact WHERE email = ? LIMIT 1 }); foreach my $email (@emails) { my $id = $dbh->selectrow_array($sth_get_id, {}, $email); print "$email -> $id\n"; }

    All of DBI's utility methods can accept statement handle rather than a straight SQL query as the first parameter.

    Also, if you want to fetch multiple rows, but want only the value from the first column of each row, DBI has a built in method called selectcol_arrayref that will do exactly that:

    my $emails = $dbh->selectcol_arrayref(q{ SELECT email FROM contact WHERE id BETWEEN ? AND ? }, {}, 1, 100, ); foreach my $email (@$emails) { ...
Re: dbi style questions (code, discussion)
by markjugg (Curate) on Dec 29, 2001 at 21:23 UTC
    A couple tips:

    First: Regarding ugly insert and update statements, I solve this problem by using DBIx::Abstract with it's insert() and update() functions:

    use DBIx::Abstract; # get started using existing $DBH handle my $db = DBIx::Abstract->connect($DBH); # insert from a hash, auto-quoting along the way $db->insert('table_name',\%hash_ref); # update from a hash, auto-quoting for you $db->update'table_name',\%hash_ref,'item_id = 3');

    Very handy. I don't really use the rest of the DBIx::Abstract interface. I prefer straight DBI most of the time. One plus to this system: DBIx::Abstract can handle passing in constants that don't get quoting, like CURRENT_DATE, for example. See it's docs for that exact syntax. One minus to this system: It doesn't use DBI's prepare_cached() method, one of my new favorite DBI tricks for an easy performance boost.

    Second: In many cases you can combine the the flexibility of $sth methods, like using prepare_cached() and placeholders, with the simplicity of the $DBH calling style, like this:

    $sth->prepare_cached("SELECT * from t where id = ?"); $DBH->selectall_arrayref($sth, {}, @bind_values);

    -mark

Re: dbi style questions (code, discussion)
by BazB (Priest) on Dec 29, 2001 at 15:13 UTC

    At this point, I've only used a combination of Tcl and SQL to query a database.
    IMHO, I think it's irrelevant which languages you're using, things are really going to get confusing if you've got a mix of scripting, markup and SQL unless you format everything to the point of excess.

    Have a look at Philip Greenspun's SQL for Web Nerds, particularly the style section.
    It's all Oracle8i stuff, but it should apply to anything with SQL.

    When I've coded scripts with large SQL queries, I often put the query into a string:

    my $query = "SELECT post, post_time, user_id, post_id FROM my_table WHERE post_time > sysdate + 1 ORDER BY post_id;";
    Then query the database with the command required (can't do it in Perl yet :-) with $query in that command instead of a lump of SQL and scripting/Perl mashed together.

    Even if you don't like my idea of having the query seperate, or it's impossible, then just get used formatting the queries so that they stand out from the Perl and are readable as SQL.

    Hope that helps,

    BazB.

    Update: Seems like a slight more Perl-y way of doing this might be to use HERE documents, as described in this node by steves.

    Further update: fixed the link to SQL for Web Nerds

Re: dbi style questions (code, discussion)
by edebill (Scribe) on Dec 29, 2001 at 22:39 UTC

    You're right that it gets ugly fast. Like you, I've been doing a lot of DBI work, and I've found that the bigger the SQL statements, the uglier it gets. Here's what I came up with

    Starting with the "everything in a line" technique, but showing more columns/values

     my $bsql = "insert into messages(messageid, quoteid, subject, body, viewed, toid, fromid, timestamp_c, timestamp_m) values($next_messageid, $quoteid, $q_subject, $q_body, 0, $toid, $fromid, CURRENT TIMESTAMP, CURRENT TIMESTAMP)";

    How long does it take to make sure all your columns names and values line up? What happens the next time you need to add a column? I get lost pretty quick looking at that style.

    Now, let's try it in a vertical layout:

    my $sql = "insert into messages ( messageid, quoteid, toid, fromid, subject, body, viewed, timestamp_c, timestamp_m ) values ( $next_messageid, $quoteid, $toid, $fromid, $q_subject, $q_body, 0, CURRENT TIMESTAMP, CURRENT TIMESTAMP)";

    OK, so you have a little better hope of adding/removing columns without breakage. And people are generally better at counting lines than counting words (for figuring out exactly which column a value goes with). C-k C-k C-y in emacs (or the vi equivalent) can move columns,values around very efficiently. I definitely like this one better (even though it grows off screen pretty quick).

    Now, for a third way. How about something like:

    my $msgdata = { messageid => $next_messageid, quoteid => $quoteid, toid => $toid, fromid => $fromid, subject => $q_subject, body => $q_body, viewed => 0, timestamp_c => "CURRENT TIMESTAMP", timestamp_m => "CURRENT TIMESTAMP" }; my $sql = "insert into msg ("; my $values = ") values ("; my $started = 0; foreach my $datum (keys(%$msgdata)) { if (length($msgdata->{$datum}) > 0) { if ( $started ){ $sql .= ",\n $datum"; $values .= ",\n $msgdata->{$datum}"; } else { $started = 1; $sql .= " $datum"; $values .= " $msgdata->{$datum}"; } } } $sql .= $values;

    Now, before you reach for that barf bag, take another look. This third way does some things the first 2 don't.

    • It checks to make sure there's something there. If one of the values is undefined, it will still generate valid SQL.
    • You always know exactly which column a given value goes with.
    • In fact, everything after creating the hash can be put into a nice little sub so you don't have to look at the foreach.
    It takes some getting used to, but I find the third option a lot more maintainable.

    Also notice I'm always creating a scalar to hold the SQL, never putting it right into the prepare(). That's to make it easier to print(). I find that I have to do that so often I might as well just plan on it, no matter which style I'm making the SQL with.

    SQL is an ugly language (why does update have nice name=value syntax, while insert doesn't?) and it just gets uglier as you add code to $dbh->quote() all your values and wrap things in the eval{} blocks that make your code robust enough to handle a failed transaction. Automatically generating the SQL at least cuts out some typo problems.

      Nice idea. I've some simple improvements to make it more human friendly.
      # some sample data. my %msgdata = ( messageid => $next_messageid, quoteid => $quoteid, body => $q_body, viewed => 0, timestamp_m => "CURRENT TIMESTAMP" ); # you can do a sort here if you care about the order # of the keys. my @keys = keys %msgdata; # uncomment this if you want to exclude all pairs # that have no value. #@keys = map {defined($msgdata{$_})? $_ : ()} @keys; # get all my values in the same order as my keys. my @values = @msgdata{@keys}; # replace any undefined values with "NULL". # Comment this out if you're excluding these above. # Note: if you're using $dbh to quote your values # then use $dbh->quote($_) for the true option @values = map {defined($_)? $_ : "NULL"} @values; # my sql statement. my $sql = "INSERT INTO msg ( " . join (",\n", @keys) . ") " . "VALUES " . join (",\n", @values); # or with DBI: (but not so good for printing) # you don't need to use the $dbh->quote above if # you use this. my $sql2 = $dbh->do( "INSERT INTO msg ( " . join (",\n", @keys) . ") " . "VALUES " . substr(("?,\n" x @values), 0, -2), undef, @values);
      I like DBI. But a friend who doesn't like his SQL to mess up his code puts all of the queries into a module and then calls them functionally. eg:
      insert($table, @keys, @values); # or possibly insert($table, $data_ref); # and definately $data_ref = select($table, $conditions_ref, $desired);
      That works as well of course.

      Jacinta

        # my sql statement. my $sql = "INSERT INTO msg ( " . join (",\n", @keys) . ") " . "VALUES (" . join (",\n", @values) . ")";

        I like it. Using join() this way simplifies the SQL generation a lot.

        Unfortunately, there's no real help for manually $dbh->quote()ing values that need it - we use DB2 at work, and quoting a number gives a SQL error (grrrrrrrr). That's my #1 complaint about the thing.

        We'll probably move to something between what your friend does, and the inline code. Generate the actual SQL in a module, but maintain control of it's execution inline - otherwise you're borked with transactions, and I hate passing $dbh's around all over the place.

Re: dbi style questions (code, discussion)
by guha (Priest) on Dec 29, 2001 at 15:39 UTC
    I mostly use something like

    my $sqlstmt = <<SQL_A; select count(col1) from table1 where col1 = ? and col2 = ? and col3 = ? SQL_A
    If the SQL statement gets really big I would consider wrapping it into a sub, aka

    my $sqlstmt = def_sql_a; sub def_sql_a { return <<SQL_A; select count(col1) from table1 where col1 = ? and col2 = ? and col3 = ? SQL_A }
    This is also consistent with hiding the gory details in a sub when creating "dynamic" SQL statements.
Re: dbi style questions (code, discussion)
by uwevoelker (Pilgrim) on Dec 29, 2001 at 23:30 UTC
    At first I would like to thank markjugg for the tip with DBIx::Abstract.

    I usually use the other INSERT syntax (similar to UPDATE, with SET).

    Here comes an example:
    sub db_store_event { my $id = shift || 0; my $ret = $id; if ($id > 0) { # update my $hash = db_get_event($id); my $event = impand_event(); # konvertiere Uhrzeit und Datum my @update = (); foreach my $key (keys %$hash) { if (defined $event->{$key} and ($hash->{$key} ne $event->{ +$key})) { push @update, $key.'='.$dbh->quote($event->{$key}); } } if (scalar @update > 0) { db_do("UPDATE ".$config->{event_table}." SET ".join(', ', +@update)." WHERE id=$id"); } } else { # insert my $event = impand_event(); # konvertiere Uhrzeit und Datum my @dat = (); foreach my $key (keys %$event) { push @dat, $key.'='.$dbh->quote($event->{$key}) if $event- +>{$key}; } my ($err, $sth) = db_sth("INSERT INTO ".$config->{event_table} +." SET ".join(', ', @dat)); $ret = $sth->{'mysql_insertid'} or die "no db_id"; } return $ret; }
    The routine impand_event() konverts the cgi parameter to an hashref with keys like the fields in the database and does some conversion (for date and time).
    sub impand_event { my ($datum, $zeit); my %event = (); foreach my $key ($cgi->param) { if ($key =~ m/event\.(\w+)/) { $event{$1} = $cgi->param($key); + } } $datum = $event{datum_beginn}; if ($datum) { $event{datum_beginn} = date_to_sql($datum); } $datum = $event{datum_ende}; if ($datum) { $event{datum_ende} = date_to_sql($datum); } $zeit = $event{zeit_beginn}; if ($zeit) { $event{zeit_beginn} = time_to_sql($zeit); } $zeit = $event{zeit_ende}; if ($zeit) { $event{zeit_ende} = time_to_sql($zeit); } return wantarray ? %event : \%event; }
    Sorry for the german comments, but these are real lines of code.

    uwevoelker
Re: dbi style questions (code, discussion)
by Anonymous Monk on Dec 29, 2001 at 20:37 UTC
    I do a lot of DBI programming on a daily basis and I still haven't come up with something I am 100% happy with. Most of the SQL statements I use are long multiline ones, which I like the form:
    my $sth = $dbh->prepare(q{ SELECT x, y, z FROM table1, table2 WHERE x = ?, y = ? ORDER BY z });
    I think making the SQL commands uppercase helps them stand out from the code a bit better. If it's a shorter statment I try to squeeze it on to one line. I also use prepared statements for most everything, apart from 1 shot updates, deletes etc.
    $dbh->do('delete from temp where x = ?', undef, $x);
    According to the DBI docs, using bind_columns is the way to go, it means you can write code like:
    $sth->execute; $sth->bind_column(\$res); $sth->fetch;
    $res would be undef if there was an error.

    Another alternative is:

    $sth->execute; unless (@cols = $sth->fetchrow_array) { # error }
    Hope this gives you a few ideas.
      Dammit, forgot to log on. Doh!
Re: dbi style questions (code, discussion)
by IlyaM (Parson) on Dec 30, 2001 at 01:43 UTC
    One possible approach is don't use DBI at all. Well, don't use it directly but use abstractions layers which can hide ugly SQL queries. Personally since I'm big OO fan I just want to use persistent objects and have all SQL queries hidden from me (at some extend). If you find such approach interesting check very nice overview of Perl Object-Oriented Persistence modules.

    --
    Ilya Martynov (http://martynov.org/)

Re: dbi style questions (code, discussion)
by Ryszard (Priest) on Dec 30, 2001 at 11:35 UTC

    If performance isnt an issue, one trick i use is to read the sql in from a file.

    another trick i use to line up my sql and make it look nice in the code (when i embed it) is concatenation

    my $sql .= "rest of statement"

    I also package up my dbi stuff into a couple of subbies, one for selecting, and one for insert, update, delete, and do all the appropriate error checking (ie DB alive, uname/pwd ok et al). I also pass all the params as anon hashes.

    Its not amazing code, buts its functional, and works for me.

    It also means, once i've written my dbi subbies, i dont need to go about writing dbi stuff again, just remember the parameters that must be parsed.. :-)

      I write tons of SQL each week using Perl/DBI. 99% of the SQL I put in here documents, ala:

      my $sql = <<"END_SQL"; SELECT name, address, phone_number, order_id FROM customer, customer_order, order_item WHERE blah blah blah END_SQL

      That makes it readable and it looks pretty nice when you log it, etc. Here documents: another reason Perl rocks.