Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

DBI recipes

by gmax (Abbot)
on Aug 17, 2003 at 18:47 UTC ( [id://284436]=perlmeditation: print w/replies, xml ) Need Help??

Programming with the DBI becomes a matter of habit. You may choose to code directly with the DBI rather than using on of the many wrappers available on CPAN because of efficiency concerns, or because you are dealing with legacy code, or simply because you want to have a grip at the core of things. Whichever reason for using the DBI directly, the time comes when you have to face one of the simple problems listed here. If you were looking for answers, this is the place to go. If you found the answer on your own, let's compare notes. Either way, enjoy the reading.

Table of contents

Checking for an existing table

If you want to initiate action on a table, or a group of tables from a specific database, it would be better to make sure that such table exists. Sometimes you don't need to worry, since your scripts only apply to a well designed and never-changing database. Sometimes, though, you are not so lucky, or you simply need to check if a table - perhaps a temporary one - was already created.

Standard SQL does not have a way to tell you that, even though some dialects provide a useful idiom (MySQL allows CREATE TABLE IF NOT EXISTS and DROP TABLE IF EXISTS). If your database of choice doesn't give you such facility, or you want to create something portable to more than one DBMS, this technique could be handy.

sub table_exists { my $db = shift; my $table = shift; my @tables = $db->tables('','','','TABLE'); if (@tables) { for (@tables) { next unless $_; return 1 if $_ eq $table } } else { eval { local $db->{PrintError} = 0; local $db->{RaiseError} = 1; $db->do(qq{SELECT * FROM $table WHERE 1 = 0 }); }; return 1 unless $@; } return 0; } if (table_exists( $dbh, "employees") { print "it's there!\n"; } else { print "table not found!\n"; $dbh->do( $create_query ); }

This routine uses DBI's tables method as a primary way of checking. Unfortunately, not every DBMS answers to tables in a consistent way. Therefore, I use a double standard. If tables returns a non empty array, then I test each element until I find one matching the table being searched. If the array is empty, it could be either that the database is empty, or that the DBMS fails to report. In this case, just to be on the safe side, I use the second method, trying to issue a SELECT statement with a false WHERE clause. If this method returns an error, it means that the table does not exist. Using eval and localizing the database handler's attributes ensures some degree of safety.

I can't guarantee that it will work with every database driver, but it is the best I could think of.

Inserting records

There are basically two ways of inserting records through the DBI. The easiest one is using the do method, passing the whole query, without worrying about parameters. Basically, the same query that you'd type in a native database client would also go through a do method. This is something you should do if you have to insert just one record. If you need to insert many of them within one loop, then it is strongly advisable to use prepare//execute.

Using prepare and /execute for insertion means (most of the times) using placeholders, which is the recommended method of dealing with this problem. Placeholders are efficient and they guarantee that the values are properly quoted. The only thing sort of unpleasant with them is that you can get confused about how many placeholders to enter or about the order of the values to submit. In the first case, don't worry, since the DBI will complain about a wrong number of placeholders. In the second case, it could become tricky, if you swap things like first and last name. But you don't have to rely only on your patience and accuracy. You can use some idioms to fortify your query.

Creating an insertion query

Your first priority when inserting records is creating an insertion query that has the right list of columns and a matching number of placeholders. To ensure that you are always referring to the right list of fields, it is useful to store such list into an array.

my $create_query = qq{ create table employees ( id int not null primary key, name char(20) not null, salary int ) }; $dbh->do($create_query); my @fields = (qw(id name salary)); my $fieldlist = join ", ", @fields; my $field_placeholders = join ", ", map {'?'} @fields;

In this example, there are only three columns, but the idiom would work the same way if we had 100 or more. From the @fields array it is easy to create a string of fields to be used either in an INSERT or a SELECT statement. In this case, the INSERT query is

my $insert_query = qq{ INSERT INTO employees ( $fieldlist ) VALUES ( $field_placeholders )}; my $sth= $dbh->prepare( $insert_query ); # $insert_query = "INSERT INTO employees ( id, name, salary ) # VALUES ( ?, ?, ? )";

Now we can insert some records

Inserting scalars

The easiest insertion is when using scalars as arguments for execute.

my ($id, $name, $salary) = (5, 'Sal', 4500); $sth->execute($id, $name, $salary);

It is obvious that executing a query with scalars is only useful when the scalars are updated. For example, within a loop, we may call a function that fills the values with appropriate contents, so that we can call execute with valid items.

while ( get_values($id, $name, $salary) { $sth->execute($id, $name, $salary); }

When dealing with large data sets, scalars aren't always the best choice. Data is more likely to come in more complex structures, such as lists of lists or lists of hashes.

Inserting a list of lists

Syntactically, a lol is perhaps the most natural structure to pass to an execute method. This is because execute expects a list of values. Therefore an array is the logical candidate for this function.

my @employees_lol = ( [1, 'Fred', 5000], [2, 'Joshua', 5000] ); for (@employees_lol) { $sth->execute(@$_); }

The only issue with this insertion is that we must insure that the array elements are in the same order as the fields to insert. No shortcuts available here. Either we pay attention when we encode the array, or we ensure that the items are filled in the proper order when we get the data from an external source. Using a list of lists we trade speed for accuracy. However, if we have some means of making sure that the order is correct, this is the fastest insertion method available from the DBI.

I must mention that recent versions of the DBI allow also the method execute_array with the related bind_param_array, and it is useful for mass insertion. However, unless your DBMS supports such a method directly, execute_array is actually slower than other methods.

Inserting a list of hashes

If we want to be sure that each value is going to the corresponding column in the database, we can use a data structure where the values are referred by name, rather than by position. A list of hashes looks like the best solution for this problem.

The naive approach would be to iterate the list and access the items by name.

my @employees_loh = ( {id => 3, name => 'Kim', salary => 5600}, {id => 4, name => 'Dave', salary => 6000}, ); for (@employees_loh) { $sth->execute($_->{id}, $_->{name}, $_->{salary}); }

But this style of coding, although correct, may take long for a large number of columns, and we can still make a mistake and swap positions, thus resulting in an incorrect insertion. It's better to take advantage of our column list and exploit Perl capability of creating hash slices.

for (@employees_loh) { $sth->execute(@{$_}{@fields}) }

In the above example, not only we write less, but we are also sure that the order of values comes according to the columns list, which was the one we used to create the insertion query in the first place. For large datasets with many columns, this idiom is definitely the best way to go. Read more about this method in chromatic's article about the DBI (off site).

About fetching records

Once you have successfully inserted records into a database, you'd like to get them back, possibly in a format that suits your current needs, rather than the needs you had when the data was created. Fetching data from a database is a collaborative effort. First, you instruct the database about what you want to get, and the database will eventually comply and prepare for you a dataset ready for consumption. It's important to remember that, no matter how complex your query, the ultimate result will always be a bi-dimensional array, rows and columns filled with the data you asked for.

But you don't want always bi-dimensional arrays. Your application needs may be less (or more) demanding than that. Therefore you need to transform the result through the DBI, converting the dataset into the data structure more suitable for your current task. Even though you can't force the DBMS to change the only structure it can deal with, it will assist you in your translation.

The following sections examine a few of the idioms offered by the DBI when fetching records. Unlike the DBI manual, where you get help on specific instructions, the matter in this post is presented by task. Hopefully, you can find the recipe tailored for your application.

Creating an array from a column

Let's assume you need to get all the names from the employee table, and store them into an array. Depending on how your query is written, you have several choices.

Before examining the idiomatic solutions, let's have a look at the brute force approach:

my $query1 = qq{select name from employees}; my $sth= $dbh->prepare($query1); $sth->execute; my @names =(); while (my ($name) = $sth->fetchrow_array) { push @names, $name; }

This code serves the purpose and it gets the job done. However, there are better ways.

Using map and selectall_arrayref

An easier way is to fetch the entire dataset as an array ref (actually, it's a reference to an array of references) and then filter the wanted column with map.

my $query2 = qq{select id, name, salary from employees}; my @names = map {$_->[1]} @{$dbh->selectall_arrayref($query2)}; __END__ @names = (qw(Fred Joshua Kim Dave Sal));

Notice that we could have just selected "name", like we did in the previous example. The reason for the extra columns is to demonstrate how to use this method in cases where you have limited control over the columns returned by the DBMS. For example, the output for the DESCRIBE command in Oracle or MySQL will return multiple columns, and you may want to isolate only one of them.

Coming back to the example at hand, the result from selectall_arrayref is a reference to a bi-dimensional array. Using the @{} construct it is turned into an array, and map will only filter the second element from each row. As a result, we get a flat array containing only the names.

Using selectcol_arrayref

The DBI has also a specialized function to accomplish this task. selectcol_arrayref returns a reference to an mono-dimensional array. If we have a query that should return only one row, or having the wanted row in first position, then this method returns what we want. The only thing we have to do is de-referencing the array.

# # either # my $names_ref = $dbh->selectcol_arrayref($query1); # # or # my @names = @{$dbh->selectcol_arrayref($query1)}; __END__ $names_ref = ['Fred', 'Joshua', 'Kim', 'Dave', 'Sal']; @names = ('Fred', 'Joshua', 'Kim', 'Dave', 'Sal');

using selectall_arrayref and an array slice

The last one was easy. But we can make our life more complicated, if we want, using yet another approach.

my $sth = $dbh->prepare($query2); $sth->execute(); my @names = map {$_->[1]} @{$sth->fetchall_arrayref([0,-2])}; __END__ @names = (qw(Fred Joshua Kim Dave Sal));

The cryptic looking code in this example is just for educational purposes. I wouldn't do it in a real application, but again it could be useful to know that it is possible. fetchall_arrayref accepts an optional parameter, a "slice," telling the DBI which columns we want and in which format. If we use an array slice, as in this example, its indexes mark the columns to fetch. [0, -2] means take the first and second-to-last column. Therefore, our dataset will have two columns. Then, map will filter the second column, which is the name. Of course, in a three-column dataset, [0, -2] is the same as [0, 1], or [-3, -2]. Just to show what you can do.

Getting a list of lists

No sweat here. If we want a list of lists, that's what a RDBMS is always happy to provide you and the DBI is glad to help.

There are two methods to get a lol from the DBI. Either you use the prepare /execute / fetchall sequence, or you use selectall_arrayref, which is a handy shortcut. The following snippets are equivalent.

# # either # my $employees_lol = $dbh->selectall_arrayref($query2); # # or # my $sth = $dbh->prepare($query2); $sth->execute(); my $employees_lol = $sth->fetchall_arrayref(); __END__ $employees_lol = [ [ '1', 'Fred', '5000' ], [ '2', 'Joshua', '5000' ], [ '3', 'Kim', '5600' ], [ '4', 'Dave', '6000' ], [ '5', 'Sal', '4500' ] ];

Getting a list of hashes

A list of hashes is a common need. For example, if you are using HTML::Template, you should find this method useful. You can either build the loh manually, fetching a hash and pushing it to an array, but the DBI can do it for you quite efficiently.

my $sth = $dbh->prepare($query2); $sth->execute(); my $employees_loh = $sth->fetchall_arrayref({}); __END__ $employees_loh = [ { 'salary' => '5000', 'id' => '1', 'name' => 'Fred' }, { 'salary' => '5000', 'id' => '2', 'name' => 'Joshua' }, { 'salary' => '5600', 'id' => '3', 'name' => 'Kim' }, { 'salary' => '6000', 'id' => '4', 'name' => 'Dave' }, { 'salary' => '4500', 'id' => '5', 'name' => 'Sal' } ];

Notice the empty hashref {} given as argument to fetchall_arrayref. That's an instruction to the DBI that you want each row of the result as an hash reference.

A shortcut for the above idiom is to use selectall_arrayref with the Slice parameter.

my $employees_loh = $dbh->selectall_arrayref($query2, {Slice => {} +});

Be aware that, when fetching hashes, there is a possible pitfall that you should be aware of. See the relevant section below.

Getting a partial list of hashes

If you want some degree of control over the result, you may limit the columns to fetch, by including in the hash reference the names of the wanted keys.

my $sth = $dbh->prepare($query2); $sth->execute(); my $employees_partial_loh = $sth->fetchall_arrayref({name =>1, salary=>1}); __END__ $employees_partial_loh = [ { 'salary' => '5000', 'name' => 'Fred' }, { 'salary' => '5000', 'name' => 'Joshua' }, { 'salary' => '5600', 'name' => 'Kim' }, { 'salary' => '6000', 'name' => 'Dave' }, { 'salary' => '4500', 'name' => 'Sal' } ];

The values you pass are not important. You could also say {name=> undef, salary => undef} and it would have been the same. Only the keys are relevant. Notice also that the order of your hashref can be different from the order of the resulting hashref, since hashes in Perl don't have a guaranteed order (You all should know that, but a reminder shouldn't hurt :) ).

Getting the whole dataset into a hash

If you aren't satisfied with the default data structures offered by the DBI, you can build your own ones, using some imagination. Transforming a dataset into a hash is easy.

Using map and selectall_arrayref

selectall_arrayref returns a list of lists. Using map, you can convert it into a simple hash.

my %employees_h = map { $_->[1], $_->[2]} @{$dbh->selectall_arrayref($query2)}; __END__ %employees_h = ( 'Fred' => '5000' , 'Joshua' => '5000' , 'Kim' => '5600' , 'Dave' => '6000' , 'Sal' => '4500' );

Using map, shift and selectall_arrayref

You can build a more complex structure, for example assigning to each hash value an array reference. Depending on which column you want to transform into a key, you can either assign the key directly or use shift if the desired key is in first position.

# # either # my %employees_h2 = map { $_->[0], [ $_->[1], $_->[2] ]} @{$dbh->selectall_arrayref($query2)}; # # or # my %employees_h2 = map { shift @$_, [ @$_ ]} @{$dbh->selectall_arrayref($query2)}; __END__ %employees_h2 = ( '1' => [ 'Fred', '5000' ], '2' => [ 'Joshua', '5000' ], '3' => [ 'Kim', '5600' ], '4' => [ 'Dave', '6000' ], '5' => [ 'Sal', '4500' ] );

If you wanted to assign "name" as the hash key, then you could either modify the query to have "name" as first column in the dataset or modify the map code to go with your wishes ( map {$_->[1], [ $_->[0], $_->[2] ] }).

Getting a hash of hashes

The default DBI method for a hash of hashes is selectall_hashref, which needs as arguments the query to execute and the column to use as key, indicated by position. The position starts with 1. Don't get confused with the array index that would be 0 to get the same column from one row of a lol.

my $employees_hoh = $dbh->selectall_hashref($query2, 1); __END__ $employees_hoh = { '1' => { 'salary' => '5000', 'id' => '1', 'name' => 'Fred' }, '2' => { 'salary' => '5000', 'id' => '2', 'name' => 'Joshua' }, '3' => { 'salary' => '5600', 'id' => '3', 'name' => 'Kim' }, '4' => { 'salary' => '6000', 'id' => '4', 'name' => 'Dave' }, '5' => { 'salary' => '4500', 'id' => '5', 'name' => 'Sal' } };

Unlike the manual example above, where the hash was created with map, you can only create a hash of hashes, and the key used is also repeated among the values. The usefulness of such structure is left entirely to the reader imagination.

A pitfall with hash of hashes

As always when using hashes, there may be some hidden traps that you need to take into account.

If we modify the latest example and we use "salary" as key, we get the following data.

my $employees_hoh2 = $dbh->selectall_hashref($query2, 3); __END__ $employees_hoh2 = { '5000' => { 'salary' => '5000', 'id' => '2', 'name' => 'Joshua' }, '5600' => { 'salary' => '5600', 'id' => '3', 'name' => 'Kim' }, '4500' => { 'salary' => '4500', 'id' => '5', 'name' => 'Sal' }, '6000' => { 'salary' => '6000', 'id' => '4', 'name' => 'Dave' } };

It looks almost the same as the previous one, except that there are only four records instead of five. Where is "Fred?"

If you remember how hashes work, the answer is easy. Since the key was "salary", when a new hash with key "5000" was created, its value was overwritten, and the previous one discarded.

Lesson learned: whenever you use selectall_hashref, be sure to use a unique key or you'll suffer a data loss.

Binding columns and fetching

fetchall_arrayref is sometimes the fastest method for retrieving records. It depends on the number of rows to fetch and on their size. (see Speeding up the DBI for a comparison.)

The fastest method, which you can always count on, is fetchrow_arrayref. It is even faster when you combine it with binding.

What is binding? To understand what it is and why it is convenient, let me tell you how the DBI fetches records from the DBMS. Whenever a dataset is ready, the DBI uses one variable to transport the record from the DBMS to your application. Then, the data is copied from that variable into yours.

With binding, you are telling the DBI "Don't use your bucket to fetch water. Use mine, so we don't waste time pouring from one to the other."

The DBI has two methods to bind column results to your variables.

  • bind_col establishes a relationship between one column and your variable.
  • bind_columns binds a whole row of the dataset to a list of variables.

Let's see how to use them.

Binding scalars

Binding scalars is straightforward. Just have your scalar variables ready and call bind_col for each column of the dataset. Columns are specified by position. The variable bound must be a reference to your scalar.

my ($id, $name, $salary) ; $sth = $dbh->prepare($query2); $sth->execute; $sth->bind_col(1, \$id); $sth->bind_col(2, \$name); $sth->bind_col(3, \$salary); print "$id\t$name\t$salary\n" while $sth->fetchrow_arrayref; __END__ 1 Fred 5000 2 Joshua 5000 3 Kim 5600 4 Dave 6000 5 Sal 4500

Apart from speed, the great advantage of binding scalars is that you can call columns by name, rather than by position, making your application clearer.

Notice that the binding must happen after the call to execute. According to the DBI docs, this is not mandatory and a DBD driver could modify this behavior. If you write code with portability in mind, try to follow this rule, though.

Binding a list

To simplify the binding process, you can use a list instead of binding each scalar individually.

$sth->execute; # # either # $sth->bind_columns ( \$id, \$name, \$salary ); # # or # $sth->bind_columns ( \( $id, $name, $salary ) ); print "$id\t$name\t$salary\n" while $sth->fetchrow_arrayref;

Remember that the syntax  \( list ) returns a list of references to each element of the list. It does not work with arrays, for which you shoud refer to the next section.

Remember also that bind_columns requires that you pass as many elements as there are columns in the SELECT statement.

Binding an array

Let's say you are willing to relinquish the easy way of referring to the columns by name and want to bind the result to an array, because you don't want to pollute your application with many variables. Therefore you decide to bind the result set to an array. You can't use the syntax in the previous section, since \@array is a reference to an array, while bind_columns requires a list of references.

Once more, map to the rescue, to help passing the reference of each array element to bind_columns. The only tricky part here is to establish the right number of elements for the array

my @empl = (undef, undef, undef); $sth->execute; # # either # $sth->bind_columns( map {\$_} @empl ); # # or (perhaps better) # $sth->bind_columns ( \( @empl ) ); print join( "\t", @empl), $/ while $sth->fetchrow_arrayref;

You can also combine scalars and arrays and do something hybrid like this one, even though I don't know what good it can do to you. If you have such an array already established for other purposes, remember that also this idiom is available. Notice that, since the array elements are already references, map intervention is no longer needed.

@empl = ( \$id, \$name, \$salary ); $sth->execute; $sth->bind_columns( @empl ); print "$id\t$name\t$salary\n" while $sth->fetch;

Fetching a hash without speed penalty

In another article I measured the relative speed of fetching methods, and fetchrow_hashref came as the slowest one. So much slow, in fact, that you often choose not to use it, reasoning that you'd rather call columns by position than suffer the speed penalty.

However, if your only concern is to refer to columns by name, you can use a hash with almost the same efficiency you'd get with an array reference.

Binding a hash

Binding a hash is not difficult, once you know the syntax. It is as easy as binding a scalar, since you are not binding a hash, but the hash values.

my %rec = ( emp_id => undef, first_name => undef, monthly_payment => undef ); $sth->execute; $sth->bind_col(1, \$rec{emp_id}); $sth->bind_col(2, \$rec{first_name}); $sth->bind_col(3, \$rec{monthly_payment}); print "$rec{emp_id}\t", "$rec{first_name}\t", "$rec{monthly_payment}\n" while $sth->fetchrow_arrayref;

There are two distinct advantages here. The first is speed. You are fetching records at the same speed as fetching an array reference. And second, you are referring to columns by name. Not only that, but you are using the names that you have chosen, not the ones coming from the database, as in the case of fetchrow_hashref.

The difference between this idiom and fetchrow_hashref that you should keep in mind, is that with fetchrow_hashref you are creating a new hash at each loop, while here you are using the same data structure over and over. Therefore, if your purpose was to use a bound hash to push it into an array, forget it, because you would create an array of references where each element is a reference to the same hash. Use this idiom only if you want to improve your application readability, and referring to the columns by name.

Binding a large hash

Datasets can be quite large. Actually it is in such cases that you want to use hashes instead of arrays, because it is easy to use a record by position when you have just a handful of columns. If your result has fifty columns, instead, referring to them by position is a secure recipe for disaster.

On the other hand, binding a large hash to a result set could be a long task. Therefore we can use an idiom similar to the ones that have simplified our life when inserting records.

To carry out this task, we create an array of column names that must be in the same order as the ones in the SELECT statement. It could be the same list (recommended!), or an equivalent list with the names of our choice. After that, our ubiquitous map will get the job done.

my @fields = (qw(emp_id first_name monthly_payment)); $sth->execute; my %rec =(); $sth->bind_columns(map {\$rec{$_}} @fields); print "$rec{emp_id}\t", "$rec{first_name}\t", "$rec{monthly_payment}\n" while $sth->fetchrow_arrayref;

Comparing efficiency between fetchrow_hashref and manual hash fetching

If you want to be convinced of this hash fetching method efficiency, here is a simple comparison. Feel free to modify this code to suit your needs. The important thing to note is that a bound hash is almost as fast as an array reference.

use Benchmark (qw(cmpthese)); $sth = $dbh->prepare($query2); cmpthese (50000, { # adjust this to the speed of your machine fr_hashref => sub { $sth->execute; while (my $hash = $sth->fetchrow_hashref) { } }, fr_arrayref => sub { $sth->execute; while (my $aref = $sth->fetchrow_arrayref) { } }, man_hashref=> sub{ my %hash = (id =>undef, name => undef, salary => undef); $sth->execute; $sth->bind_columns( \$hash{id}, \$hash{name}, \$hash{salary}); while ($sth->fetchrow_arrayref) { } } }); __END__ (output edited for display purposes) Benchmark: timing 50000 iterations of fr_arrayref, fr_hashref, man_hashref... fr_arrayref: 5 wallclock secs ( 4.68 usr + 0.63 sys = 5.31 CPU) fr_hashref: 13 wallclock secs (12.13 usr + 0.83 sys = 12.96 CPU) man_hashref: 6 wallclock secs ( 5.08 usr + 0.68 sys = 5.76 CPU) Rate fr_hashref man_hashref fr_arrayref fr_hashref 3858/s -- -56% -59% man_hashref 8681/s 125% -- -8% fr_arrayref 9416/s 144% 8% --

A trap to avoid with fetchrow_hashref

Before closing this long review, I would like to warn against another hidden problem with hashes. It is not something that you find in your everyday applications, but it can happen. So here it goes.

The problem arises when your result set, by mean of a JOIN, has one or more columns with the same name. In this case, an arrayref will report all the columns without even noticing that a problem was there, while a hashref will lose the additional columns. Let's consider a simple example (which is meaningless in its stand-alone format, but it's just for the sake of the example).

$sth = $dbh->prepare(qq{ SELECT a.id, b.id FROM employees a INNER JOIN employees b ON (a.id=b.id) WHERE a.id = 1 }); $sth->execute; print "using a hashref\n"; while (my $hashref = $sth->fetchrow_hashref) { for (keys %$hashref) { print "$_ => $hashref->{$_}\t" } print $/; } # # using a hashref # id => 1 # $sth->execute; print "using a arrayref\n"; print "columns => ",join( ",", @{$sth->{NAME}}), $/; while (my $aref = $sth->fetchrow_arrayref) { print "@$aref\n"; } # # using a arrayref # columns => id,id # 1 1

The CROSS JOIN in this query will return one row with two columns, both containing '1'. Using a hashref, you'll get only one, while using an array-based fetching method you get both of them.

Of course, you can solve this problem by using aliases in your query (SELECT a.id AS id1, b.id AS id2) but you never know when code written by somebody else could end up this way. Just be aware of this remote possibility if you find less columns than you were expecting.

Conclusion

This is hardly the end of it. Perl is such a rich language and DBI is such a vast module that you can discover new idioms every day. Nevertheless, the list of idioms in this article should be enough to satisfy the average user's curiosity.

If you have any good idiom with the DBI, let me hear about it!

Update (1)
Fixed some spelling mistakes. Thanks to demerphq and jest

Update (2)
Fixed a link. Thanks to jdtoronto

Update (3)
Minor addition to array binding.

Update (3)
DBI Recipes now mentioned in DBI docs

Update (4)
Added a shortcut with selectall_arrayref and Slice. Thanks to blokhead for the hint.

 _  _ _  _  
(_|| | |(_|><
 _|   

Replies are listed 'Best First'.
Re: DBI recipes
by adrianh (Chancellor) on Aug 17, 2003 at 21:20 UTC

    Nice. ++.

    A variation on the hash-binding theme is to use the contents of the statement handles NAME attributes as the hash keys. Since in most instances you'll be interested in all the keys you are selecting this can be a useful shortcut. For example (untested code):

    my $sth = $dbh->prepare( "SELECT foo, bar, ni from some_table" ); $sth->execute; my %row; $sth->bind_columns( \( @row{ @{$sth->{NAME_lc} } } )); while ($sth->fetch) { print "foo = $row{foo}, bar = $row{bar}, ni = $row{ni}\n"; }

    In general the NAME attributes of statement handles are useful to avoid duplicating lists of fields in your SQL and in @fields type arrays.

Re: DBI recipes
by EdwardG (Vicar) on Aug 18, 2003 at 07:52 UTC

    ++ Very useful node, thanks gmax.

    For the sake of portability it might also be useful to note that ANSI SQL-92 provides a standard for querying table schema information:

    select TABLE_NAME from INFORMATION_SCHEMA.TABLES

    And to give an idea of what else is available via INFORMATION_SCHEMA;
    SCHEMATA DOMAINS VIEWS COLUMNS TABLE_PRIVILEGES COLUMN_PRIVILEGES USAGE_PRIVILEGES DOMAIN_CONSTRAINTS TABLE_CONSTRAINTS REFERENTIAL_CONSTRAINTS CHECK_CONSTRAINTS KEY_COLUMN_USAGE ASSERTIONS VIEW_TABLE_USAGE VIEW_COLUMN_USAGE CONSTRAINT_TABLE_USAGE CONSTRAINT_COLUMN_USAGE CONSTRAINT_DOMAIN_USAGE
    Source: An Introduction to Database Systems (7th edition), C.J. Date
      I'd like to point out that while most databases are mostly SQL-92 compliant, INFORMATION_SCHEMA.TABLES is not implemented in Oracle 9.2.0.4 (the latest patch release, that I'm aware of).

      I don't know if any other database implements this or not.

      ------
      We are the carpenters and bricklayers of the Information Age.

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

        It's a fair point you make. Both Oracle and Microsoft continue to stretch the meaning of compliance.

        It is almost meaningless now but at this rate by 2020 it could simply mean that the product engineers know how to spell I-S-O.

Re: DBI recipes
by Abigail-II (Bishop) on Aug 17, 2003 at 23:51 UTC
    Too bad you aren't checking whether your actions succeed, and neither are you using transactions when you are modifying your database. Now your examples are equivalent of Perl code that isn't strict, doesn't have warnings turned on, and isn't checking the return values of system calls.

    Pity.

    There's no execuse to not wrap your database modifying statements inside a transaction, and to not rollback on failure.

    As for checking for the existance of a table, that's rather easy in Sybase and MSSQL:

    SELECT 1 FROM sysobjects WHERE name = "whatever" -- Name of the table AND type = "U" -- User defined table. -- Use "S" for a system table.

    Abigail

      Thanks for the Sybase syntax.

      There's no execuse to not wrap your database modifying statements inside a transaction, and to not rollback on failure.

      Nonetheless, I'll give you some.

      I had error checking and transaction support in my test script, but I removed them on purpose.

      Why?

      • This post is focused on idioms, and I didn't want to burden the examples with something that doesn't belong there.
      • Error checking is an idiom in itself. You can check for errors by testing the object, by creating a handle sub, by using eval. Depending on your needs, the error checking could take several shapes, and I have approached this subject elsewhere.
      • Transactions are not a mandatory DBI feature. The DBI can deal with the big boys (Oracle, MS SQL, IBM DB2, and so on) but it can also handle DBF files, Excel spreadsheets, CSV files, and others where transactions are not supported.
      • When transactions are supported, while it is true that every update BELONGS TO a transaction, it is not true that every update IS a transaction (Update meaning that the transaction-related code could be around a group of statements rather than around a single one.) So, in addition to my goal of keeping the code focused on the example, I didn't want to give the impression that the transaction code is necessary for that particular idiom to work.
       _  _ _  _  
      (_|| | |(_|><
       _|   
      
        When transactions are supported, while it is true that every update BELONGS TO a transaction, it is not true that every insertion IS a transaction.
        I'm nitpicking, but with Oracle, DB2, Sybase, etc. every DML operation is run in a transaction. The transaction may automatically commit at the end of the operation (i.e. if AutoCommit is on), but there is always at least an implicit transaction as even a single row insert may in fact generate more operations through triggers (inserts to a shadow table, updates of summary tables, etc.) and these all will be guaranteed to perform as a single operation, even in the absence of explicit transactions in the DML.

        That being said I think that keeping the transaction logic out of the examples is a good thing, as long as their use and functionality is explained somewhere in the document - after all transactions are pretty central to RDBMS systems...

        Michael

        Error checking is an idiom in itself.

        Is it? If you were to discuss idiom for reading in a file line-by-line, would you present something like:

        open my $fh, "/path/to/file"; while (<$fh>) { chomp; ... }

        and dismiss the checking of the return value of open "it being an idiom in itself"? Error checking should be part of the idiom, and not something you bolt on later, when you are more experienced.

        Abigail
        --
        Now that you have passed your driving test, let me introduce you to the function of safety belts.

        I, for one, appreciate that you've focused on just one thing. This stuff is confusing enough as it is.
      DBD::Sybase supports the tables() method.

      However, this relies on the existence of the sp_tables stored procedure, which I know exists in most recent versions of Sybase (i.e. 11.x and later), but I don't know if MSSQL has it.

      Michael

        For the record, sp_tables exists in MSSQL versions 7, 2000, and as at 2005-10-14, also in the MSSQL 2005 CTP.

         

      Like many monks rewrite code to improve it, I am going to rewrite the first three paragraphs in the attempt to add a touch of tact to them:
      Thank you for contributing gmax, that is an impressive start of a cookbook. But ... shouldn't you have incorporated more error checking and transaction support? These are, IMHO, an integral part of DBI - actually, I strongly feel that there's no excuse to not wrap your database modifying statements inside a transaction, and to not rollback on failure. Along those lines, one could accuse your examples of being equivalent to Perl code that isn't strict, doesn't have warnings turned on, and isn't checking the return values of system calls.
      By the way, Re: Style, *again* appears to be the only node where you actually thanked someone here. Many, many have thanked you in the past - you should return the favor more often, if nothing more than for being appreciated.

      Years ago my mother used to say to me, she'd say, "In this world, Elwood, you must be" -- she always called me Elwood -- "In this world, you must be oh so smart or oh so pleasant." Well, for years I was smart. I recommend pleasant. And you may quote me.

      Elwood P. Dowd

      Abigail---how about a compromise here? You write the section on error handling and gmax folds it into the tutorial?

      --hsm

      "Never try to teach a pig to sing...it wastes your time and it annoys the pig."
        gmax could also just look into the DBI manual page, and fold that into the tutorial.

        Abigail

Re: DBI recipes
by jdtoronto (Prior) on Aug 18, 2003 at 12:39 UTC
    As others have said - awesome!

    Your reference to chromatics article at perl.com is incorrect, http://www.perl.com/pub/3001/03/dbiokay.html should refer to 2001 not 3001. He isn't really that old, nor am I :)

    ...jdtoronto

Error handling and transactions
by runrig (Abbot) on Jul 11, 2004 at 20:12 UTC
    Just to answer Abigail-II's concerns mentioned earlier, I'll mention that there is a section in the DBI docs on Transactions that goes over an idiomatic method of error handling. I won't repeat it in detail, but to summarize, set RaiserError on, wrap all DBI operations in one big eval block, check $@ after exiting the eval block, and rollback the transaction if there was an error (update: and commit if there was no error). You can even set RaiseError during the connect (in which case you would not rollback if there were an error during the connect), and do the connect inside the eval block, which the example in the docs does not do.

    An updated link: DBI transactions

      Just to be a little more concrete, I think the section being referred to is here.
Re: DBI recipes
by vek (Prior) on Aug 17, 2003 at 19:32 UTC
Re: DBI recipes
by polettix (Vicar) on Aug 26, 2006 at 00:01 UTC
    Regarding the automatic construction of the placeholder list when dealing with insertions:
    # ... my @fields = (qw(id name salary)); my $fieldlist = join ", ", @fields; my $field_placeholders = join ", ", map {'?'} @fields;
    I personally prefer to avoid the map and play with the repetition operator:
    my field_placeholders = join ', ', ('?') x @fields;
    But I don't really know if this operator was available back in 2003 :)

    Flavio
    perl -ple'$_=reverse' <<<ti.xittelop@oivalf

    Don't fool yourself.
Re: DBI recipes
by dragonchild (Archbishop) on Nov 02, 2016 at 20:34 UTC
    13 years later, this was valuable to me and appeared in Google's search. Thank you.


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: DBI recipes
by jcb (Parson) on Aug 11, 2019 at 00:48 UTC

    There is another way to handle binding array elements that I adapted from the example given in the DBI documentation:

    my @row = (); $sth->bind_columns( \( @row[0..6] ) );

    This uses Perl's autovivification to create the needed elements (seven in the example) and pass the references to bind_columns. The array slice provides the length, so you can start with an empty array. This is adapted from the example that binds to the values inside a hash and instead binds to the values inside an array.

    Generally, with this many elements, you probably want named variables, but this example is from code that loads a table into an array for display in a Tk::TixGrid widget with minimal other processing.

Re: DBI recipes
by Anonymous Monk on Jan 24, 2010 at 22:25 UTC
    I try the sample of "Binding a large hash" because perl is ten times slower than php to get data...and I lost 5 seconds!
    - 349s in perl with "Binding a large hash" method
    while ($sth->fetchrow_hashref)
    - 344s in perl with while ( my $hash_ref = $sth->fetchrow_hashref)
    - 32s in php with while ($row = db_fetch_array($statement,OCI_ASSOC))
        This seems to be treasure of DBI recipies --Prasad
      Edit: db_fetch_array oci_fetch_array

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://284436]
Approved by larsen
Front-paged by valdez
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others wandering the Monastery: (3)
As of 2024-09-07 11:11 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuli‥ 🛈The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.