Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Code factory

by Notromda (Pilgrim)
on Jul 07, 2003 at 17:53 UTC ( [id://272042]=perlquestion: print w/replies, xml ) Need Help??

Notromda has asked for the wisdom of the Perl Monks concerning the following question:

I'm starting a large (for me) project that has quite a few mysql tables, and I need to write quite a few functions to access these tables. I discoved that a lot of the code is the same, differing only in table names and such. For example,
sub getCustomerById { my $id = shift; my $dbh = sqlConnect(); my $sth = $dbh->prepare("SELECT * FROM customers WHERE cust_id=?") +; $sth->execute($id); my $result = $sth->fetchrow_hashref(); $sth->finish(); return $result; } sub getUserByUserId { my $id = shift; my $dbh = sqlConnect(); my $sth = $dbh->prepare("SELECT * FROM users WHERE user_id=?"); $sth->execute($id); my $result = $sth->fetchrow_hashref(); $sth->finish(); return $result; }
And there will be many similar instances. Is there a better way to generate all these funcions? Is that good practice, or should they all be written out in a source file somewhere?

Replies are listed 'Best First'.
Re: Code factory
by tadman (Prior) on Jul 07, 2003 at 17:59 UTC
    It's a simple case of re-factoring to strip out this redundancy. Make a simple template function first:
    sub getById { my ($table, $column, $id) = @_; my $dbh = sqlConnect(); my $sth = $dbh->prepare("SELECT * FROM $table WHERE $column=?"); $sth->execute($id); my $result = $sth->fetchrow_hashref(); $sth->finish(); return $result; }
    Now you can call this function with the appropriate parameters, or, try and make these wrapper functions:
    sub getCustomerById { my ($id) = @_; return getById("customers", "cust_id", $id); }
    This can be tuned to match any of your various "getXById" functions by simply supplying different parameters.
      Why not go a step farther?
      sub getByFieldval { my ($table, $field, $val) = @_; my $dbh = sqlConnect(); my $sth = $dbh->prepare("SELECT * FROM $table WHERE $field=?"); $sth->execute($val); my $result = $sth->fetchrow_hashref(); $sth->finish(); return $result; }
      Of course I would probably do more error checking.

      --

      flounder

(jeffa) Re: Code factory
by jeffa (Bishop) on Jul 07, 2003 at 19:45 UTC
    A simple naming convention could be a big help here. Anytime i create a database table, i always (99% of the time) use an "auto increment" id field named id. Not cust_id. Not user_id. Just id. After all, if i know the name of the table, i know what kind of id it is. Now, if that id is referenced in another table, then i append the full table name. Example:
    foo: id
         name
         status
    
    bar: id
         name
         status
         foo_id
    
    By keeping a convention such as this, i look up field names less often, and i can also take advantage of this in my code, especially when having to insert into more than one table at a time (second and third normalized tables).

    Class::DBI is a difficult module to master, but it powerfully abstracts such details away and allows you retrieve rows easily:

    my $cd = CD->retrieve(1); my @cds = CD->retrieve_all; my @cds = CD->search(year => 1980); my @cds = CD->search_like(title => 'October%');
    But once you have to start joining tables together, things can get a bit tricky. It is worth taking a look at, however.

    UPDATE:
    Oh, in the spirit of TIMTOWTDI, here is my version of a generic insert:
    sub generic_id_fetch { my ($table,$id,@field) = @_; my $field = @field ? join(',',@field) : '*'; $dbh->selectrow_hashref("select $field from $table where id=?",unde +f,$id); }
    And yes, i hardcoded the name of the id field. YMWV (your mileage will vary). Hope this helps. :)

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: Code factory
by hardburn (Abbot) on Jul 07, 2003 at 18:07 UTC

    First thing I see wrong is the use of SELECT *. Don't do that. It's slow. It hurts maintainability. It was meant for interactive use, and has no buisness being in a program.

    As for your actual question, I see two ways of doing this. Either create a closure and insert it into the main:: package space (or your current package, as the case may be), or create a subroutine which takes in SQL and a list of params for $sth->execute() and have it return the hashref. The closure version is a bit more advanced, but I suspect it will be more efficent. An example of using closures for this is in the Camel on pages 338-339.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

      Everyone busts on SELECT * as if it eats babies for breakfast and feasts on the blood of virgins in the afternoon. Come on! It might be a bit lazy, but if used carefully, it's not always that evil.

      In this specific example, the most data you're going to get is a single row, so SELECT * isn't quite as unruly as you'd expect. It's probably not going to return ten thousand columns. Granted, one should only request what they're looking for, but when you're not sure, you might as well just snatch it all than be caught short later. Memory today is measured in megabytes, not kilobytes, so you can be a little more laisez-faire about these things.

      Additionally, this is all put into a HASH, so, even if the columns are re-ordered, there's no real risk of damage to the program, provided all the requisite columns are still there. This, of course, is not guaranteed, so one must be careful about column changes, and test code thoroughly.

      I agree an unconditional SELECT * where you do combine that with a bind_param_array is playing with fire, but this code isn't quite that risky.
        If someone changes the name of a column without changing the code, and you have that column listed in the select, you will get an error message right away. If you select * and put it in a hashref, you might never get an error message at all from that situation. You'd just wonder why "last_name" was empty on all your reports for the last few months.
        Granted, one should only request what they're looking for, but when you're not sure, you might as well just snatch it all than be caught short later.

        Wrong. If you're not sure about the column names you are supposed to be getting from a table, you need to look it up and be sure before you "finish" writing the perl script (actually, before you code beyond the line that specifies the text of the query statement). If necessary, use additional queries in the perl script to look up the "meta-data" provided by the database server -- the stuff that describes the user tables -- to figure out column names and data types. (But that should never really be necessary, unless you're actually writing a tool specifically to probe a database schema.)

        I have seen a number of my colleagues get burned (and I've been burned after some of them left our shop) because someone added a column to a table, and suddenly their logic for handling the results of "select * ..." was broken in "mysterious, incomprehensible" ways.

      A note on select *. In some db's, you can do subselects with select * that increase maintainability. You can get things like:
      select drink.id, drink.name from ( select * from beverage where type = ? ) drink
      now mind you, it may not be as efficient as explicitly doign every single column, but if the sub query produces negligible data-size difference (in bytes) then you can definitely get away w/ it and have virtually no performance difference.

      Granted, this query can be rewritten so that it doesn't use a subquery, but subqueries are a feature that is needed sometimes. This is just merely an example showing only the features of a subselect and select * :)

Re: Code factory
by sauoq (Abbot) on Jul 07, 2003 at 19:54 UTC
    Is there a better way to generate all these funcions?

    Use closures. This should get you started:

    sub make_closure { my ($table, $column) = @_; return sub { my $id = shift; my $dbh = sqlConnect(); my $sth = $dbh->prepare("SELECT * FROM $table WHERE $column=?" +); $sth->execute($id); my $result = $sth->fetchrow_hashref(); $sth->finish(); return $result; }; } local *getCustomerById = make_closure("customers", "cust_id"); local *getUserByUserId = make_closure("users", "user_id");
    -sauoq
    "My two cents aren't worth a dime.";
    
      Can you take this a step further and show how I could print out the users name, assuming one of the 'user' columns is 'name'? I understand the closure returns a reference to a subroutine - sort of - so, *getUserByUserId is a typeglob(is that the right term?) containing a reference to a subroutine. I don't know how to take that and actually refer to the user's name. Help me understand please :)
        I don't know how to take that and actually refer to the user's name.

        That local *getUserByUserId = make_closure( 'users', 'user_id' ); simply gives a name to the sub ref returned by make_closure(). You can call it exactly as if you had the code sub getUserByUserId { . . . } instead. To get at the name, you only have to know that the function returns the results of a fetchrow_hashref(). So, something like this should do fine:

        my $user_data = getUserByUserId( $id ); my $user_name = $user_data->{name};

        -sauoq
        "My two cents aren't worth a dime.";
        
        It's now a function. Treat it as such. my %vals = getUserByUserId('foo');

        ------
        We are the carpenters and bricklayers of the Information Age.

        Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

        Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      Both flouder99 and sauoq have good answers. The closures method is what I was originally looking for, but simply having a generic function works as well. I'm also trying to look at this from the perspective of who uses these functions. Is it better to have a lot of different function names, or one function that can take a lot of arguments?

      Opinions?

        Is it better to have a lot of different function names, or one function that can take a lot of arguments?

        Generally, reducing the number of functions in your API is a good thing.

        What do you mean by "one function that can take a lot of arguments" though? Functions shouldn't take too many distinct arguments, but it is perfectly fine for there to be many acceptable values for those arguments.

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: Code factory
by shemp (Deacon) on Jul 07, 2003 at 18:46 UTC
    Here's something i've been using for quite a while:
    sub make_select_query { my ($table, $select_fields, $conditions) = @_; my $query = "SELECT "; if ( defined $select_fields ) { if ( ref($select_array) eq "ARRAY" ) { $query .= join ",", @$select_array; } else { $query .= $select_fields; } } else { $query .= "\*"; } $query .= " FROM " . $table; # only put the where and conditions in the query string # if conditions are present if (keys %$conditions) { $query .= " WHERE "; foreach my $key (keys %$conditions) { push @conds, $key . "=\'" . $conditions->{$key} . "\'"; } } $query .= join " and ", @conds; return $query; }
    Of course this has a number of shortcomings, such as:
    All conditions use equality. This could be worked around by having a condition hash like this instead:
    ... $condtions = { "age" => " >= 21 ", "sex" => " = female ", };

    But this is leading down the path that the more complicated your need, the less likely there will be a general solution that does exactly what you need.

      My select sub (part of Coruscate::DB (not on cpan) package):

      sub select { my ($self, $want, $query, @args) = @_; die "Not connected to a database!" unless defined $self->{'dbh'}; my $results; if (ref $want eq "HASH") { my $sth = $self->{'dbh'}->prepare("SELECT $query"); $sth->execute(@args); while (my $row = $sth->fetchrow_hashref()) { push @{$results}, { map { $_, $row->{$_} } keys %{$row} }; } } elsif (ref $want eq "ARRAY") { $results = $self->{'dbh'}->selectall_arrayref("SELECT $query", {}, @ +args); undef $results if $#{$results} < 0; } return $results; }

      I absolutely love my little module. Extremely simple interface that can do everything. Sweet :)


      If the above content is missing any vital points or you feel that any of the information is misleading, incorrect or irrelevant, please feel free to downvote the post. At the same time, please reply to this node or /msg me to inform me as to what is wrong with the post, so that I may update the node to the best of my ability.

        Instead of doing this:
        push @{$results}, { map { $_, $row->{$_} } keys %{$row} };
        Wouldn't it be better/faster to just do this:
        push @{$results}, $row;
        You already have a hash reference, why recreate it?
Re: Code factory
by dragonchild (Archbishop) on Jul 07, 2003 at 20:05 UTC
    A few notes:
    1. Use prepare_cached(). These look to be often-called functions. prepare_cached() will reduce the cost after the first one by caching the prepared statement for later use.
    2. If at all possible, don't use "SELECT *" and don't use fetchrow_hashref(). Instead, use "SELECT A, B, C" and bind_columns() and fetch().
    To implement the second solution, I'd do something like:
    sub build_selector { my ($dbh, $table, $field, @selectables) = @_; my $selectables = join ',', @selectables; return sub { my ($val) = @_; die "No val passed into selector" . $/ unless defined $val && length $val; my $sth = $dbh->prepare_cached(<<"_SQL_"); SELECT $selectables FROM $table WHERE $field = ? _SQL_ die "Couldn't prepare handle: " . $dbh->errstr unless $sth; $sth->execute($val) || die "Couldn't execute handle" . $dbh->errstr; my @vals = $sth->fetchrow_array; $sth->finish; return wantarray ? @vals : \@vals; }; }
    Even though it's below, this code is untested. Use at your own risk.

    Update: Removed superfluous quotes around the placeholder.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      It's also important to remember to use parameters when using prepare_cached.
      select * from table where lname = ?
      rather than
      select * from table where lname = $foo
      You should cache one statement rather than many small variations.
Re: Code factory
by matsmats (Monk) on Jul 08, 2003 at 12:28 UTC

    Try DBIx::SQLEngine. It's a high level interface to SQL in the same vain as some of the suggestions you have gotten here. And its quite wonderful to use if it suits your program.

    It works typically like this:

    $sqldb->do_insert( table => 'students', values => { 'name'=>'Dave', 'age'=>'19', 'status'=>'minor' } ); $hashes = $sqldb->fetch_select( table => 'students', criteria => { 'status'=>'minor' } );

    If your project is large and might be running for a while, you should wrap all the different calls to the database in separate functions even if you've got a nice generalized interface. You will eventually have to make some workarounds or sanity checking on the data before using them, and it's thankful to have that located one place.

    Mats

      If your project is large and might be running for a while, you should wrap all the different calls to the database in separate functions even if you've got a nice generalized interface. You will eventually have to make some workarounds or sanity checking on the data before using them, and it's thankful to have that located one place.

      I think this is why I'm leaning toward having the many function names, even if using the factored function for the basic inserts, as tadman or flounder99 suggested. This is a mod_perl application, so there's got to be some good data checks somewhere.

      I also don't like the look of a bunch of calls like:

      getByFieldval("sometable", "somecolumn",$value1);
      all over the place. I have trouble remembering the table names and column names... Shouldn't various layers of the application be somewhat insulated from one another? Data encapsulation and all that... :)
Re: Code factory
by Joost (Canon) on Jul 08, 2003 at 14:40 UTC
    Why not use one of the available object oriented persistence modules available on CPAN. Look at this example from Class::DBI:
    package Music::DBI; use base 'Class::DBI'; Music::DBI->set_db('Main', 'dbi:mysql', 'username', 'password'); package Artist; use base 'Music::DBI'; Artist->table('artist'); Artist->columns(All => qw/artistid name/); Artist->has_many('cds', 'CD' => artist); # ... my $artist = Artist->retrieve($id);
    (This module can actually do a lot more interesting things than that)

    A good place to look for more info on object oriented persistence is the POOP comparison website.

    -- #!/usr/bin/perl -w use strict;$;= ";Jtunsitr pa;ngo;t1h\$e;r. )p.e(r;ls ;h;a;c.k^e;rs ";$_=$;;do{$..=chop}while(chop);$_=$;;eval$.;
Re: Code factory
by blue_cowdawg (Monsignor) on Jul 08, 2003 at 22:12 UTC

    What I've standardized on for my database type applications is a series of object oriented modules that get extended by (for a lack of better term) child modules.

    The Data Object

    Let's start with the dataObj module. I mentioned in another thread that I keep my database and other generic information in another module and I'll hand wave that here.
    package dataObj;
    
    use dbConf;   # This module has the db connect info
    
    my $connected=0;
    
    
    # this rarely gets actually invoked.
    sub new { 
       my $proto=shift;
       my $class = ref($proto)|| $proto;
       my $self = { dbh=> undef,
                    sth=> undef
                  ... and other goodies
                  }
    
        bless $self,$class;
    
        $self->connect2db if not $connected; 
    
        return $self;
    }
    
    sub connect2db {
         my $self=shift;
         my $method=( $_[0] ? $_[0] : "oracle");
    
         my $config=new dbConfig($method);
    
         # The connectParms method of dbConfig returns an 
         # array with the DSN, LOGIN and PASSWORD info
         $self->{dbh} = DBI->connect($config->connectParms)
                or die "Could not connect " . $DBI::errstr;
    }
    
    sub prepare {
        my $self=shift;
        my $sql=shift;
    
        die "Preapre without connect!" if not $connected;
        $self->{sth}=$self->{dbh}->prepare($sql)
              or die "Could not prepare\n" . $sql . "\n"
                      . $self->{dbh}->errstr;
    }
    
    sub execute {
        my $self=shift;
    
        return if not $self->{sth};
    
        $self->{sth}->execute(@_);
    }
    .... and so on...
    
    1;
    
    

    CAVEAT:The above code is just to get the idea across and is not to be considered prime time code

    I would then "overload" the module with the data object in question. Presumably I would put methods into the data object that deal with types and such intelligently

    Example:

    package dataObj::contactInfo;
    use vars qw/ @ISA /;
    use dataObj;
    @ISA=qq(dataObj);
    
    sub new {
        my $proto=shift;
        my $class=ref($proto)||$proto;
        my $self= {
                   name => undef,
                   email => undef
        };
    
         bless $self,$class;
    
         $self->connect2db("mysql1");
    
         return $self;
    }
    
    sub get_contact_by_name {
         my ($self,$name)=@_;
    
         return if not $name;
         $self->prepare("select email from contacts where name= ?");
         $self->execute($name);
         my $vals=$self->fetchrow_hashref
         ...blah...blah....blah...
    
    
    
    1;
    

    Hopefully you get the idea.


    Peter L. BergholdBrewer of Belgian Ales
    Peter@Berghold.Netwww.berghold.net
    Unix Professional
Re: Code factory
by Anonymous Monk on Jul 09, 2003 at 04:25 UTC
    I see a whole bunch of comments here on how to access data in various ways, but I think what you're really looking for is a good way to have most of the code automatically generated. What you're looking at can be generated dynamically using any of their methods, or you can write a script to analyze the MySQL tables for unique keys and generate the functions that need to grab data for these keys.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://272042]
Approved by Mr. Muskrat
Front-paged by Enlil
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others sharing their wisdom with the Monastery: (5)
As of 2024-03-29 12:15 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found