Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things

Abstracting sql

by BUU (Prior)
on Jul 31, 2003 at 06:21 UTC ( #279495=perlmeditation: print w/replies, xml ) Need Help??

I'm designing a fun little web app and one of the challenges I've run in to was abstracting away the sql. My reasons to do so is two fold. The first is that I wish to be as database agnostic as I possibly can and the best way to achieve this seems to me to abstract out the sql, the actual part that interfaces with the database, in to a small class I could just swap in an out, for example: User_data_mysql and User_data_postgre and User_data_flatfiles. My second reason is just one of elegance.. it always seemed inelegant and messy to have random bits of sql strewn around.

After pondering it for a while, I've come up with two ideas. My first idea is that I could write database accessing functions in the form of one for each table for each type of database. Eg I would have a function like User_Data_Mysql::get_data; and I could pass it things like: get_data(['name','id','pw'],{order_by=>'id'}); but the problem there is that writing a function like that quickly becomes horrendously complicated to the point where it feels like I just rewrote sql, except in perl and much uglier. I suppose with proper planning this function wouldn't be too grotesque but I basically feel that $sql="select foo,baz,bar from qux where foo=1 and baz=2 order by bar limit 0,20" is much cleaner then say,  qux::get_data([qw/foo baz bar/],order=>{foo=>1},limit=>[0,20]);

My second option seems to me to just write all sql I would need and store them in functions, eg:  sub get_user_data_by_id; sub get_user_data_by_name; sub get_user_data_by_date; and so on and so forth. I could just wrap these functions in a module with whatever database they apply to and just do mysql::get_user_data_by_name; but of course this is A) massively inelegant and B) seems terribly fragile and just plain ugly.

Replies are listed 'Best First'.
Re: Abstracting sql
by mpeppler (Vicar) on Jul 31, 2003 at 06:36 UTC
    It's already been discussed several times around here. Personally I find that building a module that fetches data from table(s) such as you propose with get_data([qw/foo bar baz/]...) works for simple situations, but will rapidly break when you start having a complex database where multiple tables need to be accessed for any particular request.

    Therefore I much prefer creating logical requests that map to perl subroutines (i.e. addNewUser(), getUserInformation(), etc) that either call SQL stored procedures (my preference) or, if these aren't available, have the SQL in the subroutine.

    This will also have the advantage of forcing you to structure your app a little so that you only need to create a limited number of database access methods/subroutines. This will have a huge benefit on the maintainability of the database, and on figuring out what indexes are needed (f.ex. right now one of my clients has a largeish database (about 20GB) where there is very little structure - trying to figure out what indexes are needed, and what queries are actually run is a nightmare, requiring the use of "auditing" (i.e. recodring *all* SQL that is sent to the server) and sifting through gigabytes of SQL code...)


      I tend to use a combination of the packages/subroutines with business logic functionality, a database wrapper and an SQL phrasebook.

      By using Class::Phrasebook, I have successfully transfered from a CSV mockup, through a MySQL implementation to MS SQL Server 7 (and stored procedures), by only changing the phrasebook. The functionality and wrapper haven't changed, just the database. Having the SQL wrapped up in the code, for me, has the same argument as wrapping HTML up Perl or visa versa. It can get ugly and awkward to maintain.

      Using routines that implement the business logic is always a good thing. However the DB wrapper should be the one to handle all the connections, extract the SQL from the phrasebook, structure the data and handle any errors, so that if anything goes wrong, it's only one point of failure not potentially several hundred. My DB wrapper also handles the auditing required, and obtaining the ID for inserted records. To continually repeat this for each module containing business logic seems unnecessary.

      I've heard good things about Class::DBI, however, it does seem to be more aimed at SQL based queries not stored procedures.

      Barbie | Birmingham Perl Mongers |

        You might want to compare Class::Phrasebook to SQL::Catalog. I would be interested in hearing your feedback.

        Carter's compass: I know I'm on the right track when by deleting something, I'm adding functionality

      Well, I'm all for being logical in my requests and highly structured code. Unfortunately I can't used stored procedures as maybe fifty percent of the databases out there implement them. As to your other solution, I really like the idea of using data functions to map from perl to sql. Unfortunately it looks like using your examples would give me the worst of both worlds, as I would still end up with sql being strewn around in the middle of my code and I see no easy way to easily be "database agnostic" with those subs, other then just writing one set of subs for every single database I want to use.

      I suppose in the end I'm probably going to have to do something similar, with different modules containing functions for each different database, but the idea of having large numbers of subroutines that would essentially be exactly the same, for example, postgre subs and mysql subs would probably use almost the exact same sql and so forth, but I couldn't just use the same subroutines because of the one sub where I have to use different sql for whatever reason.

      I suppose that writing individual classes for each database and combining it with some sort of factory method would be the simplest way to handle it, perhaps something like:
      my $db_type = defined $use_mysql?'mysql':'postgre'; # really simplifie +d my $user = new User($db_type); $user->name; $user->date; $user->stuff;
        I really like the idea of using data functions to map from perl to sql. Unfortunately it looks like using your examples would give me the worst of both worlds, as I would still end up with sql being strewn around in the middle of my code and I see no easy way to easily be "database agnostic" with those subs, other then just writing one set of subs for every single database I want to use.
        Using these data functions you at least limit the SQL to a specific module (or set of modules), and this set of modules could quite easily be a hierarchy that takes database specificities into account.

        Another possibility is to store the SQL in a separate file (text file, DB_File or something else) indexed by logical request. Load the file at startup and map perl subroutines to SQL statements.

        There are always a bunch of ways to do these things... :-)


Re: Abstracting sql
by edoc (Chaplain) on Jul 31, 2003 at 06:26 UTC
      I was under the impression that Class::Dbi was primarily used as a method for storing objects in a relational database. I've looked at it, but it seems entirely too "heavy" and not granular enough for what I would need and I was under the impression that its performance profile left a little something to be desired..?

        well, yeh, kinda.. but maybe it sounds better if you turn that around to say that Class::DBI turns your records into objects. I think you would find it to be a huge head start for what you want from your abstraction layer.

        As for granular, once you've got your classes set up it's quite simple to create custom sql queries.

        The main performance hit that I watch is if you are doing a search that will return a large number of results. In this case Class::DBI takes the data from each record and initialises it into an object. If I just want the data then I write a customised query and method to go with it and return either the select handle or an array of hashes (or array of arrays..) which avoids 90% of the extra overhead Class::DBI would naturally impose.

        __PACKAGE__->set_sql( list => "select uname,fname,sname from users" ); sub list{ my ($class) = @_; my $sth = $class->sql_list(); $sth->execute(); return $sth->fetchall_arrayref({}); }

        Update: oops.. corrected code above..

        and when you want the user list..

        my @users = My::CDBI::User->list;

        With the common task of grabbing one record the automatic objectifying is great and makes it really simple to access/modify records.

        my $user = My::CDBI::User->retrieve(123); $user->fname('Fred'); $user->sname('Flintstone'); $user->update;



        Class::DBI really does sound like what you want here. It's not very heavy at all (the code is quite short) and it lets you address your tables as objects quickly and easilly. It also has very good performance.
        Class::DBI Intro
        This is the second time this week I have seen a reference to Class::DBI being used for storing objects this is simply not what it was made for. Class::DBI allows for working with the data as objects, basically one record = one object. It also allows for relationship integrity based on keys at an application level rather then the engine level. As for the performance issue, it would be subjective at best depending on size of project.
Re: Abstracting sql
by chunlou (Curate) on Jul 31, 2003 at 07:39 UTC

    When you need User_data_mysql, User_data_postgre, User_data_flatfiles, etc. or something like User_Data_Mysql::get_data, polymorphism will be a good option to consider.

    One reason you don't want to do $sql="select foo,baz,bar from qux where foo=1 and baz=2 order by bar limit 0,20" is that the DB becomes responsible for most of the error handling, whereas something like getdata(\%data) helps trap errors before it even reaches the DB

    You probaby would want to start with the design of a generic data access module all your app specific modules will use. That is, something like: --> --> --> MySQL --> --> --> Postgre --> --> --> FooDB

    MyDB will most likely have to be OO, since at least you have the persistent data object "DB Handler" hanging around. OO way is more convenient to handle it.

    MyDB won't have any table and app specific methods. Only a few generic database methods, such as "insert_row," "delete_row," "show_table_asXML," "show_table_asArrays," etc.

    It would work if you confine your methods to do the SQL stuff that are more or less supported by every DB. If not, you might want to split out some DB specific methods into some DB specific module.

    At application level, you might want to create two types of modules, one is tables specific but not business logic specific (say,; the others are business logic specific (such,, etc.). Something like:

    MyAppMessageBoard --> MyAppTable --> MyDB --> DBI --> MySQL
    MyAppThumbnail --> MyAppTable --> MyDB --> DBI --> MySQL
    MyBizLogic --> MyAppTable --> MyDB --> DBI --> WhateverDB

    If you want to create methods with hardcoded parameters such as "get_user_data_by_id," "get_user_data_by_name," etc. (sometimes there're good reasons for that, such as being backward compatible with legacy code), you could try to see if you can design in such a way that you can generate the methods as automatically as possible, such as (not a direct example but it shows the idea):

    #! /usr/local/bin/perl -w use strict ; # ########################################################### # Schema: Survey Database my %Survey = ( tblq => { primID => ['q_id'], cols => ['anstmpl_id', 'short_label', 'question' +, 'updated'] }, tblqtmpl => { primID => ['qtmpl_id'], cols => ['name', 'descr', 'updated'] }, tblqtmpl_q => { primID => ['qtmpl_id', 'q_id'], cols => ['seq', 'notes', 'updated'] }, ); # ########################################################### # Package: Survey Database {package Survey ; # assume %Survey defined somewhere # = = = = = = = = = = = Contructor = = = = = = = = = = # sub dbconnect { } # = = = = = = = = = = Public Methods = = = = = = = = # foreach my $tbl (keys %Survey){ # define subroutines <action>_<table> at runtime eval qq/sub insert_$tbl { return (shift)->_insert_table(_tblname(),\@_)}/ ; eval qq/sub update_$tbl { return (shift)->_update_table(_tblname(),\@_)}/ ; eval qq/sub delete_$tbl { return (shift)->_delete_table(_tblname(),\@_)}/ ; } # = = = = = = = = = = = Private Methods = = = = = = = =# sub _insert_table { my ($self, $tbl, $values) = (shift, shift, shift) ; # ... and more... } sub _update_table { my ($self, $tbl, $values) = (shift, shift, shift) ; # ... and more... } sub _delete_table { my ($self, $tbl, $values) = (shift, shift, shift) ; # ... and more... } # ----------------------------------------------------- sub _tblname{ # return <tbl>, if called by <action>_<tbl> # e.g. show_tblanstype ( my $sub = (caller(1))[3] ) =~ s/.*::.*?_(.*)/$1/ ; return $sub ; } } # ######################################################## # Test Script: Survey Database Package # tblq $Survey->insert_tblq(\%data) ; $Survey->update_tblq(\%data) ; $Survey->delete_tblq(\%data) ; # tblqtmpl $Survey->insert_tblqtmpl(\%data) ; $Survey->update_tblqtmpl(\%data) ; $Survey->delete_tblqtmpl(\%data) ; # tblqtmpl_q $Survey->insert_tblqtmpl_q(\%data) ; $Survey->update_tblqtmpl_q(\%data) ; $Survey->delete_tblqtmpl_q(\%data) ;

    Update: In theory, all your specific SQL statements and stored procedure calls go to MyAppTable and nowhere else. MyDB might have few or no SQL statements at all.

Re: Abstracting sql
by Abigail-II (Bishop) on Jul 31, 2003 at 08:29 UTC
    It seems to be that any Perl programmer at least once in his life either has to write an HTML template system, or a database abstraction layer.

    Please listen to what Michael Peppler said earlier in this thread. He's absolutely right.


      It might help to note that mpeppler happens to also be the author and maintainer of nearly every reputable CPAN module that deals with the Sybase RDBMS (and, because Microsoft buys all of its software first, MS SQL*Server). This includes DBD::Sybase and sybperl. (Read his homenode for more info.)

      So, regardless of whether you think his momma dresses him funny or not, he has a few legs to stand on when he talks about best practices for dealing with RDBMS-interaction.

      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: Abstracting sql
by nite_man (Deacon) on Jul 31, 2003 at 07:22 UTC

    In our project we use yourself DBI abstraction (it's not perfect solution but it satisfys our requirements). There are implemented some methods for storing and retriving data from database (as you suggest). But some time ago I've found a good DBI abstraction for build SQL query - DBIx::SearchBuilder. I'm planning to use it because it's flexible and multipurpose tool.

    Hope that DBIx::SearchBuilder will be useful for you.

    _ _ _ _ _ _
      M i c h a e l

Re: Abstracting sql
by l3nz (Friar) on Jul 31, 2003 at 16:38 UTC
    I would not really care about the SQL part of it, but I'd write a general swappable data access module. Something I frankly don't like is how the use of realtional databases tends to make developers think in terms of tables and SQL for every problem. A swappable data access module will do the loading and saving at a higher level, and allow for any weirdness in the actual storage used - be it a database, text files or whatever. On the other side, such a module requires a clear view of your problem, while inserting SQL queries there and here in your code is quick and dirty to do.
      Yeah, thats actually more or less what I was thinking, you just put it in much clearer terms =]. I think I was using "sql" as more or less a metaphor for any "Specific Data Acessing method", but sql was shorter and came to mind. But yes, a prime factor of my question (I'm not sure how clear I made this) was to attempt to be as data source agnostic as possible. A clear view of my problem should be no problem, I'm trying to spend as much time on design as I can, I'm tired of randomly hacking out programs and running in to road blocks half way in =/.
Re: Abstracting sql
by Mur (Pilgrim) on Jul 31, 2003 at 18:00 UTC
    Re: your first idea of writing database accessing functions. I did just this, coming up with a "Table" module that allows this sort of syntax:
    use Nexcerpt::Table qw(names cities states); my $Michigan = states->get({ abbrev => 'MI' }); $Michigan->put({ timezone => 'Eastern' }); my $Puerto_Rico = states->new({ abbrev => 'PR', name => 'Puerto Rico', ... }); $Puerto_Rico->put; print $Puerto_Rico->{sequence}; # fields auto-updated on put my @homes = cities->getall('WHERE name = ?','Springfield'); etc.
    I did all this before coming across Class::DBI, which might have rendered this moot, but I don't regret it because I learned so much about OOP I didn't know before. (I'm currently unable to share the whole code base of this; I hope I can someday. Meantime, I'm happy to discuss it via email.)
    Jeff Boes
    Database Engineer
    Nexcerpt, Inc.
    vox 269.226.9550 ext 24
    fax 269.349.9076
    ...Nexcerpt...Connecting People With Expertise
Re: Abstracting sql
by CountZero (Bishop) on Jul 31, 2003 at 19:49 UTC

    What's wrong with having bits of SQL strewn around your code? It seems to be the same as saying that it is messy and wrong to have bits and pieces of regexes left and right in your program and that it would be better to have it abstracted away into some OO-package.

    Whatever you do, somewhere in your program you will have SQL-statements to run the queries on your database. Maybe they are hidden inside Class::DBI or someplace else, but they are still there and I like to have them handy nearby so I can tinker with them as necessary. Of course, if you happen to use two times or more very similar SQL in your program, you better turn it into a subroutine which with the use of some well-chosen parameters can go very far in avoiding to having to type (almost) the same SQL-code again and again.


    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      >> It seems to be the same as saying that it is messy and wrong to have bits and pieces of regexes left [...]
      The key difference here is that regexen are not generally considered to be a seperate language, where as SQL quite definately is. It's fairly common to expect any perl programmer to be at least vaugely familiar with regexen but not necessairly familiar with SQL.

      Three advantages I can think of off the top of my head are:
      1. Familiarity - I can have someone who specializes in databases review my sql to make sure it is well formed and optimized without having to much with excessive amounts of perl, where as the perl programmer who doesn't know sql can just as easily only focus on the perl parts, as all he has to do is call some function if he wants data.
      2. Modularity - I could easily reuse sql statements and so on, a change in one place can instantly change everything that relies on this piece of code
      3. Portability - it becomes much easier to write applications to work on multiple databases when the SQL, the actual code that interfaces with the database, is hidden behind one or more abstraction layers.

        Absolutely. In addition, it makes applying schema changes to the database much easier, because you know that the SQL that may be affected is all in one place.


        To each his/her own experience:

        1. Familiarity: I write as easy SQL as I do Perl, so there is no benefit in "sub-contracting" the SQL side to someone else.
        2. Modularity: You can of course only change the internals of your DB-access modules as the interface to the rest of your program must remain the same. Most of the time I find that I need to add some extra bit of functionality, which would break the interface anyhow (or make it horrendeously complicated to enable it to remain backward compatible for all things which depend on it already).
        3. Portability: Even with something like Class::DBI you have to set-up a lot of sub-classes for each database, so I hardly see the benefit of going through all that work for each database. I write as quickly my own code (with liberal use of the good ol' copy'n'paste).
        Which shows once again that there is no solution which fits and suits everybody. And that's the way it should be!


        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Abstracting sql
by princepawn (Parson) on Jul 31, 2003 at 17:12 UTC
    My second option seems to me to just write all sql I would need and store them in functions This is what DBIx::AnyDBD does.

    Carter's compass: I know I'm on the right track when by deleting something, I'm adding functionality

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://279495]
Approved by mpeppler
Front-paged by Courage
[atcroft]: Is there an example somewhere of a way to order nodes when using Tree::DAG_Node?

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (3)
As of 2018-04-21 00:42 GMT
Find Nodes?
    Voting Booth?