
How to structure applications using a RDBMS

by dragonchild (Archbishop)
on Jul 18, 2002 at 19:47 UTC
dragonchild has asked for the wisdom of the Perl Monks concerning the following question:

(This is more of a general design question vs. a specifically Perl question.)

I'm trying to work on decoupling the application I'm working on from its database. The goals here are:

  • Minimize the impact of schema changes on the business logic as well as the control structure
  • Bring together all SQL so that the business and controller logic aren't dealing with the underlying implementation of how the data is stored.

My problems are thus:

  1. I know that I want to bring all the SQL together, so having every SQL call wrapped in a function is easy enough.
  2. But, I know I'll end up with a ton of individual SQL_ functions with very little way to navigate through them.
  3. At first glance, every single SQL call will need its own SQL_ wrapper, even calls that differ only trivially (such as in sort order)

Have people worked out some idea of data structures to pass back and forth? I'm not so worried about memory or CPU as I am about cohesion and coupling.


Re: How to structure applications using a RDBMS
by runrig (Abbot) on Jul 18, 2002 at 19:56 UTC
Re: How to structure applications using a RDBMS
by mpeppler (Vicar) on Jul 18, 2002 at 21:59 UTC
    Personally I like to define logical operations on the data, write perl subroutines that map to these operations, and then do whatever database manipulation that needs to happen from these subroutines.

    As most of my work is with Sybase databases there is often a one-to-one mapping between these logical operations and a Sybase stored procedure, in which case there is no explicit SQL code in the application - only a call to a stored procedure, which adds an additional layer of abstraction.

    This may not be the best solution in all cases, but it's what I've found easiest to maintain.
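A minimal sketch of that layering, with hypothetical procedure and argument names: each logical operation is one small Perl sub, and the only "SQL" the application ever holds is the exec line for the stored procedure.

```perl
use strict;
use warnings;

# Build a Sybase-style stored procedure call with placeholders.
# Returns the statement text plus the bind values for DBI.
sub proc_call {
    my ($proc, @args) = @_;
    my $placeholders = join ', ', ('?') x @args;
    return ("exec $proc $placeholders", @args);
}

# One logical operation == one sub; all it knows is the proc name.
sub cancel_order {
    my ($dbh, $order_id, $reason) = @_;
    my ($sql, @binds) = proc_call('cancel_order', $order_id, $reason);
    return $dbh->do($sql, undef, @binds);
}
```

Swapping databases (or rewriting a procedure) then touches only the stored procedures, not the Perl.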


      Further extending your point, I would suggest separating data and functionality into chunks that could be represented by a Perl object. These objects will hide the complexity of RDBMS interaction (SQL etc). Objects could then be related and structured in a meaningful way for optimal performance and minimal maintenance cost.

      Actually, I'm just about to write a system (in Perl) for artist/artwork and article management. The back end is a MySQL database that holds data about various artists, artworks, and related articles. The task is to come up with a web tool that helps clients access and manipulate this data.

      For this to work, I will first design Perl objects to represent an individual artist, artwork, and article. These objects will also implement an interface for changing/updating the data (in the MySQL database) that each object encapsulates. Finally, I'll simply write a template-driven CGI script that makes use of such objects....
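A rough sketch of what one such object might look like (the class, table, and column names here are invented for illustration; the point is that only save() would ever contain SQL for this table):

```perl
package Artist;
use strict;
use warnings;

# Hypothetical sketch: the object hides where its data lives.
sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, country => $args{country} }, $class;
}

# Combined accessor/mutator for the name field.
sub name {
    my $self = shift;
    $self->{name} = shift if @_;
    return $self->{name};
}

# save() would hold the only INSERT/UPDATE statement for this table.
sub save {
    my ($self, $dbh) = @_;
    $dbh->do('INSERT INTO artists (name, country) VALUES (?, ?)',
             undef, $self->{name}, $self->{country});
}

1;
```

The CGI layer would then deal only in Artist objects, never in rows.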

      # Under Construction
Re: How to structure applications using a RDBMS
by Ovid (Cardinal) on Jul 18, 2002 at 22:57 UTC

    I'm not entirely sure that I understand what you're asking, but I'll answer what I think you meant.

    For every program, we basically have two things: things that need to be done and the data that it's done to. Things that need to be done (procedures) are like:

    • Get orders for given month and year
    • Sum orders and generate inventory report based on orders
    • Output order totals report and inventory totals report

    The "procedural" type of stuff goes into procedural code. However, creating objects as wrappers around all data access can be convenient. For example, you don't return an order as a hashref, but as an object (ignore the names, this is slapped together):

    my $order_obj = Foo::Order->new;
    my $orders    = $order_obj->by_date( $date );
    foreach my $order ( @$orders ) {
        my $items = $order->items; # arrayref of item objects
        my $total = $order->total; # returns a value
        # do more stuff
    }

    Now, you can rearrange your procedures all you want and not worry about the data. Also, you can change the internals of the objects all you want as long as you keep the API. If you want to minimize the impact of schema changes on the business logic, this approach all but eliminates it: update your object code and ensure that the API doesn't change.

    This has another advantage people don't think about. If you force the programmer to use accessors any time you return anything other than a scalar (and mutators for virtually all data), you have a single point of control for the data. If you realize that a particular piece of data must never be set to negative three, then by forcing the programmer through your mutator you have one and only one spot to worry about. Giving them direct access to the data means there are probably multiple places you need to check.
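That "never negative three" rule becomes a single check in the mutator. A sketch, assuming a plain hash-based object (the class and field names are hypothetical):

```perl
package Account;
use strict;
use warnings;
use Carp qw(croak);

sub new { return bless { balance => 0 }, shift }

# Single point of control: every write to balance passes through
# here, so the validity check lives in exactly one place.
sub balance {
    my $self = shift;
    if (@_) {
        my $new = shift;
        croak "balance may not be negative" if $new < 0;
        $self->{balance} = $new;
    }
    return $self->{balance};
}

1;
```

A caller poking at `$acct->{balance}` directly would bypass the check, which is exactly why the accessor discipline matters.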

    Further, you can put the SQL in the objects themselves, or push the SQL down into another layer so that even your objects don't connect to the database directly. This is a great refactoring. You can define some consistent ways for individual objects to identify their tables, fields, and keys, and then write generic routines like this:

    sub delete_me {
        my $self = shift;
        my $sql  = "DELETE FROM $self->{_table} WHERE $self->{_id_name} = ?";
        return $self->{_dbh}->do( $sql, undef, $self->{_id_value} );
    }

    You can abstract away a lot of code with this, but you will almost always have special cases. It ain't perfect :)

    Of course, this is great if you have black-box unit testing. Update your objects to your heart's content, run the tests, and you know your API hasn't changed and any procedural code that's dependent on the API is much more likely to work without changes.

    Sorry if this seemed pedantic :)



Re: How to structure applications using a RDBMS
by gav^ (Curate) on Jul 19, 2002 at 01:10 UTC
    You might want to look at two cunning modules by Michael Schwern/Tony Bowden, Ima::DBI and Class::DBI. The first gives all your classes nice cached DBI handles and SQL statements. In your class you set up your SQL statements:
    My::User->set_sql( 'getUsers', 'select * from users', 'Main' );
    Which you can then retrieve in anything that uses the class:
    my $sth = My::User->sql_getUsers;
    $sth->execute(); # etc
    I like this approach because all the SQL lives in a separate module and doesn't mess up your main application. It also has some handy shortcuts like:
    $sth->execute([qw/value1 value2/], [\my $res1]);
    my %hash = $sth->fetch_hash;

    Now Class::DBI is a great abstraction layer if, like me, you don't like to play with SQL at all. An example:

    package User;
    use base 'Class::DBI';

    User->table('users');
    User->columns('All', qw/id name password/);
    User->columns('Primary', 'id');
    User->set_db('Main', 'dbi:mysql', 'gav', 'perlmonks');
    Now, like magic, you have a class that talks to the database and hides you from the horrors of SQL:
    my $user1 = User->new({ name => 'gav', password => 'monk' });
    my $user2 = User->retrieve(1023);
    printf "Id: %d, Name: %s, Pass %s\n",
        $user2->id, $user2->name, $user2->password;

    my $user = User->search(name => 'gav');
    if ($user && ($user = $user->next) && $user->password eq 'monk') {
        print "Welcome gav^!\n";
    } else {
        print "Schoo!\n";
    }

    This came out a bit longer than I expected, but I hope it might help somebody :)


Re: How to structure applications using a RDBMS
by toma (Vicar) on Jul 19, 2002 at 07:58 UTC
    One way to reduce the number of SQL functions is to model the database schema in the database.

    This approach uses tables to model the database table names, column names, datatypes, relationships, and the ordinality of the relationships.

    This approach allows many queries to be automatically generated. Instead of using a lot of little wrapper functions, AUTOLOAD a function that performs a query on the schema, generates SQL code, and runs the query.

    I haven't tried DbFramework but it looks like it does at least some of this and possibly much more.

    There is an article about this approach, although it describes a Java implementation of the idea.

    I used this approach on a database that has about 500 fully normalized tables. I use a GUI that is generated on the fly from the database metadata. There is also a database schema browser that is driven from the metadata. I would use this approach in the design of databases that have more than about 50 or 100 tables.
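Stripped of the AUTOLOAD plumbing, the generation step might look like this (the %schema hash stands in for the metadata tables, and all table and column names are invented):

```perl
use strict;
use warnings;

# Stand-in for metadata that would normally live in the database:
# table names, column names, and the key column for each table.
my %schema = (
    employees => { columns => [qw(id name dept)], key => 'id' },
);

# One generic generator replaces a pile of hand-written SQL_ wrappers.
sub select_by_key {
    my ($table, $key_value) = @_;
    my $meta = $schema{$table} or die "unknown table: $table";
    my $cols = join ', ', @{ $meta->{columns} };
    my $sql  = "SELECT $cols FROM $table WHERE $meta->{key} = ?";
    return ($sql, $key_value);  # hand these to $dbh->selectrow_array
}
```

With the metadata itself stored in tables, the same generator keeps working as the schema grows, and the GUI and schema browser can be driven from the same source.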


Re: How to structure applications using a RDBMS
by rdfield (Priest) on Jul 19, 2002 at 08:47 UTC
    To give a slightly different perspective, here's some thoughts from the DBA side of things.

    Programming Pros

    • It's nice from the programming point of view to have automatic code generators that insulate you from the SQL
    • Encapsulating business logic in Perl (or whatever language takes your fancy) modules makes things nice and easy for this project
    • Abstracting the SQL to a completely different module away from our nicely structured Perl keeps things looking nice and neat

    DBA Cons

    • Unfortunately this kind of code generation leads to structures like:
      my $sth1 = MYSQLMODULE::generate_result_set(
          TABLE   => "employees",
          COLUMNS => [qw(name position salary id)],
          WHERE   => "department = 'SALES'",
      );
      $sth1->execute();
      while (my @employee = $sth1->fetchrow_array) {
          my $sth2 = MYSQLMODULE::generate_result_set(
              TABLE   => "commission",
              COLUMNS => [qw(total_commission)],
              WHERE   => "id = $employee[3]",
          );
          $sth2->execute();
          my @commission = $sth2->fetchrow_array();
          print "$employee[0]($employee[1]) = $commission[0]\n";
      }
      # ...etc...
      which is highly inefficient, and should be replaced with a single SQL statement with a join.
    • It doesn't matter which language you code your business logic in; in 2 years' time the company will be using something else and someone will have to rewrite the business logic in today's latest fad language. You're best off writing it in the database's stored procedure language and letting everybody, no matter what language they code their applications in, be subject to the same, consistent business rules. If the rules change, the code only has to be changed in one place.
    • Programmers don't necessarily understand (or even care) about performance issues on a particular RDBMS. Nor should they - it's not their problem; that's what the DBA gets paid for. But a DBA having to wade through yards of code in Perl, C, C++, Java, Cobol, and Fortran - on the same day! - to find an offending piece of SQL is not a lot of fun (challenging, but not a lot of fun). Then trying to get the correct programmer to check out, edit, test, check in, rebuild, deploy, etc. the new application while production is on its knees and a swarm of managers is hovering with intent is, again, not a lot of fun (stressful, but not a lot of fun). Again: encapsulate the SQL in the database's stored procedures and the DBA will let you sleep in your cubicle.
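For the record, the nested-loop fetch in the first point would collapse into a single statement with a join (same hypothetical tables and columns):

```perl
use strict;
use warnings;

# One round trip instead of one commission query per employee row.
sub commission_report_sql {
    return q{SELECT e.name, e.position, c.total_commission
  FROM employees e
  JOIN commission c ON c.id = e.id
 WHERE e.department = ?};
}

# Usage against a real handle (sketched, not run here):
#   my $sth = $dbh->prepare(commission_report_sql());
#   $sth->execute('SALES');
#   while (my ($name, $pos, $comm) = $sth->fetchrow_array) {
#       print "$name($pos) = $comm\n";
#   }
```

The database's optimizer can then pick the join strategy, which is precisely the knowledge a per-table code generator throws away.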

    I haven't always adhered to these rules in the past. For instance, I've avoided stored procedures to keep an application as portable as possible, but then had to optimise the SQL so heavily for a particular platform to get any sort of performance that the application couldn't be easily ported anyway. Think of SQL the same way everybody in PM thinks of HTML (replacing the Web Designer with the DBA, of course): get it out of your application so you don't have to worry about it (make it SEP). If you haven't got a DBA up to the job, just /msg me... :)


      "All three of your points are well-taken," says a DBA who spends more time writing Perl than doing anything else, "but point three:

      * Programmers don't necessarily understand (or even care) about performance issues on a particular RDBMS. Nor should they - it's not their problem, that's what the DBA gets paid for

      is both true and not true. It's true that programmers don't always understand how the RDBMSs they use work internally (we have Teradata, Oracle, RedBrick, and Sybase in production, create a lot of MS Access items for people's desktops, and our developers have to work across them), but they should care and should be willing to learn. (They also should listen to the DBA when they don't know, but that's another rant.)"



        My point was specifically about each RDBMS' peculiarity when it comes to writing SQL for performance rather than functionality. I would expect a good programmer to have a broad knowledge of what functionality is available in each RDBMS they use on a regular basis, but I certainly wouldn't expect such a programmer to go to the lengths of understanding how each RDBMS optimises their SQL. For instance, I don't know DB2 at all, but I have it on very good authority that its optimizer changes with every release - keeping track of such things is surely not a job for a programmer. For another instance, a few years ago I was called in to a large company to fix the problems they were having with their Telephone Banking system - their daily batch update was taking 27 hours! (do the maths :) A couple of small tweaks later (changed an 'or' to a 'union all' and modified the column ordering in a couple of indexes) and the batch load time was reduced to 42 minutes. The point of that little story is that the original SQL was written by Oracle Corp consultants charging GBP150 p.h. - programmers who should know how the Oracle RDBMS works :)


Re: How to structure applications using a RDBMS
by simeon2000 (Monk) on Jul 19, 2002 at 13:06 UTC
    My personal favorite DB application design model is OOP. I usually end up making an object with the same field names as the database table to represent each object.

    pseudo:
    package Project::User;

    my @FIELDS = qw/name age sex passwd/;

    sub new {
        my $type = shift;
        my $dar  = shift or return undef; # data array ref, usually
                                          # a fetchrow_arrayref
        my %init = ();
        if (ref $dar eq 'ARRAY') {
            # they already pulled the data, we're just holding
            # it for display or an update
            @init{ @FIELDS } = @$dar;
        } elsif ($dar =~ /^\d+$/) {
            # if it's a unique ID, pull the data from the db
        } else {
            # it's just empty, we're empty
        }
        return bless \%init, $type;
    }

    sub pullFromDB {
        my $self = shift;
        my $uid  = shift or return undef;
        # sql code to pull from DB
    }

    sub dbSync {
        my $self = shift;
        # db commit stuff here
    }

    sub assignArgs {
        my $self = shift;
        # utility function to assign data values to object properties
    }
    ... ok that's a bit much to write out, but I hope it presents the basic ideas I've used. Usually the object is used like

    $user = new Project::User($uid);
    # or
    $user = new Project::User($sql->fetchrow_arrayref);

    It makes it very easy to operate on the front-end. Or perhaps it's just needlessly complex :P

Node Type: perlquestion [id://183012]
Approved by epoptai
Front-paged by RhetTbull