Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI

by princepawn (Parson)
on Nov 28, 2000 at 23:38 UTC ( [id://43748]=perlquestion: print w/replies, xml ) Need Help??

princepawn has asked for the wisdom of the Perl Monks concerning the following question:

In terms of returned datatype, DBI supports three ways of returning data:
@row_ary = $sth->fetchrow_array; $ary_ref = $sth->fetchrow_arrayref; $hash_ref = $sth->fetchrow_hashref;

For the purposes of this discussion, it is a pragmatic simplification to say that you can either get an array ref or hash ref back for each row of a table.

When database fields are removed or switched in position, I call these destructive or positional field mutations. It is clear that Perl code based on array ref fetches will break in the face of such table mutations. On the other hand, hash ref fetches will not fail in this case.

When database field names are changed, I call this a labeling mutation. In this case, array ref fetches are robust to the mutation while hash ref fetches will break.

So, the dilemma which pops up when deciding on how to structure one's objects (as hashrefs or array refs) again rears its ugly head when fetching data.

And so the topic of discussion is: How does one write Perl code such that it can work without modification in the face of labeling, destructive, or positional field mutations?

Replies are listed 'Best First'.
Re (tilly) 1: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI
by tilly (Archbishop) on Nov 29, 2000 at 00:51 UTC
    You know, DWIM is nice and all, but at some point you need to step back and accept that some things carry real meaning and should not be changed lightly.

    Positions move around dynamically. Naming things is more robust to future changes than remembering positions. Hence the value of hashes.

    Beyond that if you want more flexibility in the face of changes to names, you need to add a level of indirection to the naming. An indirection layer, whether it is a hash lookup in your code from external names to internal names used in code (so you only need to change one thing to external changes), or some sort of meta-table from which you drive other tables, gives you more protection. Remember that adding a level of indirection can solve all problems except the problem of having too many layers of indirection!

    However worrying about piling layers upon layers of indirection on right away, before you have concrete examples to work with, is a warning sign for a design process run amock with people who probably should not be designing. (For some reason OO lends itself far too naturally to this sort of problem.) You need to strike a balance between practicality and theory. And you need to accept that it is the nature of the beast that often good generalizations can only become apparent in the light of practical experience.

    You know, good design is not about following some abstract rules. Polymorphism isn't good in and of itself. Abstraction is not an ideal goal. Modularity is not the measure of all code. Instead the rules are abstractions of lessons learned by hard experience about things that have practical value.

    One thing that I see about you is this tendancy towards memorized facts. You seem to have little understanding of principles. Sure, eliminating repetitive typing is good. Do you understand why? Do you understand why, despite that principle, it is better to put all of your functions in @EXPORT_OK instead of @EXPORT and force people to repetitively type what functions they want? (So much so that Exporter's documentation specifically recommends that?)

    You mentioned once that you don't like living with your own designs. Having seen the questions you ask, I can understand why you would not. Take this question. You want to produce the ultimate polymorphic doo-dad. Why? I suspect that you would be better off putting a layer of indirection (say through a hash that you eventually populate from a lookup table) between the names in the tables and the labels on the front end, then try to choose good names for both. If the labels change, you have to retype something. Minor ouch. Most changes will be pretty easy to make though, so it isn't a big deal. (Make sure it crashes pretty obviously on bad names, best to find that oops quickly.) But if someone else wants to read your code, they will be able to figure out what is going on. This is a Really Big Deal. And most importantly, you won't get into a situation where someone makes a change, the program guesses what they meant and gets it wrong, and nobody finds out for 6 months. Do you understand why this is important to prevent?

    OK, I won't continue beating around the point. My point is that I would design this with a layer of indirection so that your code can be easily altered to handle a new world order, but should fall apart in a very noticable way if names change and no human intervenes. IMNSHO the way you asked your question shows a lack of awareness of basic principles of what I would call "good taste".

Re: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI
by davorg (Chancellor) on Nov 29, 2000 at 00:41 UTC

    Surely this is only a problem if you use select * from tab1 and I can't believe that anyone would be stupid enough to do that in production code.

    For exactly the reason that you discuss, you should always use select col1, col2 from tab1

    The problem is not with Perl or DBI, but with your usage of SQL.

    --
    <http://www.dave.org.uk>

    "Perl makes the fun jobs fun
    and the boring jobs bearable" - me

Re: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI
by Dominus (Parson) on Nov 29, 2000 at 00:08 UTC
    Says princepawn:
    > How does one write Perl code such that it can work
    > without modification in the face of labeling,
    > destructive, or positional field mutations?
    It's going to take a lot of effort to do it in Perl or in any other language. Here's a small table with only two columns:
    ID1 ID2 ----------- 119 666 437 112
    Now switch the order of the two columns, and change the names so that ID1 is now called ID2 and ID2 is now called ID1:
    ID1 ID2 ----------- 666 119 112 437
    Now how are you going to write a program that magically copes with that? This is not a Perl problem; it's an impossible problem. Even looking at the database schema is not going to work.

    The only solution is for the database itself to carry its own description. One way is to install a special row into each table:

    ID1 ID2 ----------- "id2" "id1" ... ...
    then the program can select this row, examine it to see whether the column names or the order have changed, and it can make the appropriate allowances.
Re: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI
by extremely (Priest) on Nov 29, 2000 at 07:02 UTC
    OK, from the DBAdmin I used to work with, here are the rules:
    • In a select, "*" had better have count() around it.
    • If you change a field to be more restrictive, you had better be sure your code checks for insert errors.
    • If you make a field less restrictive, your select code had better be robust.
    • Change the order of the fields? WHY? It's a database, not a text file!
    • If you change the name of a field or delete one, change the name of the table. You can leave a view behind if you don't want to fix your code.

    Really that last one is the real answer. Write your code to views, not tables.

    If the tables are still in flux, what are you doing writing production level code? Worries about posistion are naive, views and clean selects fix that at the database level. Your code depends on the DB, if you can't trust the DB to behave, you are sunk, period.

    The short answer to your problems is: select from views and with field names, don't try to fix your DB mistakes with code, forget what order your fields are in, finish the DB design before you write final code.

    You might also try factoring out the DB access into a nice local module so you can move common actions into a central place where you can change them just once. That way, at least you can protect the rest of your code from the changes by interposing a code layer.

    --
    $you = new YOU;
    honk() if $you->love(perl)

Re: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI
by runrig (Abbot) on Nov 28, 2000 at 23:48 UTC
    When database fields are removed or switched in position, I call these destructive or positional field mutations. It is clear that Perl code based on array ref fetches will break in the face of such table mutations. On the other hand, hash ref fetches will not fail in this case.
    First, switching field positions in a table is usually not a good idea, but should only be a problem when you 'select *' instead of selecting explicit fields. Some say that you should always explicitly select fields, and for efficiency, you should only be selecting the fields that you need anyway. I think selecting into a hashref is still slow, so I'd only do that if speed was not an issue.
      First, switching field positions in a table is usually not a good idea

      I agree that it may not be a good idea but in the prototyping stages of a project anything including switching field positions can happen.

      And the idea is to hope that the code itself can remain relatively invariant in the face of the design changes that can occur almost daily when a new project is underway.

        But if you select by fields rather than by * as he said, changing the order in the table shouldn't affect your code at all.
Re: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI
by chipmunk (Parson) on Nov 28, 2000 at 23:44 UTC
    I think the more apropos question would be:

    How does one write SQL statements such that they can work without modification in the face of labeling, destructive, or positional field mutations?

    If you can do that, I would argue that you don't have to worry about the Perl/DBI code.

      Yes, because it's probably working by mind control.

      Uh, more seriously, the only way that you could reliably (and this probably needs several levels of qualification) avoid this breaking your code would be to interrogate the database prior to each select (and this still wouldn't save you in the event that the database is changed in the time between your first 'interrogation' select and your second 'data' select).

      What you'd need to do is either:

      1. Use DBI to return a list of the table's field names (this would help you determine if your hashref query would break) -- I'm not sure if this is even possible, but the $dbh->table_info looks like it might offer something of value
      2. Create a table of 'meta' information -- one whose field names might be something along the lines of table_name, column_name, column_type and so on. If you are writing a complete interface for an extensible system then this would probably be a much better way to go because you can force people to use your scripts to modify the tables and, at the same time, update the meta information in the meta table. That way, your scripts could turn to a reliable source for information about the tables they are querying while still permitting the tables to change.

      Obviously, I lean towards the latter solution, but it requires that you be able to restrict modifications to the tables to the tools that you provide.

        You could use idea 1 and use the sql statement 'describe table' then you could checksum that output and if the checksum changes, die and don't run until the program is updated. This could apply if other people are likely to change column names and you don't want your program to try to run in an unknown situation.

        Assuming you have already prepared and executed statement handle $sth, $sth->{NAME} will return an array ref of the column names, so

        my @columns = @{$sth->{NAME}};

        gives you an array of your column names.

        --------------
        www.theduttons.com
Re: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI
by quidity (Pilgrim) on Nov 29, 2000 at 00:20 UTC

    If you expect to have to access a database with colums that change then it's time for you to embrace a buzzword or two.

    Essentially you need to tuck away all the bits of your code which actually talk to the database. Make a module with an OO interface, ensure that remains constant in time, then any changes in the database structure (or even the method of actually storing the data) become moot.

    Then to get at some data all you'd need to do is:

    @values = $abstracted_db->beans_with_colour('red');

    Now, doing that might take a bit of thought, but in the end it will pay off. The main drawback is that it slows you down a bit, but when you talk to databases, you're going to be fairly slow anyway.

Re: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI
by wardk (Deacon) on Nov 29, 2000 at 01:02 UTC

    As my Oracle and SQL Server DBA's have told me...Do not assume that column positions will ever be correct. This is especially good advice with INSERTS where assuming position will just flat out get you in trouble eventually, especially if the table structure is known to be dynamic (Simply adding a new column will void a working positional insert). In fact I believe SQL Server is guaranteed to NOT be reliably accessible by column "position".

    As far as these type of changes. It's not very often that "labels" are changed on any real database system I've ever dealt with. At least not after they've gone into Production, and not without notice to the development teams using them.

    Hopefully you are not at the mercy of someone modifying your tables without the benefit of at least an informal "heads up".

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://43748]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (5)
As of 2024-03-28 17:28 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found