http://www.perlmonks.org?node_id=319845

liz has asked for the wisdom of the Perl Monks concerning the following question:

This is not directly a Perl-related question, but indirectly it might evolve into a very useful Perl module for users of DBI. Or maybe what I want already exists and I have just not been able to find it.

What I'm looking for is a Perl module that will take two SQL CREATE TABLE definitions (in the same dialect of SQL) and find out whether they differ in functionality. If and only if they are different, a method taking a database handle should make the necessary changes to the table in question in a transparent manner.

An example (in MySQL). Before:

CREATE TABLE foo ( id int unsigned primary key, name varchar(255) )

After:

CREATE TABLE foo ( id int unsigned primary key, name varchar(255), status tinyint unsigned )

Would execute:

ALTER TABLE foo ADD status tinyint unsigned
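
To make that more concrete, here is a rough sketch of the interface I have in mind. Everything except DBI is hypothetical: the module name (SQL::SchemaSync) and its methods don't exist, this is just the usage I'm after.

use strict;
use warnings;
use DBI;
use SQL::SchemaSync;    # hypothetical module name

my $dbh = DBI->connect( 'dbi:mysql:test', 'user', 'pass',
                        { RaiseError => 1 } );

# The "after" definition; the "before" is whatever the table currently
# looks like in the database.  If they differ, the module should apply
# the necessary ALTER TABLE statements through $dbh.
my $sync = SQL::SchemaSync->new( dialect => 'MySQL' );
$sync->update(
    dbh    => $dbh,
    target => 'CREATE TABLE foo ( id int unsigned primary key,
                                  name varchar(255),
                                  status tinyint unsigned )',
);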

It seems that Alzabo is doing something like this under the covers. But I want to have this as a stand-alone module.

Class::AutoDB seems to offer some ideas in this direction as well, but its authors consider automatic updates to be overkill for their application of the module.

Other modules that seem to go in this direction are DBIx::DBSchema, SQL::Schema and DB::Introspector. They all seem to be about abstracting table definitions, but not about updating the actual table schema in the database itself.

Some of my questions:

Any feedback will be very much appreciated.

Liz

Re: Automatically altering tables from different schema's
by perrin (Chancellor) on Jan 08, 2004 at 16:42 UTC
    I think you should look at http://sqlfairy.sourceforge.net. It knows how to parse SQL into a data structure which should be fairly easy to compare, and can generate SQL from a data structure you build.
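
    For example (an untested sketch, assuming SQL::Translator's MySQL parser), turning a CREATE TABLE statement into a schema object that you can walk and compare looks roughly like this:

    use SQL::Translator;

    my $sql = 'CREATE TABLE foo ( id int unsigned primary key,
                                  name varchar(255) )';

    my $t = SQL::Translator->new( parser => 'MySQL', producer => 'MySQL' );
    $t->translate( \$sql ) or die $t->error;

    # Walk the parsed schema -- comparing two of these structures is much
    # easier than comparing the raw SQL text.
    for my $table ( $t->schema->get_tables ) {
        for my $field ( $table->get_fields ) {
            printf "%s.%s %s\n", $table->name, $field->name, $field->data_type;
        }
    }
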
      I think I will need to get SQL::Translator and start providing some patches ;-). Thanks for the tip; that was the module that I missed when doing my research into this area.

      Liz

Re: Automatically altering tables from different schema's
by mpeppler (Vicar) on Jan 08, 2004 at 16:41 UTC
    I haven't seen anything like that before, and there have been times when I could have definitely used something like it!

    If I were to write that sort of tool, I would probably try to use the table definitions as returned by the database servers themselves rather than a SQL script, as the servers will return the data in a more regular fashion (easier to parse), and in some cases (e.g. Sybase) you don't have to parse anything at all: you just query the system tables and get the necessary data.

    Second, I would not go and directly update the target database. Instead I'd create a SQL script containing the ALTER TABLE statements, so that it can be checked and the update operation run in a controlled manner.
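
    Putting both points together, a minimal sketch of what I mean, assuming the DBD driver implements DBI's column_info() and hard-coding the target column list for the foo example:

    use DBI;

    my $dbh = DBI->connect( 'dbi:mysql:test', 'user', 'pass',
                            { RaiseError => 1 } );

    # What the table should look like (normally this would come from
    # wherever the "after" definition lives).
    my %want = (
        id     => 'int unsigned',
        name   => 'varchar(255)',
        status => 'tinyint unsigned',
    );

    # What the server says the table looks like right now.
    my %have;
    my $sth = $dbh->column_info( undef, undef, 'foo', '%' );
    while ( my $col = $sth->fetchrow_hashref ) {
        $have{ $col->{COLUMN_NAME} } = 1;
    }

    # Write the changes to a script for review instead of running them.
    open my $out, '>', 'upgrade_foo.sql' or die "upgrade_foo.sql: $!";
    for my $col ( sort keys %want ) {
        print {$out} "ALTER TABLE foo ADD $col $want{$col};\n"
            unless $have{$col};
    }
    close $out;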

    Michael

      Instead I'd create a SQL script containing the ALTER TABLE statements, so that it can be checked and the update operation run in a controlled manner.

      I'm going for ease of use, so any changes would be "automatic" (with maybe the option of not deleting columns automatically). But there would be an API to get at the primitives, so that you can indeed do what you propose.

      Liz

        Liz,

        That is exactly what we do at my place of employment. We keep a text file with a list of tables that have changed. Our script gets the tables from source control (CVS) and then parses each table file (e.g. Table1.TAB) for column names. If a column name doesn't exist on the destination, an ALTER TABLE command is executed to add the missing columns.

        Our updates are done directly to the DB. Whether you apply the changes directly to the DB as you want to, or output a SQL script as mpeppler suggested, you'll probably want to make sure that either way the script(s) are re-runnable.
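
        For the re-runnable part, a minimal sketch (assuming a connected $dbh and a driver that supports DBI's column_info()): check whether the column already exists before issuing the ALTER, so running the script twice is harmless.

        sub add_column_if_missing {
            my ( $dbh, $table, $column, $definition ) = @_;

            # Skip the ALTER if the column is already there.
            my $sth = $dbh->column_info( undef, undef, $table, $column );
            return if $sth && $sth->fetchrow_hashref;

            $dbh->do("ALTER TABLE $table ADD $column $definition");
        }

        # Safe to run as often as you like:
        add_column_if_missing( $dbh, 'foo', 'status', 'tinyint unsigned' );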

        It is not often that we DROP or modify columns. For us, that has to be done another way: our DBMS replication can't handle column DROPs or UPDATEs without intervention.

Re: Automatically altering tables from different schema's
by VSarkiss (Monsignor) on Jan 08, 2004 at 17:50 UTC

    I can't give you any pointers to specific code, but I want to point out that the problem is more subtle than you may think. For example, take your second foo table. Is it different from this:

    create table foo ( status tinyint, id int primary key, name varchar(255) )

    As a human, you can see that the difference is mainly in column order, but also the id and status columns are not qualified as "unsigned". The question is, do these differences matter to you?

    If you're a programmer and you have well-written SQL, then column order in a table does not matter. On the other hand, if you're a DBA, and you're managing a million-row table with varchar(4000) columns, then column order does matter.

    It turns out that identifying the differences and overcoming them is mostly the same problem. Even if you can't make the change with an ALTER statement, you could always create a new table and load it from the old one, then rename the results. But the crux of the problem remains: what do you consider a significant difference? Once you've figured that out, the right tool to use will be easier to select.

    HTH

      I want to point out that the problem is more subtle than you may think
      Indeed. The difference between two versions of a table can be missing columns, which is fairly easy to manage, or more subtle things like NULL/NOT NULL, primary key, constraints, and a host of other things.

      Consider Sybase's CREATE TABLE syntax, for instance.

        I was going to go for MySQL and SQLite, because those are the itches that I need to scratch. Others will have to add their favourite database drivers themselves ;-)

        Liz

      ...the difference is mainly in column order...

      I was thinking of not considering these significant. Well, not in the initial version (I was thinking hashes rather than diffs ;-).

      ...but also the id and status columns are not qualified as "unsigned"...

      I would consider those significant. Database engines might just change a flag in the format instead of updating each field: that's their business.

      ...you could always create a new table and load it from the old one, then rename the results...

      That would be the database-independent fallback. But since renaming a table is not standard SQL (is it?), it would involve the following steps (there's a rough DBI sketch of them after the list):

      1. create a temporary table
      2. dump the table data into it
      3. drop the original table
      4. re-create the table with the new schema
      5. copy the data back over
      6. drop the temporary table
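
      A rough DBI sketch of those six steps for the foo example (MySQL-flavoured SQL, no error handling, assuming a connected $dbh with RaiseError set):

      # Steps 1 and 2: create the temporary table and dump the data into it.
      $dbh->do('CREATE TABLE foo_tmp AS SELECT * FROM foo');

      # Step 3: drop the original table.
      $dbh->do('DROP TABLE foo');

      # Step 4: create the table again with the new schema.
      $dbh->do('CREATE TABLE foo ( id int unsigned primary key,
                                   name varchar(255),
                                   status tinyint unsigned )');

      # Step 5: copy the data back over (new columns get their defaults).
      $dbh->do('INSERT INTO foo (id, name) SELECT id, name FROM foo_tmp');

      # Step 6: drop the temporary table.
      $dbh->do('DROP TABLE foo_tmp');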

      Liz

Re: Automatically altering tables from different schema's
by Abigail-II (Bishop) on Jan 08, 2004 at 18:01 UTC
    Years ago, I made a program that compared database objects (not just tables), and reported the differences (column names, column types, order in which columns appeared, constraints, presence of triggers and indices, etc). But that was for Sybase, and there all the information is available in SQL tables, making it rather easy.

    I could also compare the SQL object creation code with what was present in the database. I did not attempt to parse the SQL using a Perl program - I simply loaded the object into another database, and compared the newly created object with the object already there.

    Unfortunately, I don't have easy access to the code anymore.

    Abigail

Re: Automatically altering tables from different schema's
by Anonymous Monk on Jan 08, 2004 at 19:43 UTC
    If and only if they are different, then a method taking a database handle should make the necessary changes to the table in question in a transparent manner.
    Do you mean you don't want to touch the table if the definitions are already identical? Why not just assume the table definitions are different? Once you have a perl script that can produce the SQL to recreate the table, save off the data, drop the table and recreate it, and then reload the data.
      ...Why not just assume the table definitions are different...

      I would let that be up to the developer: either she uses the current table definition, as retrieved from the database, as the "original", or an "original" definition that was stored somewhere else.

      The latter would allow you to keep hidden fields in a table, invisible to your application. This could be handy if you have two "views" on the same table.

      Liz

Re: Automatically altering tables from different schema's
by adrianh (Chancellor) on Jan 08, 2004 at 22:52 UTC

    Sounds like a darn fine idea :-) I'd certainly use it since I'm trying more agile database techniques these days.

    I'm with mpeppler and would prefer to grok the table information from the database rather than dealing with SQL statements directly.

    It would also be nice to easily be able to create the reverse operation of any set of changes so you can easily create rollback scripts.
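
    A toy illustration of what I mean, for the simplest case only (a plain column ADD; reversing type changes or constraint changes would need much more care, and probably the old definition too):

    # Given a forward ALTER that adds a column, produce the statement
    # that undoes it.
    sub rollback_for {
        my ($change) = @_;
        if ( $change =~ /^ALTER\s+TABLE\s+(\w+)\s+ADD\s+(\w+)\b/i ) {
            return "ALTER TABLE $1 DROP COLUMN $2";
        }
        die "don't know how to reverse: $change\n";
    }

    print rollback_for('ALTER TABLE foo ADD status tinyint unsigned'), "\n";
    # prints: ALTER TABLE foo DROP COLUMN status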

Re: Automatically altering tables from different schema's
by sshfr (Beadle) on Jan 11, 2004 at 18:04 UTC
      Nice idea. Probably. Possibly.

      But I can't take a distribution seriously which:

      • is now over 3 years old at version 0.33
      • still has no documented API, not in POD, not on the website (at least that I could find)
      • hardly has any internal documentation
      • has failures in 5 out of 6 test reports listed with CPAN Testers.

      In short, MySQL::Diff has just dropped off of my shortlist of modules to check out.

      Liz

        You're totally damn right. I just found this one today on CPAN, and after a short look inside I think I'll not try to use it...

        Instead I'll go read the SQL::Translator documentation ;-)