Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

Say I have the following normalised structure:
TABLE 'People': NAME ID --------- Alice 1 Bob 2 TABLE 'Qualifications': QUAL ID --------- MA 1 MSc 2 PhD 3 TABLE 'people_who_have_qualifications': PERSON QUAL 1 1 # Alice has an MA 1 3 # Alice also has a PhD 2 2 # Bob has an MSc.
Now say I have a web interface to edit/update those relationships, which brings all of a person's details into one form.

When Bob gets his PhD or I realise Alice's MA was a data entry error and should be a BA, I need to edit their record and save in a way that updates that 'people_who_have_qualifications' table.

The simplest way to do that, it seems to me, is to just delete everything and then re-enter it. First pass, Alice has no qualifications at all. Then, second pass, we read off the CGI param()s we got from our checkboxes and put her qualifications back in.

Now, it's kind of inefficient, because every time we edit Alice's record, we're deleting everything and re-inserting it. And of course if the network/db server/script hits a problem between the first and second pass we've got missing data.

But if I don't do that I have to tiresomely look up and compare every value already in the db with the values in the CGI param()s. "Is this value there in the db and in the form? Leave it alone. Is it there in the db but not in the form? Delete it. Is it in the form but not in the db? Insert it." Lots more database activity in total, lots more scripting, etc.

I guess what I'm asking is, Is the "delete all, re-insert" method good laziness, la Larry, or bad?

Nobody says perl looks like line-noise any more
kids today don't know what line-noise IS ...

Replies are listed 'Best First'.
Re: OT: updating database question
by revdiablo (Prior) on Jul 22, 2007 at 07:58 UTC

    I'd opt for another approach than the two you outlined. I would change the form to submit imperative commands that match what the user actually did, rather than a copy of what the end result should be.

    By way of example, if you want to remove Alice's MA, you check a "delete this qualification" box. Your form is then telling the handler that you want to delete the MA, and you don't have to do a tedious backend comparison, nor the expensive rebuild everything from scratch.

      This also has the added benefit of differentiating why the delete took place. At some point in the future, the user may require an audit trail and will want to know that a new degree was added or that a mistake was corrected.
Re: OT: updating database question
by Conrad.Irwin (Novice) on Jul 22, 2007 at 20:27 UTC
    With the effort you have gone to to normalise your data, I dont think that the laziness of rewriting your data each time is good laziness. It could indeed be dangerous - because a user could accidentally change the data on the form and submit an error. By rearranging your form so that people input changes only, perhaps by using a dropdown to select the old name and new qualification, or a text box to enter a new name and a new qualification. If you are using SQL then you could update it will an UPDATE query something like  UPDATE person_qual SET qual = (SELECT id FROM qual_id WHERE qual = 'MA') WHERE person IN (SELECT id FROM name_id WHERE name = 'Alice'); If you are manually implementing your own database with text files, you will probably have to rewrite the file each time, though you could perhaps just append a line each time - thus allowing you to undo changes easily.
Re: OT: updating database question
by graff (Chancellor) on Jul 23, 2007 at 02:58 UTC
    Maybe this wouldn't be absolutely necessary, but it might save you some effort if you add an auto-increment row-ID field on the "people_who_have_qualifications" table. That way, when a user asks to see all the qualifications for Alice and might decide to make changes, you get a set of distinct row-ID handles for the records that currently exist as you provide these rows for review.

    The user can then select one or more particular rows for update or deletion, and you can do the specified update or deletion directly on the selected rows, using the row-ID that you've kept track of in the UI. (The user doesn't need to see the row-ID -- you just need to keep track of it.)


      Ack! Surrogate keys gone wild!

      Seriously. My first thought is to wonder why the main tables use surrogate keys at all. In any case, adding a surrogate key to the third table doesn't necessarily add value.

      If you want, say, to see all the qualifications Alice holds, you simply constrain the query to get them. The combination of name and qualification uniquely identifies the row already -- no surrogate key needed, along with its care and feeding.

Re: OT: updating database question
by Cody Pendant (Prior) on Jul 23, 2007 at 23:37 UTC
    Thanks for all the interesting responses so far.

    It's made me see, at the very least, that my main focus in this project has been combining every possible action into one page/form, centred on a person. I had a strong negative reaction to the idea of a separate "edit this person's qualifications" form -- I want it all in one place! So if nothing else, this thread has helped me define my thinking.

    Nobody says perl looks like line-noise any more
    kids today don't know what line-noise IS ...
Re: OT: updating database question
by poqui (Deacon) on Jul 23, 2007 at 21:18 UTC
    I think your problem could be solved by modeling time in your model.

    There is something called Data Vault that I ran across a couple of years ago, and it handles not only time quite well, but changes to the data model as well.

    It gives you the facility to capture the relationship as it existed at a time, and you can add a "reason" also, to track why it changed.