Perl: the Markov chain saw
I've just spent a gigantic amount of time wrestling with a problem, and since this is similar to other problems I've had in the past, and must be something that other people experience, I wanted to share it. Perhaps it's totally obvious, in which case I apologize for boring anyone reading. Maybe it should be in my use.perl journal. But if I were a professional programmer, I'd want to turn to the guy next to me and say, "You know what the problem was? It was ---". But I'm not, and I can't, so here I am.
I do a lot of stuff relating to web-database interactions.
When I started learning, I wrote absolutely everything by hand, and cut-and-pasted all over the place. It was tedious, and I made lots of mistakes, and wasted a lot of time. But I also knew that everything worked exactly the way I wanted it to.
One of the first things I picked up from somewhere was that when you checked your form parameters, one thing you should do is make sure there's something there--if it's blank, or just spaces, skip it. So, from the very start, all my programs had something in them like this:
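A check of that kind might look roughly like the sketch below. The `%params` hash and the `is_blank` helper are made up for illustration; they stand in for whatever the form handling actually provides.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a form value is missing, empty, or only whitespace.
sub is_blank {
    my ($value) = @_;
    return !defined $value || $value !~ /\S/;
}

# Stand-in for submitted form parameters (invented sample data).
my %params = (title => 'Dune', format => '', notes => '   ', year => '0');

my @kept;
for my $name (sort keys %params) {
    next if is_blank($params{$name});   # skip blank fields entirely
    push @kept, $name;
}

print "@kept\n";   # title year
```

Note that the test is for a non-whitespace character rather than plain truthiness, so a legitimate value of `0` survives.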
Eventually I wrapped this in a function, then as I learned more, a method, and in both cases I forgot entirely about it.
Recently I've been working on increasingly larger and more complicated projects, and the roll-your-own approach has been getting difficult. So I've started to use various systems for doing things--templating systems, database abstraction layers, etc. Which particular ones don't matter, as my point is not to criticize the way any one works.
In many ways these have made things much better--certainly a lot of things have become faster to get going. But I often find it very difficult to get things to work exactly as I want, and end up overriding much of whatever system I'm using. Still, on the whole I'm finding it an advantage, frustrating though it can be.
But I just ran into a problem that had me frustrated for days. In one database, I have some columns that are has_a lookups to another table; I add/edit these with drop-down menus. Some can be blank, so I added a blank row to my select generation--<select name="format"><option></option>. To my frustration, this did not give me an empty column; in my display, I'd have the field displayed--"format", or whatever--but with no value. This only happened for these drop-down menus, never with blank textfields or anything else.
The problem, it finally turned out, was that the abstraction layer I was using, which has all sorts of features to handle data validation, does not throw out blank values; it just sends them right through to the database. So if you leave a field blank, "" gets sent to the database. And most of the time, "" gets inserted into the database. Now, I know there's a difference between "" and undef in Perl, just like I know there's a difference between "" and NULL in my database. But staring at my raw database records, the lines with nothing (instead of NULL) next to them just didn't look wrong.
Now, the reason I didn't notice it until I added the select lists: My blank textfields, which also had "" inserted in them, were almost always VARCHARs or TEXTs in the database, so inserting "" would put "" into the database. My View class did DWIM, and wouldn't display a line if the value were "".
The select menus, on the other hand, were all INT values (they're all has_a relations, holding the PK of the lookup table). And when you insert the "" into an INT field, you don't get "", you get 0. And that, when run through the View class, is what showed the problem--the 0 "existed", and thus I'd get a blank line in the display (blank because the 0 didn't correctly index the lookup table).
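To see how a 0 can slip past a check that hides empty strings, here is a small sketch. The `should_display` function is invented, and it assumes the view compares against the empty string rather than testing plain truthiness (a truthiness test would hide 0 as well, since both "" and 0 are false in Perl).

```perl
use strict;
use warnings;

# Hypothetical view check: show a field when it holds anything
# other than undef or the empty string.
sub should_display {
    my ($value) = @_;
    return defined $value && $value ne '';
}

# Invented sample row: notes came back as "", format came back as 0.
my %row = (notes => '', format => 0, publisher => undef, title => 'Dune');

for my $field (sort keys %row) {
    my $state = should_display($row{$field}) ? 'shown' : 'hidden';
    printf "%-10s %s\n", $field, $state;
}
```

Under that assumption `notes` and `publisher` are hidden, while `format` is shown even though 0 matches nothing in the lookup table.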
I ended up writing a line to preprocess my parameters before handing them off to the database layer:
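A preprocessing step along those lines could be sketched as a single map/grep. This is only an illustration: `%params` and `%clean` are assumed names, and the sample values are invented.

```perl
use strict;
use warnings;

# Stand-in for the submitted form parameters (invented sample data).
my %params = (title => 'Dune', format => '', notes => '  ', year => '1965');

# Keep only parameters that contain at least one non-whitespace character.
my %clean = map  { $_ => $params{$_} }
            grep { defined $params{$_} && $params{$_} =~ /\S/ }
            keys %params;

print join(', ', sort keys %clean), "\n";   # title, year
```

Only `%clean` would then be handed to the database layer, so blank fields never reach an INT column.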
Maybe if I were a better programmer, I would have thought of this earlier. Or maybe if I had started in C. Perl usually DWIMs, and when I have a blank textfield, I don't mean that I want "" inserted into my database, I mean that I want to leave the field alone.
I don't even know if there's a lesson here. Maybe my initial tip that you should check for "" before putting things into your DB was too thorough, or I internalized it so fully that the possibility of doing it any other way never came up. Maybe I should just have been more aware that Perl doesn't do this for you automatically. Maybe my abstraction layer should do this for you in its validation routines. Maybe I should write every line of everything myself, so I know that it's doing what I want. Maybe I'm just a moron, and should leave the programming to the programmers.
Update: While writing this, I realized a further problem that, incredibly, has never come up: I don't want to skip the field if it's an edit, rather than an add, action. What if I have a field with a value in it that I want to remove? I can't blank it out, because then it will be blank, and my neat map above will just skip that field, and the original value will remain. On the other hand, if I instead map it to undef, I see that my abstraction layer will take it upon itself to skip the line! Rather than dealing with this now, I'm going to sleep.
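One possible way to handle the add/edit split is sketched below. Everything here is hypothetical: `prepare_params` and the `'add'`/`'edit'` action strings are invented names, and the approach only helps if the layer underneath passes undef through as NULL (plain DBI does this for bound values, but the layer described above apparently skips undef).

```perl
use strict;
use warnings;

# Hypothetical preprocessing step; $action is 'add' or 'edit'.
# On add, blank fields are dropped so the column keeps its default.
# On edit, blank fields become undef, meaning "clear this column".
sub prepare_params {
    my ($action, %params) = @_;
    my %out;
    for my $name (keys %params) {
        my $value = $params{$name};
        if (defined $value && $value =~ /\S/) {
            $out{$name} = $value;      # real value: always keep
        }
        elsif ($action eq 'edit') {
            $out{$name} = undef;       # blank on edit: ask for NULL
        }
        # blank on add: leave the field out entirely
    }
    return %out;
}

my %add  = prepare_params('add',  title => 'Dune', format => '');
my %edit = prepare_params('edit', title => 'Dune', format => '');

print exists $add{format}  ? "add keeps format\n"   : "add skips format\n";
print exists $edit{format} ? "edit clears format\n" : "edit skips format\n";
```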