PerlMonks  

Defensive Programming

by George_Sherston (Vicar)
on Jan 14, 2002 at 18:53 UTC ( #138576=perlmeditation )

In this thread I compared $hash{'key'} with $hash{key}, suggesting that the former was slightly superstitious. I still hold to the view that it's not something I should feel obsessively obliged to do. (And I don't! Free at last!) But I was interested to read nodes by other monks who felt it was a sensible precautionary step. In particular, I liked mstone's description of it as "defensive programming".

Just now I wrote the following code:
    for (@$Courses) {
        if ($Course = CheckCourse($_)) {
            print PageCourseInfo($Course);
            $count ++;
        }
        else {
            last;
        }
        last if $count >= 10;
    }

Now, note the last line in the loop. Why not just plain old last if $count == 10;? Is it ever possible for $count to be greater than ten without first being equal to ten? No. I can't think of any possible change I might make in the future to this code that would mean that it would fail to exit with the simpler test.

But that doesn't mean there could never be. It's costless to replace == with >=, so I do it.
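
To make the worry concrete, here is a minimal sketch of one way the simpler test could be skipped. It assumes a later change makes the counter advance by more than one at a time; run_loop and its $limit safety cap are hypothetical names invented for this illustration, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: count upward from 0 in steps of $step and stop when
# the supplied test says so. $limit caps the loop so a missed exit is visible
# instead of running forever.
sub run_loop {
    my ($step, $test, $limit) = @_;
    my $count      = 0;
    my $iterations = 0;
    while ($iterations < $limit) {
        $count += $step;
        $iterations++;
        last if $test->($count);
    }
    return $iterations;
}

my $equals   = sub { $_[0] == 10 };
my $at_least = sub { $_[0] >= 10 };

print run_loop(1, $equals,   1000), "\n";   # 10
print run_loop(1, $at_least, 1000), "\n";   # 10
print run_loop(3, $equals,   1000), "\n";   # 1000: 3, 6, 9, 12 ... never equals 10
print run_loop(3, $at_least, 1000), "\n";   # 4
```

With a step of one the two tests behave identically; only the >= version still exits once the step changes.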

Now, why do I think of *this* as being defensive programming, whereas quoting hash keys feels like a superstition? Well, I don't really know. But one thought I have is that the hash key quoting defends against changes made to that line only. If I ever do alter the hash key to add some whitespace, or make it a variable or whatever, then I'll be right there. Nothing I do anywhere else can make any difference. On the other hand in my loop, there's a whole load of different things I could do, either by altering code at other places in the loop, or by cutting and pasting bits into other code, that could make my last test not work.

Perhaps a way of thinking about defensive programming is that it is about making every line or expression extensible. Not just loosely coupled functions, but loosely coupled conditions, evaluations and assignments, that can work in the widest possible range of circumstances.

I should be most interested to know other monks' practices when it comes to defensiveness. If it's costless and it makes the world a safer place, then why not?

George Sherston

Replies are listed 'Best First'.
Re: Defensive Programming
by demerphq (Chancellor) on Jan 14, 2002 at 19:52 UTC
    Hmm. I would say that there are much better defensive programming moves that you can make with your snippet.

    You posted

        for (@$Courses) {
            if ($Course = CheckCourse($_)) {
                print PageCourseInfo($Course);
                $count ++;
            }
            else {
                last;
            }
            last if $count >= 10;
        }

    Which I would probably not write in the same way at all. First, you are using an assignment in a conditional. While this is not wrong, personally I think it's a bad call most times, the reason being that it's easy to think that you've made the "=" instead of "==" typo. Second, when writing an if/else block it's usually a good idea to put the smaller block first, changing the condition as necessary. Third, instead of doing your >= you could have been even simpler and used <. Anyway, here's how I would have written the same thing:

        for (@$Courses) {
            my $Course = CheckCourse($_);
            last unless $Course;
            print PageCourseInfo($Course);
            last unless ++$count < 10;
        }

    Yves / DeMerphq
    --
    When to use Prototypes?

      Rather than just saying "it's usually a good idea to put the smaller block first", I like to think of it as shortening the main-line logic. Whenever one side of an if-else winds up being a short-circuit exit (last, next, return, etc), I find it's much cleaner to just put that by itself to show that I'm figuring out how to do the short-circuit. Then, the else disappears and you've managed to reduce the indentation level by 1. In my book, that's always a Good Thing™.

      All of which is probably a fancy way of saying that the rewrite (above) looks good, I'm just disagreeing (maybe) on why it's better. :-)

        Oh I agree completely. I actually originally put a line like "and in perl you have even better options with the modifiers that are available" but it didn't quite fit so I took it out. Perl modifiers and control flow keywords are an excellent way to minimize excessive indentation and containment problems.

        Nevertheless I do think that "the smaller part of a conditional should go first" is a correct rule of thumb. If you have a 4 line block and a 40 line block then the 4 liner should go first. This means the maintainer can see all of one and part of the other on one page, whereas with the long block first the maintainer can only see part of one block, and may not even notice the presence of the smaller one.

        Yves / DeMerphq
        --
        When to use Prototypes?

Re (tilly) 1: Defensive Programming
by tilly (Archbishop) on Jan 14, 2002 at 19:02 UTC
    In answer to your question, the answer is that it is quite possible.

    You are using global variables with generic names. The simple act of using strict.pm and lexically scoping variables is a more effective way of being defensive than making minor syntax changes which you think should make a difference.
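
    A minimal sketch of what that could look like for the loop in question. The bodies of CheckCourse and PageCourseInfo are hypothetical stand-ins (the originals were not posted), and print_courses is a name invented here.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the original CheckCourse and PageCourseInfo.
sub CheckCourse    { my ($id) = @_;     return $id ? { id => $id } : undef }
sub PageCourseInfo { my ($course) = @_; return "Course $course->{id}\n" }

sub print_courses {
    my ($courses, $max) = @_;
    my $count = 0;          # lexical: nothing outside this sub can touch it
    my @pages;
    for my $id (@$courses) {
        my $course = CheckCourse($id) or last;   # stop at the first bad course
        push @pages, PageCourseInfo($course);
        last if ++$count >= $max;
    }
    return @pages;
}

print print_courses([1 .. 25], 10);   # prints "Course 1" through "Course 10"
```

    Because $count and $course are declared with my inside the sub, no other part of the program can change them behind the loop's back, and strict turns a mistyped variable name into a compile-time error.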

Re: Defensive Programming
by dmmiller2k (Chaplain) on Jan 14, 2002 at 20:54 UTC
    "Why not just plain old last if $count == 10;? Is it ever possible for $count to be greater than ten without first being equal to ten?"

    Perhaps. Perhaps not. However, last if $count >= 10; (if used at the bottom of a much larger amount of code than in your example), communicates the intent far better than the '==' version, and is, as you say, more or less free.

    In the early days, we were far less P.C. We called it idiot-proofing (where the users were other programmers).

    Actually, I still think of it that way :).

    Update: ++ demerphq for succinctly rewriting your example code. Looks much closer to the way I'd write it.

    dmm

    If you GIVE a man a fish you feed him for a day
    But,
    TEACH him to fish and you feed him for a lifetime
Re: Defensive Programming
by dws (Chancellor) on Jan 15, 2002 at 02:53 UTC
    Why not just plain old last if $count == 10;? Is it ever possible for $count to be greater than ten without first being equal to ten? No. ... It's costless to replace == with >=, so I do it.

    Trade-off time.

    One problem with substituting >= for == is that you can mask catastrophic failures.

    Murphy being omnipresent, it is possible for an otherwise incremented-by-one variable to suddenly take on an outrageous value. This happens more often by errant maintenance changes or some obscure code path than by cosmic rays. If that happens, you probably want to halt the program, rather than quietly ignoring the problem and going on.

    This begs the question of whether it is worth adding extra sanity checking code before every reference to a variable. Maybe so, usually not. Some kinds of catastrophic failures are so catastrophic that you're going to blow up in short order anyway. And the extra code overhead can be a hit on performance.

    In some languages, this kind of defensive programming is done with a conditionally compiled ASSERT mechanism, which can be turned off during deployment. Writing

        ASSERT(0 < $count && $count <= 10);
        last if $count == 10;

    covers both the case of a normal fuse burning down and the abnormal event of the fuse suddenly going out of range.

    I often do this when writing complicated logic in classes, where an object's instance variables (or Perl equivalent) can possibly be side-effected by an intervening method call. (Someone once thanked me for being defensive two years after I left a company. They'd been doing maintenance work to extend some stuff I'd done, and ASSERT code kept them out of trouble.)

    In Perl, I'll sometimes add asserts during development, then comment them out when the code is stable. E.g.,   #die "assert" unless @{$p->{'children'}} > 0; This doubles as a "real" comment.
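
    One way to approximate a switchable ASSERT in Perl, as a rough sketch rather than an established idiom from this thread: guard each check with a compile-time constant. DEBUG, assert and burn_fuse are all hypothetical names introduced for the example.

```perl
use strict;
use warnings;
use Carp qw(confess);

# Assumption: DEBUG is our own compile-time switch; set it to 0 for deployment.
use constant DEBUG => 1;

# Hypothetical assert helper: die with a stack trace when the condition is false.
sub assert {
    my ($ok, $msg) = @_;
    confess "Assertion failed: $msg" unless $ok;
    return 1;
}

# Count up from $count to 10, checking the range on every step.
sub burn_fuse {
    my ($count) = @_;
    while (1) {
        $count++;
        assert(0 < $count && $count <= 10, "count out of range: $count") if DEBUG;
        last if $count == 10;
    }
    return $count;
}

print burn_fuse(0), "\n";                          # 10

# Starting past the limit trips the assert instead of looping indefinitely.
my $survived = eval { burn_fuse(10); 1 };
print $survived ? "ok\n" : "caught: fuse out of range\n";
```

    When DEBUG is a false constant, Perl can discard the guarded statement at compile time, so the check should cost little or nothing in deployment. The flip side is that the second call would then spin, which is exactly the failure the assert exists to expose.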

      On the subject of catastrophic failures: I've written a lot of code that's in front of a lot of people. Little things like >= instead of == have made situations "manageable" where they might not otherwise have been. The appearance of being a little sloppy in a critical situation is far better than causing a disaster.

      You can never test enough. You can never design enough. The OP gave a trivial example. Wish he'd have had something better up his sleeve.

      Lemme give some examples (yes, these have actually occurred):

      1. Old-time reporting system keeps line-counts to know when to issue a form-feed character. Change made to the footer of a report by a sloppy programmer caused the == condition to miss by one and the form-feed routine goes nuts and spews paper everywhere. This is not funny on a high-speed chain-printer with access to a 50lb box of paper. Or on a check printer where each check skipped costs money.

        Fixed the bug, changed the condition to >= and when a subsequent programmer caused a similar bug we got a few pages that printed with "widow" lines -- a far, far better problem.

      2. Medical records system contained a hash-and-sequential-search feature for finding individual records. Large installations noticed O(n) time for finding records when it should have been constant time plus a small hit for the sequential search. When a doctor is trying to find your drug allergies in a hurry, this isn't funny either.

        Turns out a programmer didn't notice that certain overflow conditions (hash collisions) near the end of the file caused a sequential search to wrap to the beginning of the file again without returning the failure. Small bug fixed, and a == changed to a >= ensured that if it ever cropped up again at most one additional sector of records would be read. (And it did, sector drops from the disk caused it to skip a sector and only a >= would catch it...)

      3. Countless small C and ASM programs I've seen where a "hang" problem has been introduced by an unforeseen increment and a == missing slightly. This always happens in a demo or at a customer site. (Hi Murphy!) The "hang" eventually clears itself up after you wrap around a short, int, or long. If you've got an index variable with many friends in a block of code, defensive is always good.

      I'm gonna get --'d by the purists sure, but in my head as I'm writing these things I always think "what if this loop gets away..." and imagine the worst (sometimes it's not that bad, or not possible). Murphy's an old friend of mine. He's not to be poo-poo'd with blanket statements of "better engineering". Sometimes this is better engineering.
Re: Defensive Programming
by simon.proctor (Vicar) on Jan 14, 2002 at 21:03 UTC
    In my experience, my own particular brand of defensive programming is to defend me from myself. I can almost guarantee that most of my mistakes come at 2:00AM, after an evening of caffeine jolts and with music blaring into my headphones.

    My own list of defenses is probably very similar to most of the other monks' (I have seen quite a few) so I won't go into much detail. Here's one example of the things I do:

        # Number 1: the trailing return
        sub blah {
            if ($somevar == $somevalue) {
                return $this;
            }
            else {
                return $that;
            }
            return;
        }

    I had an argument with my boss for a while about doing this. But it does nothing, harms no one so why not (well that was my argument :P).

    I think it is easy to confuse defensive programming with secure programming but I see them as two separate disciplines.

    Defensive programming should, in my opinion, be about how you program. Thus it allows you to affect the design of the program as well as how you implement it. After all, if an algorithm or code snippet (more likely) goes against your own brand of superstition then who would honestly not re-write it? I know that the first thing I would do with code from another person is go through it with a fine-tooth comb.

      If that was really the whole construct (i.e. you've already generated $this and $that elsewhere) why not use:

          return $somevar == $somevalue ? $this : $that;

      Which is a lot easier for me to read w/o all that syntactic clutter around....

      -Blake

      You know that that return shouldn't happen, right?

      So if someone changes your function so that it can avoid any of the meaningful returns, something is probably wrong, right?

      Wouldn't you like to know about it in that situation? For that reason I sometimes do something like the above, except instead of a return I use Carp's confess and confess that I don't know how I got there. This is particularly useful I find after "endless loops" that I return out of. Should I fall out of the loop (something which I intend to be impossible), that is a programming logic error and I darned well want to catch the mistake sooner rather than later.
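
      A minimal sketch of that pattern, assuming a sub whose loop is meant to return on every path. first_above is a hypothetical example; confess is the real function exported by Carp.

```perl
use strict;
use warnings;
use Carp qw(confess);

# Hypothetical example: return the first value greater than $min.
sub first_above {
    my ($min, @values) = @_;
    for my $v (@values) {
        return $v if $v > $min;
    }
    # Callers are supposed to guarantee a match, so reaching this line is a
    # logic error. Die loudly with a stack trace rather than return quietly.
    confess "first_above: fell out of the loop, no value above $min";
}

print first_above(3, 1, 5, 9), "\n";               # 5

my $ok = eval { first_above(100, 1, 5, 9); 1 };
print "caught: ", (split /\n/, $@)[0], "\n" unless $ok;
```

      The stack trace that confess attaches shows which caller broke the guarantee, which a bare return would hide.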

      Personally I think it's bad style. You now have dead code that will never execute. It's also confusing as it made me think for a second and go, "hang on, the else will catch everything else".

      Dead code is a bad thing. Let's not start introducing it deliberately...

      gav^

      But it does nothing, harms no one so why not

      • "Hmm, have to make blah call foo somewhere. I'll just throw it in at the end, before the return."
      • "WTF?! Why isn't foo getting called?"
      • "F&@%$&*@!!"

      The extra return is misleading.

      On the other hand, this example does demonstrate another good defensive programming practice: always always ALWAYS put a default else on your conditionals, even (especially) if that condition "can't happen".

      --
      :wq
      While I'm a believer in defensive programming in many ways, I can't help but wonder if in this particular case, the cure is worse than the disease. The subroutine's caller is presumably expecting a valid response ($this or $that, say), yet the code has a fallback case where the routine simply returns something... who knows what. This strikes me as a potential debugging nightmare, because if something gets fouled up in the sub, the program might not break right away -- the sub would just return some (probably bogus) data, which could keep getting processed until it causes some bug way later in the program. Such bugs are, in my experience, extraordinarily hard to track down, because it's hard to know where they even originated.

      If I were worried about the possibility that I might forget to explicitly return a value from within one of the branches of my if statement, I'd probably code the sub more like this:

          sub blah {
              if ($somevar == $somevalue) {
                  return $this;
              }
              else {
                  return $that;
              }
              die "Something very bad is happening";
          }

      which is to say, I would try to scream about the problem as early in the logic chain as possible, rather than sweep it under the rug where someone else might trip over it later.
Re: Defensive Programming
by chromatic (Archbishop) on Jan 15, 2002 at 01:49 UTC
    It's costless to replace == with >=...

    I disagree. It's extra code to read, to understand, to be executed, and to maintain. It's very little in this case, but I think it's an awful reason to do something because "it might be needed in the future."

    The best way to make your code extensible is not to add in features you think you might want someday. The best way is to write simple, readable, testable, and simple (yes, doubly simple) code that does what it needs to do and no more. If you fall into the trap of thinking that you need to plan for things you can't foresee, you'll be hampered in your ability to make changes. "I can't change this code because it plans for a case I might need down the road, and I've already written it."

    I don't think of that as "defensive programming" but rather fearful, packrat programming. In my mind, defensive programming is checking error codes and boundary conditions. (You could also argue that, given my preference above, it's not adding things that don't have an immediate benefit, but that's a bit of a stretch.)

Re: Defensive Programming
by mstone (Deacon) on Jan 15, 2002 at 02:08 UTC

    > Now, why do I think of *this* as being defensive programming, whereas
    > quoting hash keys feels like a superstition?

    Because in this case you're providing logical closure.

    Programming is about translating human assumptions into symbols a computer can manipulate. The more completely you lay out your assumptions, the better your code tends to be.

    In this case, you want to process 10 items. That's the assumption you're translating into code. It's easy to complement that assumption by saying, "we don't want to process more than 10 items," and then we can break that down into the mutually exclusive statements: "we have processed less than 10 items," and "we have processed 10 items or more."

    Those statements are useful because they cover every possibility. It's safe to assume that any runtime configuration will fit into one of those two categories, and that we can Do The Right Thing as long as we know which category we're in.

    So you created a counter named $count, and used that to model the assumption "how many items we've processed." Doing so injects the new assumption, "$count always matches the number of items we've processed." There's no guarantee of that in $count itself, because $count is a scalar and can hold a wide range of values, including integers, strings, references to other variables, and floating-point values. Therefore, you have to write code to enforce the assumption, and your program is only as reliable as that code.

    You chose to use $count++ inside a loop, and to increment $count every time you process an item. The code for processing and the increment have to stay welded together, or the assumption that $count matches the number of items processed will break. Nor can you touch $count anywhere else in the program, or again, the underlying assumption will break.

    Now, if you'd chosen $count == 10 as your halting condition, you'd add the assumption: "we will always be able to see the point where we leave the assumption 'we have processed less than 10 items' and move to the assumption 'we have processed 10 items or more'." Again, there's nothing in $count that will guarantee such behavior; in fact, there's a lot that will happily break that assumption. So once again, you'd have to enforce that assumption with code.

    Sure, $count++ meets those conditions (assuming you started with an integer value, and that $count isn't touched anywhere else in the program), but there are plenty of ways to modify the code that will break what is admittedly a very brittle assumption.

    Worst of all, propagating assumptions through your code like that creates silent dependencies that aren't actually documented by the code itself. They're things you have to remember, or that some other programmer will have to extrapolate from reading the code. If you decide to add a new way of processing items, you have to remember to increment $count as well, because those two concepts have to stay welded together. If a typo clobbers the value in $count, your program will behave in hard to predict ways because the code no longer matches an assumption that isn't explicitly stated anywhere.

    Personally, I'd do this:

        my @found = ();
        my $err   = '';

        for my $c (@$Courses) {
            if (&is_good ($c)) {
                push @found, $c;
            } else {
                $err = "Problem: $c failed the test.\n";
            }
            last if (($err) || (@found >= 10));
        }

        for my $c (@found) {
            print PageCourseInfo ($c);
        }
        print $err if ($err);

    because taking the length of a list of found items matches the underlying assumption better. Its size will always match the number of items we want to process, and will always be a non-negative integer. And you'll note that I'm still using @found >= 10 instead of @found == 10, because that does the right thing for any value of scalar @found.

Re: Defensive Programming
by FoxtrotUniform (Prior) on Jan 15, 2002 at 00:34 UTC

    One thing that I've trained myself to do, which isn't so much defensive programming in construction, but more of a habit, is to mirror resource calls right away. For instance, if I'm writing code to process a file, I'll write:

        open FILE, "<$filename" or die "Can't open $filename: $!\n";
        close FILE or die "Can't close $filename: $!\n";

    and then fill in the code between the open and the close. (And yes, 3-arg open is safer, but it's not available on my target.) I do the same thing with DBI connects and the like.

    --
    :wq

      One of my defensive programming steps is to /always/ use IO::File whenever I open a file. Scoping is a wonderful way to close files, and it keeps me from having to remember punctuation variables, and having to use bareword filehandles, and ...

      Of course, if you can't 3-arg open, you probably can't use IO::File either.

      Thanks,
      James Mastros,
      Just Another Perl Scribe
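
      A short sketch of the scoping idea, under the assumption that IO::File and File::Temp are available on the target. count_lines is a hypothetical helper, and the temporary file exists only to make the example self-contained.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempfile);

# Hypothetical helper: count the lines in a file via a lexical IO::File handle.
sub count_lines {
    my ($filename) = @_;
    my $fh = IO::File->new($filename, "r")
        or die "Can't open $filename: $!\n";
    my $lines = 0;
    $lines++ while defined $fh->getline;
    return $lines;      # $fh goes out of scope here and the file is closed
}

# Throwaway file for the demonstration.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
print {$tmp} "one\ntwo\nthree\n";
close $tmp or die "Can't close $tmpname: $!\n";

print count_lines($tmpname), "\n";                 # 3
```

      The trade-off is that an implicit close at scope exit cannot report a failure, so for files being written an explicit close with an or die check remains the more defensive choice.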

Re: Defensive Programming
by YuckFoo (Abbot) on Jan 15, 2002 at 09:38 UTC
    Something I always thought was cool but never got in the habit of doing is putting constants on the left side of the == comparison. If the character goblins grab one of your =, you'll get a compile error instead of an unwanted assignment.

    if (42 = $num) { print "It's 42."; }

    YuckFoo
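
    A small sketch that checks the claim by compiling both forms from strings, so the compile error can be caught at runtime. compiles is a hypothetical helper written for this illustration.

```perl
use strict;
use warnings;

# Wrap a snippet in an anonymous sub and compile it without running it.
# Returns 1 if perl accepts the code and 0 if it is rejected at compile time.
sub compiles {
    my ($code) = @_;
    my $sub = eval "sub { $code }";
    return defined $sub ? 1 : 0;
}

# Variable on the left: the typo is a legal assignment, so it compiles.
print compiles('my $num = 1; if ($num = 42) { }') ? "compiles\n" : "rejected\n";

# Constant on the left: the same typo cannot compile.
print compiles('my $num = 1; if (42 = $num) { }') ? "compiles\n" : "rejected\n";
```

    With warnings enabled, perl also flags the first form with a "Found = in conditional" warning, so the constant-first habit matters most in code that runs without warnings.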
