Re: Error handling - how much/where/how?

In general, it's a good idea to put your error handling as low as possible.

The higher, more abstract layers of your code should be able to trust the lower layers. You shouldn't have to check for buffer overflows, data consistency, or other such problems in the "okay, now I want to get something done" parts of your code. Unfortunately, most programmers are used to writing incomplete functions, so they don't know how to make their low-level code trustworthy.

'Completeness' is a mathematical idea (mathematicians more often call this property closure). It means that the output of a function belongs in the same set as the input. Addition is complete across the positive integers: the sum of any two positive integers is also a positive integer. Subtraction, OTOH, is not complete across the positive integers: "2-5" is not a positive integer.
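A small sketch of that distinction in Perl. The helper name is_positive_int is made up for illustration; it just tests whether a result is still in the set we started from.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $n is a positive integer, 0 otherwise.
sub is_positive_int {
    my ($n) = @_;
    return $n =~ /^[1-9][0-9]*$/ ? 1 : 0;
}

my $sum  = 2 + 5;    # stays inside the positive integers
my $diff = 2 - 5;    # falls outside them

printf "2+5 = %d, positive integer? %s\n", $sum,  is_positive_int($sum)  ? "yes" : "no";
printf "2-5 = %d, positive integer? %s\n", $diff, is_positive_int($diff) ? "yes" : "no";
```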

Broadly speaking, incomplete functions will bite you in the ass. Every time you leave something out, you have to write nanny-code somewhere else to make sure the function is actually returning a good value. That's difficult, often redundant, wasteful, and confusing.

It's a much better idea to design your code with complete functions at every level. Instead of allowing foo() to return any integer, then checking the result to make sure it falls between 0 and N, you break the problem into two parts:

1. Write foo() so it only returns values in the set (0..N, "invalid"). Or better still, return the structure: { 'value'=>(0..N), 'is-valid'=>(0,1) }.
2. Then write the client code so it Does The Right Thing for every possible output value.
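The two parts could be sketched roughly like this. Everything here is an assumption for illustration: the bound $N, the body of foo(), and the choice of 0 as the fallback value are not from any real module.

```perl
use strict;
use warnings;

my $N = 10;    # upper bound, picked arbitrarily for this sketch

# Part 1: the low-level function only ever hands back a value in 0..$N
# plus a flag saying whether that value can be trusted.
sub foo {
    my ($raw) = @_;
    if (defined $raw && $raw =~ /^\d+$/ && $raw <= $N) {
        return { 'value' => $raw, 'is-valid' => 1 };
    }
    # Anything else: a harmless in-range default, flagged as invalid.
    return { 'value' => 0, 'is-valid' => 0 };
}

# Part 2: the client code tests one boolean and never range-checks.
for my $input (3, 42, undef) {
    my $r = foo($input);
    if ($r->{'is-valid'}) {
        print "got $r->{value}\n";
    } else {
        print "no usable result, falling back to $r->{value}\n";
    }
}
```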

Yeah, you're still writing some output-testing code in the higher-level functions, but the code itself is much simpler. Instead of having to range-check the value, you just have to check 'is-valid' as a boolean. The low-level code does all the heavy lifting on deciding whether the output can be trusted. And in many cases, you can find default error values that work just fine in the main-line higher-level code.

When you write code that way, you end up with each level carrying only the error-handling that makes logical sense at that level, and just enough error-handling to pass usable output along to the next layer up.


Re^2: Error handling - how much/where/how?
by Jenda (Abbot) on Jun 13, 2005 at 22:08 UTC

If I were to use a module that returns your proposed { 'value'=>(0..N), 'is-valid'=>(0,1) } the very first thing I'd make would be a

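# Wrap a function that returns { 'value'=>..., 'is-valid'=>... } so that
# it returns the bare value on success and dies on failure.
sub sanify {
 my $fun = shift;
 return sub {
  my $ret = $fun->(@_);
  if ($ret->{'is-valid'}) {
   return $ret->{'value'};
  } else {
   die "Well, something was incomplete. The module author gave no clues!\n";
  }
 };
}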

Update: "In general, it's a good idea to put your error handling as low as possible."
Let me disagree. If you put the error handling too low, you end up with much longer code. And longer code takes longer to write and means more bugs. So you should always try to find the right level at which to handle errors: not too low and not too high. I'm afraid that, apart from experience, there is no way to tell which level is the right one. I just think you should not be afraid to say "I don't mind whether it's this or this or this operation that fails; if any of them does, I want to handle the problem like this."
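As a sketch of that last sentence: one eval around the whole sequence, with a single handler no matter which step failed. The three step functions and run() are hypothetical names invented for this example.

```perl
use strict;
use warnings;

# Hypothetical steps; each one dies on failure instead of returning a status.
sub parse_input { my ($s) = @_; $s =~ /^\d+$/ or die "not a number: '$s'\n"; return $s }
sub scale       { my ($n) = @_; $n > 0 or die "cannot scale zero\n"; return 100 / $n }
sub report      { my ($x) = @_; return sprintf "%.1f", $x }

sub run {
    my ($input) = @_;
    # One handler for the whole chain, whichever step fails.
    my $out = eval { report( scale( parse_input($input) ) ) };
    if (!defined $out) {
        chomp(my $err = $@);
        return "failed: $err";
    }
    return $out;
}

print run($_), "\n" for "4", "0", "x";
```

None of the three steps knows or cares how its failure will be handled; that decision lives in run() alone.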

Jenda
XML sucks. Badly. SOAP on the other hand is the most powerful vacuum pump ever invented.


Re^3: Error handling - how much/where/how?

by mstone (Deacon) on Jun 15, 2005 at 20:34 UTC

You're really gonna hate any() values (and traits) when Perl6 comes around. ;-)

If you want more information about what went wrong, put more values into the return set. You know.. like NaN, positive and negative infinity, "Number outside representation range" and that sort of thing. You could also add utility values like positive and negative zero, or 'Infinitesimal' which make certain calculations easier.

If you want even more information, you can choose which of:

NaN_divide_by_zero
+/-Infinity_divide_by_zero
or just plain Divide_by_zero

fits your purposes best.

You're also missing the point that value should always contain something consumable by the main-line client code. The goal is "usable but identifiably bogus" rather than "broken but correct."

If I were writing a division operator, for instance, I'd probably have division by zero return:

    { 'value'=>1, 'is-valid'=>0, 'error'=>'Divide_by_zero' }

The 1 in value is consumable by any other mathematical operation, even though it's totally bogus as an accurate result of the calculation. The boolean in is-valid tells you it's bogus, and the detail code in error tells you why. (Yeah, error is new. I added more information. We can do that.)
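A minimal sketch of such a division operator. The name safe_div and the empty 'error' string on success are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical division that always returns the same structure.
sub safe_div {
    my ($num, $den) = @_;
    if ($den == 0) {
        # Consumable but flagged: 1 can be fed to further arithmetic.
        return { 'value' => 1, 'is-valid' => 0, 'error' => 'Divide_by_zero' };
    }
    return { 'value' => $num / $den, 'is-valid' => 1, 'error' => '' };
}

my $r       = safe_div(6, 0);
my $doubled = 2 * $r->{value};    # still computes, nothing blows up
print "doubled = $doubled\n";
print "but the input was bogus: $r->{error}\n" unless $r->{'is-valid'};
```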

If I really wanted to get spiffy, I'd add still more information:

    { ... 'trace'=>"($a/$b)" }

then write all my operators so they return progressively more complicated trace strings whenever they get invalid arguments. That way, I could see at a glance where the error occurred, rather than having to fire up a debugger and step through the code until it bombs out again.
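Roughly what that could look like, as a sketch. The names ok_val, t_div and t_add are made up, and for simplicity this version builds the trace on every call rather than only when an argument is invalid.

```perl
use strict;
use warnings;

# Wrap a plain number as a valid result with a trivial trace.
sub ok_val {
    my ($n) = @_;
    return { 'value' => $n, 'is-valid' => 1, 'trace' => "$n" };
}

# Division: invalid if either operand is invalid or the divisor is zero.
sub t_div {
    my ($x, $y) = @_;
    my $trace = "($x->{trace}/$y->{trace})";
    if (!$x->{'is-valid'} || !$y->{'is-valid'} || $y->{value} == 0) {
        return { 'value' => 1, 'is-valid' => 0, 'trace' => $trace };
    }
    return { 'value' => $x->{value} / $y->{value}, 'is-valid' => 1, 'trace' => $trace };
}

# Addition: always computable, but invalidity is passed along.
sub t_add {
    my ($x, $y) = @_;
    my $trace = "($x->{trace}+$y->{trace})";
    my $valid = ($x->{'is-valid'} && $y->{'is-valid'}) ? 1 : 0;
    return { 'value' => $x->{value} + $y->{value}, 'is-valid' => $valid, 'trace' => $trace };
}

my $r = t_add( ok_val(1), t_div( ok_val(4), ok_val(0) ) );
print "valid: $r->{'is-valid'}, trace: $r->{trace}\n";
```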

IMO, the presence of exceptions is a code smell. It says that you'd rather use the quantum superposition of two (or more) possible code sequences in a "try it, then backtrack and see what went wrong" fashion rather than figuring out how to make the code work in the first place. And it's almost always a sign that the programmer is trying to use a data representation that's too primitive to handle all the results that are actually possible.

So.. you can write Schrodinger-code to compensate for the bad decisions you made about data representation, or you can choose a data representation that actually does what it's supposed to, and handle the job correctly.


Re^4: Error handling - how much/where/how?

by Jenda (Abbot) on Jun 16, 2005 at 00:43 UTC

No, I'm not gonna hate any or traits. You are missing the point. I just won't put up with a module that forces me to jump through hoops to get at the value. That was the main point, not the fact that the hash did not contain the error details.

If the operation failed, there's no point in it returning "something consumable by the main-line client code". The main-line code should NOT consume the value at all! Which, using your style, means that I have to test here, there, and in a hundred more places whether the thing I received is indeed a value or an error. Which means that 1) the code will be much longer and 2) I will surely forget to test it in some places.

Why is your division operator returning 1? How is 1 "usable but identifiably bogus"? 1 is a totally reasonable result of division. Even if it's bogus in some cases (using your division operator), there's nothing identifiable about it.

And for your division operator to be at least barely usable, you'd have to define it not (just) on numbers, but also on your "maybevalues", that is, on those {value =>..., 'is-valid'=>...,...} structures. And you'd likewise need to define all the other mathematical operators, so that the user may at least (mind you, I'm not asking for anything fancy) write code like $mayberesult = 1 + $x/$y (assuming the / is your division operator). Of course that code would not be complete; after that the user would have to add something like

if ($mayberesult->{'is-valid'}) {
 ...
} else {
 # report the problem somehow, maybe return another maybevalue
 # from the current procedure
}
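To give a feel for how much machinery that implies, here is a sketch of just + and / defined on such structures with Perl's overload pragma. The class name MaybeValue, its methods, and the "first invalid operand wins" policy are all assumptions made for this example.

```perl
use strict;
use warnings;

package MaybeValue;    # hypothetical class name

use overload
    '+'  => \&add,
    '/'  => \&div,
    '""' => \&to_string;

sub new {
    my ($class, %args) = @_;
    # Default value is 1, matching the "consumable" value proposed above.
    return bless { 'value' => 1, 'is-valid' => 1, 'error' => '', %args }, $class;
}

# Promote plain numbers so that 1 + $x / $y works.
sub lift { my ($v) = @_; return ref $v ? $v : MaybeValue->new('value' => $v) }

sub to_string {
    my ($self) = @_;
    return $self->{'is-valid'} ? "$self->{value}" : "invalid($self->{error})";
}

sub add {
    my ($x, $y) = map { lift($_) } @_[0, 1];
    # Arbitrary policy: the first invalid operand is passed through as-is.
    return $x unless $x->{'is-valid'};
    return $y unless $y->{'is-valid'};
    return MaybeValue->new('value' => $x->{value} + $y->{value});
}

sub div {
    my ($x, $y, $swapped) = @_;
    ($x, $y) = ($y, $x) if $swapped;    # handles number / object
    ($x, $y) = (lift($x), lift($y));
    return $x unless $x->{'is-valid'};
    return $y unless $y->{'is-valid'};
    return MaybeValue->new('is-valid' => 0, 'error' => 'Divide_by_zero')
        if $y->{value} == 0;
    return MaybeValue->new('value' => $x->{value} / $y->{value});
}

package main;

my $x = MaybeValue->new('value' => 6);
my $y = MaybeValue->new('value' => 0);
my $mayberesult = 1 + $x / $y;
print "result: $mayberesult\n";
```

That is two operators; every other arithmetic and comparison operator would need the same treatment, and the caller still has to test 'is-valid' at the end.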

Next thing. What's the result of {value => 1, 'is-valid'=>0,error=>'Division by zero'} / {value => 1, 'is-valid'=>0,error=>'Number outside representation range'}? Do you choose just one of the errors? Do you combine them? How? And if you keep the trace, how do you combine that? And how's the "main-line code" supposed to make any use of that then?

Exceptions let me handle the problems at the level I need, without caring at the deeper levels. Which may mean that I won't know where exactly in some computation the div-by-zero originates ... but most likely I don't care. I just need to handle the problem without blowing the code out of proportion with tons of maybevalue handling. We have a job to do, and it's not to write a program you can prove to be correct, but rather to write a program that does what it's supposed to and to do so in a reasonable time.
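For instance, as a sketch (average_rate and summarize are made-up names): the computation carries no error handling at all, and the caller decides, at its own level, what a failure means.

```perl
use strict;
use warnings;

# Hypothetical computation with no error handling of its own.
# Perl itself dies with "Illegal division by zero" when $count is 0.
sub average_rate {
    my ($total, $count) = @_;
    return 1 + $total / $count;
}

# The level that cares: skip any pair that cannot be computed.
sub summarize {
    my (@pairs) = @_;
    my @lines;
    for my $p (@pairs) {
        my $rate = eval { average_rate(@$p) };
        push @lines, defined $rate ? "rate: $rate" : "skipped: bad input";
    }
    return @lines;
}

print "$_\n" for summarize([10, 5], [3, 0]);
```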

Jenda
XML sucks. Badly. SOAP on the other hand is the most powerful vacuum pump ever invented.



Pathologically Eclectic Rubbish Lister
	PerlMonks