http://www.perlmonks.org?node_id=1019843


in reply to Re: "undef" is not NULL and what to do about it
in thread "undef" is not NULL and what to do about it

Those are both great examples and show why unknown is better than undef. First, keep in mind that undef behavior is rather ad hoc and poorly specified, and the warnings reveal that it's not designed for comparison. However, unknown is specifically designed for comparison and nothing else, and its behavior is well documented in my module.

In this case, either of your examples would be a bug if you're dealing with either undef or unknown values. However, unknown values offer safety. Let's look at your first example:

    if ( $salary < $threshold ) {
        increase_salary( $employee, 3_000 );
    }
    else {
        decrease_salary( $employee, 3_000 );
    }

What might reasonably happen if we have an undef value? The salary is coerced to zero and that's probably less than the threshold, thus causing increase_salary() to be called. What happens in there? Presumably something like this:

    $employee->salary( $employee->salary + $increase );

And an employee whose salary was previously unknown now has a salary of $3,000. With undef values, you've probably corrupted your data.
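To make the undef case concrete, here is a minimal sketch using only core Perl. The threshold value is made up for illustration, and the warning handler is only there to count the "uninitialized" warnings instead of printing them.

```perl
use strict;
use warnings;

my $threshold = 50_000;    # hypothetical threshold, for illustration only
my $salary;                # undef: we don't know this salary

# Collect warnings rather than printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# In numeric context undef is treated as 0, so the comparison is true ...
my $is_less = $salary < $threshold;

# ... and the arithmetic quietly produces a real-looking number.
my $new_salary = $salary + 3_000;

print "is_less:    ", ( $is_less ? "true" : "false" ), "\n";
print "new salary: $new_salary\n";
print "warnings:   ", scalar(@warnings), "\n";
```

Both operations succeed, and the only hint that anything is wrong is a pair of "Use of uninitialized value" warnings, which are easy to miss or to silence.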

What happens if you use unknown? Well, salary does not evaluate as less than threshold, so we call decrease_salary() and probably still have a bug, right? In that function, we probably hit code like this:

    $employee->salary( $employee->salary - $decrease );

So did we corrupt our data? Nope. Remember, unknown values are designed to provide semantically correct comparisons and nothing else. What happens if you try to do something else? For the example above, you see "Math cannot be performed on unknown values" followed by a stack trace.

In other words, unknown values will throw an exception rather than allow your data to be corrupted.
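The behavior described above can be approximated with core Perl's overload pragma. This is only a rough sketch under my own assumptions, not the module's actual implementation: the class name My::Unknown and its helper subs are hypothetical, and only the error message text is taken from this post.

```perl
use strict;
use warnings;

package My::Unknown;    # hypothetical stand-in, NOT the real module
use Carp qw(confess);

sub new { return bless {}, shift }

# Every comparison involving an unknown value is false.
sub _false { return !!0 }

# Arithmetic dies with a stack trace instead of producing a number.
sub _no_math { confess "Math cannot be performed on unknown values" }

use overload
    '<'  => \&_false, '<=' => \&_false,
    '>'  => \&_false, '>=' => \&_false,
    '==' => \&_false, '!=' => \&_false,
    '+'  => \&_no_math, '-' => \&_no_math,
    '*'  => \&_no_math, '/' => \&_no_math;

package main;

my $salary    = My::Unknown->new;
my $threshold = 50_000;             # hypothetical threshold

# The comparison is false, so the "decrease" branch is chosen.
my $branch = $salary < $threshold ? 'increase' : 'decrease';
print "branch taken: $branch\n";

# The subtraction throws, so no bogus salary is ever computed.
my $new_salary = eval { $salary - 3_000 };
my $error      = $@;
print "no new salary was computed\n" unless defined $new_salary;
print "error: ", ( split /\n/, $error )[0], "\n";
```

The wrong branch is still taken, so the bug is still there, but it surfaces as an exception at the first arithmetic operation rather than as a plausible-looking number written back to the employee record.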

So should you test for unknowns? Sure. The module exports an is_unknown predicate (which defaults to $_). Using that liberally will help make your code more robust. However, if you forget (and which programmer doesn't forget from time to time?), undef can corrupt your data while unknown will die rather than allowing it to be corrupted. That's a deliberate design goal.

The only case where I've violated this rule is stringification: it prints [unknown] for unknown values. However, this may have been a mistake (imagine printing this in JSON, for example) and I may revert that behavior in another release.