Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number

Write "Dumber" Functions

by rob_au (Abbot)
on Mar 30, 2002 at 12:28 UTC ( #155421=perlmeditation: print w/replies, xml ) Need Help??

While working on some business code involving a number of modules from CPAN, I came across a quandary with regard to the behaviour of functions from one of the more well-known and widely used of CPAN modules. The role of the function in question was to validate an argument and return a value based upon the result of the validation tests - As such, as the unwitting user of this module and function I may be forgiven for expecting that the function would simply return a true or false value based upon the passed arguments.

The guilty module? Date::Calc ... This is where, as the classics say, the plot thickens ...

As a result of user interaction (from the command line arguments passed to the script), the argument which is passed to this validation function can quite correctly be null; At this point, I should clarify that the passed argument in this scenario is defined but null, not undefined - For an explanation of this definition, have a look at What is truth? (Curiosity corner) by robin. The expected result from this validation function in this scenario is false, that is, the passed argument does not pass validation tests. The result however was a die statement with suggested function usage and arguments.

The code which caused the problem ...

use Date::Calc qw/ check_date Localtime /; my $date = shift @ARGV; my @date = split '-', $date; unless (check_date(@date)) { carp( 'Invalid date supplied from command line - Overriding with c +urrent date' ); @date = (Localtime)[0..2]; $date = join '-', @date; }

The aim of the code above is to assign the variables $date and @date with either the current date or that supplied by the user from the command line (and to warn the user of the overriding of the supplied date value when this occurs). Working defensively, the check_date method of Date::Calc is called for the validation of the user supplied date string and this is where the problem arises. It appears that calling the check_date method with anything other than three arguments, such as the case when the @date array above is empty, the module dies with the suggested function usage and arguments.

Usage: Date::Calc::check_date(year,month,day) at -e line 17

Is this rational behaviour to expect from the check_date method from the Date::Calc module? Well, to my mind, yes and no - While, suggested function usage and arguments are welcomed and appreciated whenever and wherever documented, I was less than enthralled when I found my script dying when no command line arguments were supplied. To my mind, the expected result from the validation function when no command line arguments were supplied was false, that is, an invalid date was supplied. Is this unreasonable? Perhaps, but when passing such arguments to a validation function, what would you expect to be returned?

The content of this meditation is to call for a smarter approach to the writing of validation functions and alike such that a defensive approach is taken and only expected and documented values are returned from your code. From this experience, I certainly know that I will be revising some of my validation functions and modules.

In short, please make the functions in your code "dumber" ... but be smart about doing it.



The following code represents the revised code which takes this behaviour of Date::Calc into consideration and supplies a "dummy" and invalid date when none is supplied from the command line:

use Date::Calc qw/ check_date Localtime /; my $date = shift @ARGV; my @date = split '-', ( $date || '2002-02-31' ); # Naturally, this dat +e is invalid unless (check_date(@date)) { carp( 'Invalid date supplied from command line - Overriding with c +urrent date' ) if $date; @date = (Localtime)[0..2]; $date = join '-', @date; }

Your comments and experiences are welcomed.


Replies are listed 'Best First'.
(tye)Re: Write "Dumber" Functions
by tye (Sage) on Mar 30, 2002 at 20:21 UTC

    I strongly disagree. There is a difference between a programming error and a data validation failure and I want to minimize the odds of a programming error being interpretted as a data validation failure.

    The function is documented as telling you whether three numbers correspond to a valid date. If you give it something other than 3 values, the correct reponse is neither "true" nor "false". That is considered a programming error (failure to pass in the information required) and not a data validation failure.

    You can argue that what exactly is a usage error and how such are treated should be documented. I won't argue too strongly against that, but I do see value in not over specifying such things as avoiding that means that you have some room to improve the interface in the future.

    Since you broke the contract defined in the documentation, you should not expect the function to stick to the contract in the documentation either. In some respects, it is free to do anything it wants to. It can emit warnings, errors, and/or kill the program. It can even return "true" (hopefully after emitting at least a warning). It is best if it simply reports the usage error so that you can fix your program mistake (which is what it did) and stops the program so that it doesn't continue on to do the wrong thing.

    The function is documented as validating whether three numbers form a proper date. You seem to think that it should do more than it is documented to do and also validate for you whether you have more or fewer than 3 numbers. You want the function to be "smarter" not dumber. You want it to do more, because you didn't want to do the work of verifying that the user had given you three numbers.

    Now, I do agree that sometimes it is a fine line between programming error and data validation failure, especially in Perl. You could certainly make the case that having more or fewer than 3 numbers should constitute a data validation failure here. But, if you do that, then you'd have to update the documentation to the function so that it says something more like "this function takes a list of numbers and returns a true value if the list contains exactly three numbers and ...". And I like that interface definition less than the original one.

    Perl is quite nice in that is does a good job of supporting two quite different forms of programming: quick hacks and 'developed' software.

    Sure, for some quick hacks, it might be convenient for the function to simply return "false" in the case of such invalid usage. And the rest of your code does look like "quick hack" code in some respects. You are shifting values out of @ARGV w/o knowing that there is anything to get. Your code would generate warning is you had such enabled.

    But code from modules is usually geared more toward 'developed' software. And I'm glad of that.

    I think this problem could have served as a "wake up" call and gotten you to look at your own code and your own assumptions more closely. Instead, you chose to look at the module and consider it at fault for not supporting this particular usage. But I find fault with the basic approach that your code takes, and I am glad that the module does not support it.

    If I enter an invalid date on the command line, then the last thing I want to happen is for the program to use some other date that it chooses and continue on. In many cases that is a recipe for major problems. In the best case, it is a recipe for frustration.

    I certainly agree that it is nice to have "defaults". And it might be nice for this particular program to default to "today" if no date is given. But you'd code that much differently than you have.

    use Date::Calc qw/ check_date Localtime /; my @date; if( 0 == @ARGV ) { warn( 'Using current date.\n' ); @data= (Localtime)[0..2]; } els... ... croak( 'Invalid date supplied from command line' ); ...
    There are cases where you want your program to trudge on in the face of adversity and try to do something with its best guess as to how to overcome minor problems. Otherwise you can have programs that are too fragile and refuse to work too often. But I don't think this is such as case at all.

            - tye (but my friends call me "Tye")
Re: Write "Dumber" Functions
by danger (Priest) on Mar 30, 2002 at 19:14 UTC

    Your revised code can still easily die if a user enters a partial date or mistypes an underscore for a dash, consider such faulty c-l arguments as: '2002-03' or '2002-03+1'. You may want to consider just trapping the exception with an eval-block (as Juerd suggests):

    #!/usr/bin/perl -w use strict; use Carp; use Date::Calc qw/check_date Localtime/; my $date = shift @ARGV || ''; my @date = split '-',$date; unless (eval{check_date(@date)}){ carp 'Invalid date supplied on command line - using current date +'; @date = (Localtime)[0..2]; } $date = join '-', @date; print "$date\n"; __END__

    Rebuilding the date-string after the conditional block avoids problems that might occur later in your code if an argument like '2002-03-03-' was used (it would pass as a valid date, and there would only be three elements in @date, but the $date string would still had the trailing dash).

    What I expect is for the documentation to explicitly state what will happen. Although if you just look up the docs on the check_date() routine it says it returns either true or false, earlier in the docs (under "Important Notes") it is stated that Date::Calc functions will usually die if the input values, intermediate results, or output values are of range. But then it says check_date() (and a few others) handle errors differently, returning false for invalid dates. This seems more than ambiguous about how it handles invalid parameters (number or type), and I would consider this a documentation bug. Perhaps you should contact the author with suggestions on clarifying the docs.

    Personally, I generally throw an exception on invalid parameters, but in the case of a validation routine, I would usually opt to return 0 for invalidation and undef for incorrect parameters (number or type), allowing the user to just test for failure if that's all they want (as in your case).

Re: Write "Dumber" Functions
by Juerd (Abbot) on Mar 30, 2002 at 18:41 UTC

    It appears that calling the check_date method with anything other than three arguments, such as the case when the @date array above is empty, the module dies with the suggested function usage and arguments.

    Fortunately, you can catch the error by using eval (remember: eval STRING is slow (and evil), eval BLOCK is not.). If check_date succeeds, it will still return true. (Eval returns the whatever its string or block returns - which is the last evaluated value if there's no explicit return)

    unless (eval { check_date @date }) {


Re: Write "Dumber" Functions
by FoxtrotUniform (Prior) on Apr 01, 2002 at 17:04 UTC

    Who was it that suggested the "Principle of Least Surprise" for component behaviour? I'd say this is a rather gross violation of that rule. Sure, it's great to tell the programmer that they may have called the function improperly, but to die on ill-formed input, especially in a function that's supposed to check for validity, is extreme. (By all means carp about strange input -- bonus points if you provide a flag for debugging output, so that end users don't have to wade through a huge cascade of "hey, programmer, I think you've messed up" messages from a smart-ass module.)

    As an aside, I can't stand the name 'check_date' for that function. Okay, it checks the date, but what does it do after checking the date? Die with an error? Return true ("yes, it's valid")? Return false ("yes, it's bogus")? Set a flag somewhere? How about is_valid_date instead?


Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://155421]
Approved by root
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (3)
As of 2022-09-26 16:32 GMT
Find Nodes?
    Voting Booth?
    I prefer my indexes to start at:

    Results (117 votes). Check out past polls.