I know it is mentioned in Perl Best Practices. And I explained why it
is the opposite of a "best practice" for some functions, but you appear to
have completely ignored that explanation or completely not understood it. Note
that return; and return(); are functionally identical. They are
just two different ways of returning an empty list. I explained briefly above
why there are cases1 where returning an empty list (no
matter what syntax you use to do so) is not a good idea (that is, it is the
opposite of a "best practice").
1 And let me repeat that I agree with this idea of "returning an
empty list to indicate failure" for functions that already return lists or
(to be clearer for those who can't understand the concept of "return a list"
vs "return a scalar") for functions that at least sometimes return a
list of more than one scalar. So my argument is only that this "best practice"
needs an additional condition applied to it (as well as that without this
condition it becomes the opposite of a "best practice" in some cases).
I have not read Perl Best Practices, but I have had this particular
point explained repeatedly to me by people who have. So there is a possibility
that these people have in some way misrepresented how this is presented in
that book (perhaps a "converts make the worst zealots" effect), but I don't
think so. So, my counter argument is based on second-hand presentations of
this idea; so please forgive me and let me know if my following argument
misses the mark of what the book presents (in which case, note that my argument
should still be applied to these second-hand re-tellings of the idea).
If we believe this particular claim from Perl Best Practices, then
Larry got the design of -s wrong because -s for a non-existant
file returns undef not an empty list. And I personally hope that
TheDamian (the author) and chromatic (an obviously staunch supporter
of this claim, as evidenced elsewhere) haven't convinced the rest of "the
Larrys" that this should change in Perl 6.
So, in hopes of helping some previous readers who couldn't deal with a rather
small amount of abstraction, let's define the following very concrete
subroutine:2
sub getFileSize
{
my( $fileName )= @_;
return -s $fileName;
}
2 Perhaps a few of you can't imagine why anyone would prefer to use
getFileSize($fileName) over -s $fileName, so let me try to put
that worry out of your mind. Although I can see some real advantages
in some situations, that is not at all the point of this exercise. This is
just a very concrete example of a function that returns a single, defined
scalar value and where the concept of "failure" applies and where it makes
sense for "failure" to be indicated by undef (at least in a scalar context).
Surely we can all agree that such functions are not terribly uncommon in the
universe of Perl, especially since -s is exactly such a function and is
even just one of many related such functions. I'd normally just use
getFoo($blah), but that previously caused problems for (at least) one
person who had great difficulty evaluating the appropriateness / likelihood of
different sample uses of such a (slightly abstract) function.
Now, this "best practice" would require us to rewrite that function more like
so:
sub getFileSize
{
my( $fileName )= @_;
my $size= -s $fileName;
if( ! defined $size ) {
return; # Return an empty list to indication "failure"
}
return $size;
}
So let's look at how such a function is likely to be used and see why
I believe that the risks (of bugs) caused by returning an empty list
are much greater than the risks (of bugs) caused by returning undef for
"failure" for this type of function.
First, the case for "return an empty list". The following code is about
the only type of example I can think of where this "false in a list
context"3 situation is likely to crop up.
3 "false in a list context" is a mis-statement that I've heard
repeated more than once so I am guessing it is actually used in Perl Best
Practices. To be false requires a "Boolean context" which is a case of
scalar context which is not list context so "false in list context" must
really mean using the function in a list context that is part of some larger
expression and that larger expression is being used in "Boolean context".
I think that, by far, the most common case of such a construct is "list
assignment in a Boolean context".
my $fileName= getFileName();
my @size;
if( @size= getFileSize($fileName) ) {
print "The file, $fileName, has a size of @size bytes.\n";
} else {
print "Can't determine file size for $fileName: $!\n";
}
We are mostly interested in the if line. Does that look like code
you are likely to see? Other than allowing for "I see a lot of stupid code",
that doesn't look like code I'd expect to see in well-written code. It just
doesn't make much sense to want to assign a single scalar to an array
and to then test that assignment statement for "truth".
Granted, such code is certainly possible to write. My point is only
that I find such code very unlikely. I also find such code awkward.
So, if somebody has some better examples of better code that demonstrates
this type of conflict (so called "false in a list context" for a function
that, when it doesn't "fail", only ever returns a single scalar), I'd like
to see that. But I doubt it will be as "likely" as the sum of all of the
cases of the opposite problem, some examples of which I will now present.
Perhaps the classic example that I've seen many times4 is:
4 And that made me wish that => could have been made to
force scalar context on both of its sides, similar to how it eventually got
the ability to encourage "string context" on its left side.
my %sizes= (
$inputFile => getFileSize($inputFile),
$controlFile => getFileSize($controlFile),
);
or perhaps:
my %object= (
InputFile => $inputFileName,
InputSize => getFileSize($inputFileName),
ControlFile => $controlFileName,
ControlSize => getFileSize($controlFileName),
);
return bless \%object;
If we follow the book's proposed "best practice" and if neither our input file
nor our control file exist yet, then those code samples end up functioning like the
following, respectively:
my %sizes= (
$inputFile => $controlFile,
);
# and
my %object= (
InputFile => $inputFileName,
InputSize => "ControlFile",
$controlFileName => "ControlSize",
);
return bless \%object;
Some of our keys have been interpretted as values! That is surely a bug. A bug
caused by the proposed "best practice". And for code that I find quite natural
and in a style that I often see.
Some would make a (perhaps circular) argument that this is exactly why such
code should make use of scalar. But the point of getFileSize() is to just
get a single scalar so it doesn't make much sense to say "no I don't want a
list of one file size scalar, I just want a scalar version of that list".
It doesn't make sense to change a function from
returns a scalar (that may be undefined)
to
returns a scalar, unless there was a failure in which case it returns an empty
list
which then requires
we use scalar getFileSize() because we want just the file size, unless
it fails, in which case we want the scalar version of the empty list, also
known as undef
How can all of that added mental complexity possibly lead to better code
quality? If you buy that approach then you must surely write:
my %sizes= (
$inputFile => scalar -s $inputFile,
$controlFile => scalar -s $controlFile,
);
But I don't think I've ever seen anybody do that.
Now let's rewrite our first code example a different way to demonstrate
another pitfall:
my @files= getFileNames();
my %sizes= map { $_ => getFileSize($_) } @files;
# or
my %sizes;
@sizes{@files}= map getFileSize($_), @files;
Those both look like reasonable, likely code, and they both break horribly and
differently if we follow the "return empty list" best practice. And they
don't look like code where I'd expect to see people using scalar. Another
example:
my @files= getFileNames();
my @sizes;
for my $fileName ( @files ) {
push @sizes, getFileSize( $fileName );
}
That example is a little harder to justify. But keeping parallel arrays is
certainly something that I've done and that I've seen done (and there can
be good reasons for using that technique). And I can see someone avoiding
map for several reasons. But I'm mostly just looking at the push
statement. That seems very reasonable and likely code and something that
is likely not what was meant if an empty list indicates failure.
Another example.
sub missingIsSmallest
{
my( $x, $y )= @_;
if( ! defined $x ) {
if( ! defined $y ) {
return 0;
} else {
return -1;
}
} elsif( ! defined $x ) {
return 1;
} else {
return $x <=> $y;
}
}
my @files= sort {
missingIsSmallest( getFileSize($a), getFileSize($b) )
} getFileNames();
This fails in subtle ways given the "best practice". And this failure applies
to just about any use of a simple scalar-returning function as an argument
to some other function. Surely people agree that it is quite common to
pass an argument to a function that is an expression involving a simple
call to another function. Whether the outer function takes positional
arguments or "named" arguments doesn't even matter. Both cases are messed
up by the "best practice":
reportFile( $file, getFileSize($file), "Financials" );
# or
reportFile(
FileName => $file,
FileSize => getFileSize($file),
Description => "Financials",
);
So I find the pattern of "function returns a simple scalar except
to report failure" to be much, much more likely to cause
bugs than the pattern "function always returns a simple scalar (which
is undef in the case of failure)". I understand the mis-stated point of
"false in a list context", but it just looks like a problem that is quite
unlikely for somebody to trip over.
I certainly recall more cases of people posting at PerlMonks because they
got bit by expecting a single scalar from something that sometimes returns
an empty list than I recall people posting because they got "true" because
they assigned to an array/list in Boolean context the results of a function
call that is normally used to fetch one scalar value.
|