RFC: User subroutine hinting interface for autodie

G'day learned monks,

Not that long ago, a bug against the autodie pragma (which makes subroutines and functions succeed or die) was reported here on PerlMonks.

The bug all comes down to context, and how subroutines report failure, and is best shown with an example:

use File::Copy qw(copy);
use autodie qw(copy);

copy($foo, $bar);         # Void context, succeeds or dies.

my $x = copy($foo, $bar); # Scalar context, succeeds or dies.

my @x = copy($foo, $bar); # List context, succeeds or fails silently.

Our problem is that the copy subroutine always uses return 0; to indicate failure. In a scalar context this is false, and autodie assumes it means an error. However in a list context it returns (0) (a list of a single zero), which looks like something which may be a legitimate value, and so autodie passes it through fine. By default, autodie only thinks a failure has occurred if it sees an empty list, or a list consisting of a single undef.
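To make the context problem concrete, here is a minimal sketch of the two default rules described above. The names legacy_copy, looks_like_scalar_failure and looks_like_list_failure are made up for illustration; this is not autodie's actual code, just the same logic written out by hand.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a legacy subroutine that signals failure
# with "return 0;" regardless of the context it was called in.
sub legacy_copy { return 0 }

# Default scalar rule as described above: any false value means failure.
sub looks_like_scalar_failure {
    my ($ret) = @_;
    return $ret ? 0 : 1;
}

# Default list rule as described above: failure is an empty list,
# or a list consisting of a single undef.
sub looks_like_list_failure {
    my @ret = @_;
    return 1 if @ret == 0;
    return 1 if @ret == 1 && !defined $ret[0];
    return 0;
}

my $scalar = legacy_copy();    # 0
my @list   = legacy_copy();    # (0), a one-element list

printf "scalar failure detected: %s\n",
    looks_like_scalar_failure($scalar) ? 'yes' : 'no';
printf "list failure detected:   %s\n",
    looks_like_list_failure(@list) ? 'yes' : 'no';
```

The scalar check catches the failure, while the list check lets (0) through, which is exactly the gap that hints are meant to close.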

As this example shows, autodie can fail to detect failure when it occurs in unexpected ways. The solution is to provide a hinting mechanism, where not only can a subroutine be made autodying, but hints can be provided to describe what it considers to be a failure.

To keep the autodie interface clean, and to ensure nobody has to repeat themselves, these hints are provided out-of-band. So when you see a piece of code like:

use File::Copy qw(copy);
use autodie qw(copy);

autodie will look up the hints table for File::Copy::copy and check to see which conditions indicate failure. A table of these hints for common modules will be included in the next autodie release, but they can also be supplemented by your own code (e.g. use my::autodie::hints), or even built into exporting modules themselves. This all works right now on the hints branch of the source code repository.

Most of the time, end-users will never have to worry about the hinting interface; it's only something that I, module developers, or very eager people will be using. Having said that, I want to make sure I get it right.

Right now, the current hinting interface sucks. You have to do something like:

use Some::Module qw(foo);
use autodie::hints qw( LIST_EMPTY_OR_FALSE SCALAR_UNDEF_ONLY );

autodie::hints->set_hints_for(
    \&foo,
    LIST_EMPTY_OR_FALSE | SCALAR_UNDEF_ONLY
);

Note that we're OR'ing bits together manually to set the hints. That was due to an old idea that came to me on a coffee-deprived tram ride, and which didn't work out. Note that we're also including some big ugly constants just to use them in a single call. I don't like that at all.

So, I'd like to replace the interface. Currently the plan is to have something like this:

use Some::Module qw(foo);
use autodie::hints;

autodie::hints->set_hints_for(
    \&foo,
    qw( LIST_EMPTY_OR_FALSE   SCALAR_UNDEF_ONLY )
);

Here we're passing in a list of strings as hints, which makes error messages nicer (you typed X, did you mean Y?), and avoids having to screw around with bitwise operations. Settings currently have the first word being the context they apply to (scalar/list), and then we list what are considered 'failure' values. The current list is as follows:

SCALAR_ANY_FALSE (default)
SCALAR_UNDEF_ONLY
LIST_EMPTY_OR_UNDEF (default)
LIST_EMPTY_ONLY
LIST_EMPTY_OR_FALSE
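
As a rough illustration of what each name could mean, here is one way to write the hints as a table of predicates. This is only a sketch: %FAILURE_TEST and is_failure are invented for this example, and reading LIST_EMPTY_OR_FALSE as "empty list, or a list holding a single false value" is my assumption rather than something stated above.

```perl
use strict;
use warnings;

# Hypothetical table: hint name => predicate that is true on failure.
# Each predicate receives the subroutine's return value(s) in @_.
my %FAILURE_TEST = (
    SCALAR_ANY_FALSE    => sub { !$_[0] },
    SCALAR_UNDEF_ONLY   => sub { !defined $_[0] },
    LIST_EMPTY_OR_UNDEF => sub { @_ == 0 || ( @_ == 1 && !defined $_[0] ) },
    LIST_EMPTY_ONLY     => sub { @_ == 0 },
    # Assumed meaning: empty list, or a single false element.
    LIST_EMPTY_OR_FALSE => sub { @_ == 0 || ( @_ == 1 && !$_[0] ) },
);

# Look up a hint by name and apply it to a set of return values.
# Unknown names die, which is where a "did you mean Y?" message could go.
sub is_failure {
    my ( $hint, @ret ) = @_;
    my $test = $FAILURE_TEST{$hint}
        or die "Unknown hint '$hint'\n";
    return $test->(@ret) ? 1 : 0;
}

# The File::Copy case: a list containing a single zero.
print "default list hint: ", is_failure( 'LIST_EMPTY_OR_UNDEF', 0 ), "\n";
print "EMPTY_OR_FALSE:    ", is_failure( 'LIST_EMPTY_OR_FALSE', 0 ), "\n";
```

Because the hints are plain strings, a typo is caught at lookup time rather than turning into a silently wrong bit mask.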

Note that some things are missing from the list. My goal is to have autodie work with the most common legacy ways of signalling errors, not every possible wacky scheme imaginable. Also note that the hinting mechanism will never be used for Perl's built-in functions; autodie is already aware of their special cases, and no further user intervention is necessary.

So, my question to you, dear monks, is can I do this better? Can the hints be given better names? Is my set_hints_for method particularly unintuitive? How would you expect this to work?

The autodie module is going into the 5.10.1 release of Perl, and as such if I screw things up, it's likely they'll stay screwed for a long time. Any comments, feedback, or questions are appreciated.

Many thanks,

Paul Fenwick
Perl Training Australia


Replies are listed 'Best First'.
Re: RFC: User subroutine hinting interface for autodie by chromatic (Archbishop) on Mar 04, 2009 at 08:36 UTC
Did you consider the use of subroutine attributes? One drawback is that the creator of the function has to set them. One benefit is that the creator of the function can set them.
Re^2: RFC: User subroutine hinting interface for autodie by pjf (Curate) on Mar 04, 2009 at 10:14 UTC
To me, the autodie pragma is about fixing the past. Having to check subroutines for errors is a tiresome process, and inattention to detail rapidly results in bugs. Using `autodie` gives me a way for subroutines to work the way I feel they should: by throwing exceptions on failure. It's easily applied to existing code in lexically sized chunks, so the past can be fixed one block at a time.

However in order for `autodie` to be able to fix the past, it can't depend upon it to change. I want to be able to `use autodie` and not have to worry about if my system has a new or old version of `File::Copy`, or `File::Compare`, or `DarkPAN::BallOfMud`. Even if I could change all those modules, some of them are only in the core. Even if the core releases every three months, it will still take too long for them to reach my clients and my code.

I could provide an attributes interface. It would even look quite elegant, but I'm not sure it can be used for good. If new code is being written, I feel it shouldn't try and cling to the old ideas of returning funny (and easily ignored) values on failure; it should be throwing proper exceptions. An attributes interface helps support the old and crusty ways when writing new code, but doesn't help fix the old and crusty code that's already out there.

I'm very happy with the idea of a separate module that reads subroutine attributes and sets hints appropriately. It should be quite easy to write. However I don't believe that it belongs in the core `autodie` distribution (and hence ultimately in 5.10.1).

All the best,

Paul Fenwick
Perl Training Australia
Re: RFC: User subroutine hinting interface for autodie (terse, unhidden) by tye (Sage) on Mar 05, 2009 at 07:53 UTC
I don't like the current hinting system nor your proposed improvement that hides the information and relies on a cabal to pre-compute the hints so that the unwashed masses never have to see it, know about it, or use it. I think such hidden information has a high probability of being incomplete and inconsistent, such that users become very surprised why something works fine on one system and looks identical and doesn't work on some other system.

But I think you can greatly simplify the hints to the point that making them visible is not a burden. I would define the following defaults:

In a scalar context, undef means failure.
In a list context, empty list means failure.

And I would define the following options:

This function always returns just one scalar, even when used in a list context (represented by `$`, the sigil for scalar values in Perl 5)
Empty string (including undef) means failure (represented by `"`, not only unambiguously "string"y in Perl but also visually similar to `''`, the empty string)
'False' means failure (represented by `!`, logical negation)

Then you can quite concisely represent any meaningful combination of those options:

use autodie qw( ReadLine Copy!$ NextKeyword! GetParam$ GetCount"$ NextWord" );

Where ReadLine() returns an empty list on failure, undef in a scalar context. Copy() always returns a scalar which is simply false in the case of failure. A successful NextKeyword() returns either a non-empty list or (in scalar context) a string that is never `''` nor `'0'`. GetParam() does `return undef;` on failure. GetCount() returns an empty string for failure. Finally, a successful NextWord() returns either a non-empty list or (in scalar context) a string that is not `''`.

As for pre-computing hints, I would require that the user request the pre-computed hint so that, if the current system lacks that particular hint, the user can be made aware of the problem. I'd probably use a '?' to indicate that the user doesn't claim to know the answer and asks that the module author provide it.

- tye
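
For anyone wanting to experiment with the suffix notation proposed in this reply, a parser for it could be as small as the sketch below. Everything here is hypothetical: parse_spec and its field names are invented for illustration, and autodie does not accept these suffixes.

```perl
use strict;
use warnings;

# Hypothetical parser for the suffix notation:
#   $  always returns a single scalar, even in list context
#   "  empty string (including undef) means failure
#   !  any false value means failure
#   ?  the user asks for a pre-computed hint to be supplied
sub parse_spec {
    my ($spec) = @_;
    my ( $name, $flags ) = $spec =~ /\A(\w+)([\$"!?]*)\z/
        or die "Cannot parse '$spec'\n";
    return {
        name          => $name,
        always_scalar => ( $flags =~ /\$/ ? 1 : 0 ),
        fail_on_empty => ( $flags =~ /"/  ? 1 : 0 ),
        fail_on_false => ( $flags =~ /!/  ? 1 : 0 ),
        need_hint     => ( $flags =~ /\?/ ? 1 : 0 ),
    };
}

for my $spec (qw( ReadLine Copy!$ GetCount"$ )) {
    my $parsed = parse_spec($spec);
    printf "%-10s scalar=%d empty=%d false=%d\n",
        @{$parsed}{qw( name always_scalar fail_on_empty fail_on_false )};
}
```

A name with no suffix falls back to the defaults given in the reply: undef in scalar context, empty list in list context.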
Re^2: RFC: User subroutine hinting interface for autodie (terse, unhidden) by pjf (Curate) on Mar 05, 2009 at 12:27 UTC
Tye, this is excellent, and is exactly the sort of discussion and thoughts I was after. I really appreciate the feedback, even though I'm about to argue against most of it. ;)

I agree that having secret hints is going to result in some bad surprises. Those surprises are going to be especially bad when a production system lags behind the development system, and an otherwise correct piece of code suddenly and silently becomes incorrect. That's a really great argument for getting hints out in the open.

However I don't think that users providing their own hints is a good idea. For starters, it means that users need to know much more about the subroutines they're using than they should. I shouldn't need to care about how `File::Copy` signals failure if I've delegated that task to `autodie`. I certainly don't want to provide hints every time I invoke the pragma, since that gives me extra chances to get them wrong.

As a lazy consumer, what I do need to know is that when I've delegated something to `autodie`, it's up to the task. In this sense, I like your idea of using '?' or some other marker to indicate that hints should exist for a given subroutine; I can trigger a compile-time error if they don't.

The other thing I can do is to have autodie insist that all autodying user subroutines have hints, and introduce a marker to indicate that a subroutine is allowed to use the defaults and be hint-free. I really like this, since it makes sure there are no surprises, and it keeps the common syntax both clean and safe. Unfortunately, it also breaks backwards compatibility with existing uses of the module. That's bad.

In any case, a way of signalling that we must know hints for a user-sub is essential for any interface that `autodie` makes public.

Thanks again for the input, it's hugely appreciated!

Paul Fenwick
Perl Training Australia
Re: RFC: User subroutine hinting interface for autodie by Limbic~Region (Chancellor) on Mar 04, 2009 at 18:00 UTC
pjf,

"So, my question to you, dear monks, is can I do this better? Can the hints be given better names? Is my set_hints_for method particularly unintuitive? How would you expect this to work?"

I doubt I will ever use autodie so take that into consideration in reading the rest of my response.

use Some::Module qw(foo);
use autodie::hints;

autodie::hints->set_hints_for(
    \&foo,
    qw( LIST_EMPTY_OR_FALSE SCALAR_UNDEF_ONLY )
);

What exactly am I setting? Is there an implied "The following list are failure indications"? In other words, would a more intuitive interface be `sub => \&foo, fail => qw//`?

You mentioned the issue was context - have you considered other contexts (Want and/or Contextual::Return) - lvalue subs could be interesting?

What about letting the user define a new hint? Let's say I have a function that returns a SQL code. Some of these codes are errors and some of them aren't but only a lookup table will allow me to define this. Perhaps someone wants to duplicate `system` and have anything other than 0 indicate failure.

Cheers - L~R
Re^2: RFC: User subroutine hinting interface for autodie by pjf (Curate) on Mar 05, 2009 at 12:54 UTC
"What exactly am I setting? Is there an implied "The following list are failure indications"? In other words, would a more intuitive interface be `sub => \&foo, fail => qw//`?"

My apologies for the lack of clarity in my original post. This is indeed an implied "here is a list of failure indications", as autodie's job is to throw exceptions automatically on failure.

"[What about] Want or Contextual::Return?"

I can say straight up that subroutines that use `Want` or `Contextual::Return` may cause headaches in combination with `autodie`. Those modules do very clever things when it comes to examining context, and autodie's intercept-and-inspect code may result in subtle changes. Autodie can't leverage their cleverness in any useful way, since (with one exception) it promises not to muck with a subroutine's return value(s).

"What about letting the user define a new hint. Let's say I have a function that returns a SQL code. Some of these codes are errors and some of them aren't but only a lookup table will allow me to define this."

Yesterday, my response would have been that this is beyond the scope of what `autodie` is intended to do, which is fundamentally to remove the need to write `or die...` after far too many subroutine calls.

Today, after receiving a wonderfully in-depth e-mail from TheDamian, I'm fairly convinced that people will still try and use `autodie` for situations like you've just described, which has had me do a lot more thinking about what hints are applicable.

While I don't have everything in concrete yet, you can expect the final hints interface to allow one to pass a subroutine reference that can inspect the return and indicate if an exception should be thrown. That covers the situation of checking SQL return values in a table, or having subroutines that only fail if it's raining.

Many thanks again for the feedback and thoughts,

Paul Fenwick
Perl Training Australia
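
To show the idea of a subroutine reference that inspects the return value, here is a hand-rolled sketch. It is not the autodie interface (which, as said above, isn't settled): die_on_failure, run_query and the error codes in %IS_ERROR are all made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of autodie: returns a wrapper around $code
# that dies whenever $is_failure says the returned values mean failure.
sub die_on_failure {
    my ( $name, $code, $is_failure ) = @_;
    return sub {
        my $list_context = wantarray;    # context the wrapper was called in

        # Call the original in list context only if our caller did;
        # void context is checked the same way as scalar context.
        my @ret = $list_context ? $code->(@_) : scalar $code->(@_);

        die "$name failed\n" if $is_failure->(@ret);

        # Hand the value(s) back untouched.
        return $list_context ? @ret : $ret[0];
    };
}

# Hypothetical lookup table of status codes that count as errors.
my %IS_ERROR = map { $_ => 1 } qw( 1045 1146 );

# Stand-in for a real query routine: it just hands back the code it is given.
sub run_query { my ($code) = @_; return $code }

my $checked_query = die_on_failure(
    'run_query',
    \&run_query,
    sub { defined $_[0] && exists $IS_ERROR{ $_[0] } },
);

my $status = $checked_query->(0);                    # not in the table
my $died   = !eval { $checked_query->(1146); 1 };    # in the table
print "status=$status died=", ( $died ? 'yes' : 'no' ), "\n";
```

The predicate sees only the return values here; anything that also needs the original arguments would need a richer calling convention.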
Re: RFC: User subroutine hinting interface for autodie by ambrus (Abbot) on Mar 05, 2009 at 07:20 UTC
Actually I think named constants are better. Strict provides a useful diagnostic for you if you mistype any of them, so the module doesn't have to do anything. And if you don't like bitwise or, you can make your method accept a list, like

autodie::hints->set_hints_for(
    \&foo,
    LIST_EMPTY_OR_FALSE, SCALAR_UNDEF_ONLY
);
Re: RFC: User subroutine hinting interface for autodie by lodin (Hermit) on Mar 07, 2009 at 17:50 UTC
I like the subroutine idea for specifying hints. `@_` could be the return values, and `$_` aliased to `$_[0]`. Perhaps something like

autodie::hints->set_hints_for(
    \&foo,
    scalar => sub { ! $_ },
    list   => sub { ! @_ or @_ == 1 and ! defined },
);
autodie::hints->set_hints_for(
    \&copy,
    any => sub { ! $_ },
);
autodie::hints->set_hints_for(
    'open',
    any => sub { ! defined },
);

would work. This makes it easy to define common behaviours and to compose them in an arbitrary way.

use autodie::hints qw/ DEFAULT_SCALAR DEFAULT_LIST /;
autodie::hints->set_hints_for(
    \&foo,
    scalar => DEFAULT_SCALAR,
    list   => DEFAULT_LIST,
);

use autodie::hints qw/ COND1 COND2 COND3 /;
autodie::hints->set_hints_for(
    \&foo,
    any => sub { &COND1 and &COND2 or &COND3 },
);

If access to e.g. the arguments to the function is needed (for e.g. `unlink`) then that would have to be provided some other way, for instance via a localized global variable in `autodie::hints`.

Just an idea ...

lodin
Re: RFC: User subroutine hinting interface for autodie by Herkum (Parson) on Mar 04, 2009 at 19:53 UTC
Why not use Contextual::Return so that you control what to return based upon whether it is being used in a LIST or SCALAR context?
Re^2: RFC: User subroutine hinting interface for autodie by pjf (Curate) on Mar 05, 2009 at 13:11 UTC
autodie doesn't change what a subroutine returns, it only causes subroutines to throw an exception if that subroutine would have returned in failure.

The exception to this is autodying `system`, which returns the exit value, rather than `$?`. If the command doesn't get around to exiting (because it's killed by a signal, or failure to start), then it doesn't return, it throws an exception instead.

Paul Fenwick
Perl Training Australia
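
For readers who haven't pulled `$?` apart by hand, the difference between the exit value and the raw status looks roughly like this. The decoding follows the rules in perldoc -f system; decode_status is an invented helper for illustration and is not how autodie itself is written.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw wait status (as found in $?) into
# the three cases mentioned above.
sub decode_status {
    my ($status) = @_;
    return { failed_to_start => 1 }        if $status == -1;
    return { signal => $status & 127 }     if $status & 127;
    return { exit_value => $status >> 8 };
}

# A command that exits with value 1 leaves 256 in $?, not 1.
print "exit value: ", decode_status(256)->{exit_value}, "\n";

# A command killed by signal 9 never produced an exit value at all.
print "signal:     ", decode_status(9)->{signal}, "\n";
```

Only the first case has a value worth returning, which fits with the other two being reported as exceptions.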
Re: RFC: User subroutine hinting interface for autodie by Anonymous Monk on Jul 02, 2009 at 09:02 UTC
You know what would be nice? A built in log/trace option, something like

#!/usr/bin/perl --
use strict;
use warnings;
use autodie 2.01;

use autodie 'log'; # default logger
use autodie log => sub { # custom logger
    no warnings 'uninitialized';
    use POSIX();
    use Carp();
    use Scalar::Quote();
    my ( $func, @args ) = @_;
    (@_) = (
        POSIX::strftime( '%Y-%m-%d %H:%M:%S ', localtime ),
        "$func( ",
        join( ', ', map { Scalar::Quote::quote($_) } @args ),
        " )"
    );
    goto &Carp::carp;
};

open my($in), '<', __FILE__;
close $in;
system $^X, qw[ -le print(66) ];
systemx $^X, qw[ -le die(66) ];
__END__
2009-07-02 01:45:45 main::open( undef, '<', 'test.pl' ) at test.pl line 23
2009-07-02 01:45:45 main::close( 'GLOB(0x182f970)' ) at test.pl line 24
2009-07-02 01:45:45 main::system( "C:\\Perl\\bin\\perl.exe", '-le', 'print(66)' ) at test.pl line 25
2009-07-02 01:45:45 main::system( "C:\\Perl\\bin\\perl.exe", '-le', 'die(66)' ) at test.pl line 26
66 at -e line 1.
"C:\Perl\bin\perl.exe" unexpectedly returned exit value 255 at test.pl line 26

For now I'm getting by with `use Devel::TraceMethods ( main => sub ... );` but would really love `use autodie 'log';`
Re^2: RFC: User subroutine hinting interface for autodie by Anonymous Monk on Jul 02, 2009 at 14:17 UTC
Devel::TraceMethods wraps too many calls, so trying to override with

use Devel::TraceCalls { Subs => [ qw! system ! ] };

fails with

Subroutine main::system not defined
CHECK failed--call queue aborted.

Something to do with CHECK blocks. This

use Devel::TraceCalls { Package => 'main'};

works, but it wraps too much. Managed to get it working with Sub::Prepend

BEGIN{
    use Sub::Prepend 'prepend';
    BEGIN {
        for my $name( qw[ system ] ){
            prepend "$name" => sub{_logzy($name,@_)};
        }
    }
}

having this built into autodie would be so much easier for novices :)