http://www.perlmonks.org?node_id=710640

vrk has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I would like to capture the error output of Parse::RecDescent to a variable instead of printing it to STDERR. There does not seem to be any way of doing this easily; there is no documentation for it, and viewing the source code reveals that

  1. the module dups STDERR into its own filehandle called ERROR, and
  2. the module defines a format called ERROR for printing to the filehandle.

Since the ERROR filehandle is defined at module use time, it is necessary to do STDERR munging in a BEGIN block, thusly:

    my $ParseError;

    BEGIN {
        open(my $olderr, '>&', STDERR) or die "Cannot dup STDERR: $!";
        close STDERR;
        open(STDERR, '>', \$ParseError) or die "Cannot open in-memory file: $!";
        select STDERR; $| = 1;

        # Cannot use, since that messes up execution order for some reason (since use
        # implies a BEGIN block itself?).
        require Parse::RecDescent;

        close STDERR;
        open(STDERR, '>&', $olderr) or die "Cannot restore STDERR: $!";
    }

    # Then later:
    my $p = Parse::RecDescent->new($grammar) or die "Invalid grammar";
    my $output = $p->startrule($str) or die $ParseError;

However, fine as it is though hairy, this will not produce any content in $ParseError, which will stay undefined. If I replace the STDERR re-open with an opening of a file, say

open(STDERR, '>', '/tmp/error') or die $!;

I do get the error strings in /tmp/error!

Is the cause of the problem the use of format in Parse::RecDescent, or some other subtle things I'm missing? An issue with buffering? Or is the approach fundamentally flawed? I guess I could open an anonymous temporary file somehow, redirect STDERR to that, then read in the contents of the file, but surely there is a better way. (This is on Perl 5.8.8 on Linux, x86_64, if that's relevant.)

--
print "Just Another Perl Adept\n";

Replies are listed 'Best First'.
Re: How to grab Parse::RecDescent error output in a variable?
by ikegami (Patriarch) on Sep 12, 2008 at 10:20 UTC

    However, fine as it is though hairy, this will not produce any content in $ParseError

    open '>&' results in a dup system call, so it only works on file handles that have a file descriptor. File handles to scalars aren't real file handles, so the OS doesn't know about them, so the OS can't dup them.
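A minimal sketch of the point about file descriptors, using only core Perl. The variable names are illustrative; the only behaviour relied on is that `fileno` returns -1 for a handle opened on a scalar reference, because no OS-level descriptor backs it.

```perl
use strict;
use warnings;

# An in-memory handle: PerlIO writes straight into $buffer, the OS is not involved.
open(my $mem_fh, '>', \my $buffer) or die "Cannot open in-memory handle: $!";

# fileno reports -1 when there is no real descriptor behind the handle.
my $mem_fd  = fileno($mem_fh);

# STDERR is backed by a real descriptor (normally 2).
my $real_fd = fileno(STDERR);

print "in-memory handle fileno: $mem_fd\n";
print "STDERR fileno:           $real_fd\n";
```

Anything that needs a real descriptor, such as a dup at the system-call level, has nothing to work with in the first case.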

    Since the ERROR filehandle is defined at module use time, it is necessary to do STDERR munging in a BEGIN block

    Actually, it occurs at module execution time. The module is only executed the first time the module is required.
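To illustrate "only executed the first time", here is a small sketch with a throwaway module. `DemoOnce` and `$load_count` are made-up names for this example; the module is written to a temporary directory so the script is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

our $load_count = 0;

# Write a tiny module whose body bumps a counter each time it is executed.
my $dir = tempdir(CLEANUP => 1);
open(my $out, '>', "$dir/DemoOnce.pm") or die "Cannot write module: $!";
print {$out} "package DemoOnce;\n\$main::load_count++;\n1;\n";
close($out) or die "Cannot close module file: $!";

unshift @INC, $dir;

# The second require finds DemoOnce.pm in %INC and does not run the file again.
require DemoOnce;
require DemoOnce;

print "module body ran $load_count time(s)\n";
```

The same applies to any top-level statement in a module, such as one that dups STDERR: it runs once, when the file is first loaded.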

    Anyway, it would be cleaner to re-open Parse::RecDescent::ERROR.

        use Parse::RecDescent qw( );

        ...

        open(local *Parse::RecDescent::ERROR, '>', \my $error);
        my $p = Parse::RecDescent->new($grammar) or die "Invalid grammar";
        my $output = $p->startrule($str) or die $error;

    This method

    • provides better encapsulation,
    • will allow you to capture error output to a scalar since the dup is taken out of the equation, and
    • should be thread safe if you capture to a scalar (since no system file handles are involved).

    since use implies a BEGIN block itself?

    Did you not read use at that point?
    use Parse::RecDescent;
    is equivalent to
    BEGIN { require Parse::RecDescent; Parse::RecDescent->import(); }

    And yes, BEGIN blocks can be nested. A BEGIN is executed as soon as it is fully compiled, so the following prints "abcde":

        print("d");
        BEGIN {
            print("b");
            BEGIN {
                print("a");
            } # inner BEGIN executed after this line is compiled
            print("c");
        } # outer BEGIN executed after this line is compiled
        print("e");
        # outermost scope executed after this line is compiled

      About the use of require in this example... Here is a complete working example that captures Parse::RecDescent output in a dup'd anonymous filehandle, which shows that use and require are not equivalent in this case.

      Anyway, it would be cleaner to re-open Parse::RecDescent::ERROR.

      Yes, it definitely would be. This is what I tried first. Here is a complete example (did you try to run your own example code?):

          use strict;
          use Parse::RecDescent;

          sub parse {
              my ($grammar, $str) = @_;

              open(local *Parse::RecDescent::ERROR, '>', \my $error)
                  or die "Cannot open in-memory filehandle: $!";

              local $::RD_ERROR = 1;
              local $::RD_WARN  = 2;

              my $p = Parse::RecDescent->new($grammar) or die "Grammar is invalid";
              my $x = $p->start($str);
              defined $x or die $error;

              return $x;
          }

          print parse('start: /foo/ | <error>', 'fo'), "\n";

      Result:

          $ perl recdescent3.pl
          Undefined format "Parse::RecDescent::ERROR" called at /usr/share/perl5/Parse/RecDescent.pm line 2910.

      Clearly format has a side-effect that prevents the use of the nice solution this time. Besides that, your re-open of ERROR requires knowledge of the package internals, while redirecting STDERR requires arguably less knowledge, and certainly not the name of a private (albeit package global) variable.

      --
      say "Just Another Perl Hacker";

        Regardless of what the documentation says about use being equivalent to require Module; import Module;, this example shows that it is not the case this time.

        Not true. The documentation doesn't say that. Both the documentation and I said use Module; is equivalent to

        BEGIN { require Module; import Module; }

        And that is clearly the case this time.

        Besides that, your re-open of ERROR requires knowledge of the package internals

        So does knowing STDERR is duped at execution time. Besides, the benefits far outweigh the drawbacks. Well, if it had worked.

        Benefits
        • Works with scalar handles.
        • Much cleaner code in the caller.
        • In fact, one can arrange to have no extra code in the caller since it can be placed in the "grammar".
        • Works even if use Parse::RecDescent; is executed twice.
        • Works with threads.
        Drawbacks
        • Minor reliance on stable PRD guts.
        • Doesn't work. Oops!

        Clearly format has a side-effect that prevents the use of the nice solution this time.

        Ah dang! You could issue the format on the new handle, but that's going pretty far into the innards.

        Curious; as it happens I had to do this for a project I was working on some months back, and came to the same conclusion as ikegami. My code reads thusly (copied and pasted):
            open( *Parse::RecDescent::ERROR, '>', \(my $parse_error) )
                or croak("Error: unable to redirect STDERR.");

            $Parse::RecDescent::skip = ' *\x{0} *';
            $::RD_ERRORS++;
            $::RD_WARN++;
            $::RD_HINT++;

            my $parser = Parse::RecDescent->new($grammar) or die("Bad grammar!");
        I can assure you that does actually work for me (perl 5.8.8, P::RD 1.94). My best guess is that your use of local on that open call is maybe creating problems?

        The dependency on P::RD internals always bothered me as well, but I never got around to figuring out a better way to do this.

        Cheers, Tim

        Update: Yup, turns out if you remove that local then it works:
            use strict;
            use Parse::RecDescent;

            sub parse {
                my ($grammar, $str) = @_;

                open(*Parse::RecDescent::ERROR, '>', \my $error)
                    or die "Cannot open in-memory filehandle: $!";

                local $::RD_ERROR = 1;
                local $::RD_WARN  = 2;

                my $p = Parse::RecDescent->new($grammar) or die "Grammar is invalid";
                my $x = $p->start($str);
                defined $x or die "CAPTURED: $error";

                return $x;
            }

            print parse('start: /foo/ | <error>', 'fo'), "\n";
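A plausible explanation for the difference, sketched without Parse::RecDescent: `local` on a glob temporarily replaces all of its slots, including the format slot, so a `write` to the localized handle no longer finds the format declared under that name. The `DEMO` handle, the `DEMO` format and the helper subs below are stand-ins invented for this example; this has not been checked against the module's own code.

```perl
use strict;
use warnings;

our $message = '';

# A format attached to the DEMO glob, standing in for the module's ERROR format.
format DEMO =
@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$message
.

# Write one message through the DEMO handle using the DEMO format.
sub emit {
    $message = shift;
    write DEMO;
}

# Re-open DEMO without local: the glob's format slot is left alone.
sub capture_plain {
    my ($text) = @_;
    open(DEMO, '>', \my $buf) or die "Cannot open in-memory handle: $!";
    emit($text);
    close DEMO;
    return $buf;
}

# Re-open DEMO under local: the whole glob is swapped out, format included.
sub capture_local {
    my ($text) = @_;
    my $buf;
    local *DEMO;
    open(DEMO, '>', \$buf) or die "Cannot open in-memory handle: $!";
    my $ok  = eval { emit($text); 1 };
    my $err = $@;
    close DEMO;
    return $ok ? $buf : "failed: $err";
}

print "plain: ", capture_plain('oops');
print "local: ", capture_local('oops');
```

The trade-off is that the version without `local` leaves the handle re-pointed after the sub returns, so the caller has to restore it if the original destination matters.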
Re: How to grab Parse::RecDescent error output in a variable?
by vrk (Chaplain) on Sep 12, 2008 at 08:44 UTC

    Answering my own question, here is one way to do it:

        my $ParseErrorFh;

        BEGIN {
            open(my $olderr, '>&', STDERR) or die "Cannot dup STDERR: $!";
            close STDERR;
            open(STDERR, '+>', undef) or die "Cannot open anonymous file: $!";
            select STDERR; $| = 1;

            open($ParseErrorFh, '>&STDERR') or die "Cannot dup anonymous file: $!";

            # Cannot use, since that messes up execution order for some reason (since use
            # implies a BEGIN block itself?).
            require Parse::RecDescent;

            close STDERR;
            open(STDERR, '>&', $olderr) or die "Cannot restore STDERR: $!";
        }

        sub report_error {
            seek($ParseErrorFh, 0, 0);
            die join '', grep { $_ !~ m/^\s*$/ } <$ParseErrorFh>;
        }

        sub parse {
            my ($grammar, $str) = @_;

            seek($ParseErrorFh, 0, 0); # To ensure it's not growing indefinitely, as it is package global.
            my $p = Parse::RecDescent->new($grammar) or report_error();

            seek($ParseErrorFh, 0, 0);
            # "return EXPR or ..." never reaches the right-hand side, so check first.
            my $result = $p->startrule($str);
            defined $result or report_error();
            return $result;
        }

    Of course, this is not thread-safe, and it is also quite ugly. Other solutions?
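For reference, the save, redirect and restore steps can be folded into one helper. This is only a sketch of the anonymous-temp-file mechanics: `capture_stderr` is a hypothetical name, it is no more thread-safe than the code above, and it only sees output written to STDERR while the callback runs. A module that dup'd STDERR when it was first loaded keeps writing to its own copy, so for Parse::RecDescent the redirection still has to be in place at load time.

```perl
use strict;
use warnings;
use IO::Handle;

# Run $code with STDERR pointed at an anonymous temporary file,
# then put STDERR back and return whatever was written.
sub capture_stderr {
    my ($code) = @_;

    open(my $saved, '>&', \*STDERR) or die "Cannot dup STDERR: $!";
    open(my $tmp, '+>', undef)      or die "Cannot open anonymous file: $!";
    open(STDERR, '>&', $tmp)        or die "Cannot redirect STDERR: $!";

    my $ok  = eval { $code->(); 1 };
    my $err = $@;

    # Flush before restoring so nothing is left in a buffer.
    STDERR->flush;
    open(STDERR, '>&', $saved) or die "Cannot restore STDERR: $!";

    # The dup shares its file position with $tmp, so rewind before reading.
    seek($tmp, 0, 0) or die "Cannot rewind anonymous file: $!";
    my $text = do { local $/; <$tmp> };

    die $err unless $ok;
    return defined $text ? $text : '';
}

my $captured = capture_stderr(sub { print STDERR "boom\n" });
print "captured: $captured";
```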

    --
    say "Just Another Perl Hacker";