w-ber has asked for the wisdom of the Perl Monks concerning the following question:
Dear monks,
I would like to capture the error output of Parse::RecDescent to a variable instead of printing it to STDERR. There does not seem to be any way of doing this easily; there is no documentation for it, and viewing the source code reveals that
- the module dups STDERR into its own filehandle called ERROR, and
- the module defines a format called ERROR for printing to the filehandle.
Since the ERROR filehandle is defined at module use time, it is necessary to do STDERR munging in a BEGIN block, thusly:
my $ParseError;
BEGIN {
    open(my $olderr, '>&', STDERR) or die "Cannot dup STDERR: $!";
    close STDERR;
    open(STDERR, '>', \$ParseError) or die "Cannot open in-memory file: $!";
    select STDERR; $| = 1;
    # Cannot use "use" here, since that messes up execution order for some
    # reason (since use implies a BEGIN block itself?).
    require Parse::RecDescent;
    close STDERR;
    open(STDERR, '>&', $olderr) or die "Cannot restore STDERR: $!";
}
# Then later:
my $p = Parse::RecDescent->new($grammar) or die "Invalid grammar";
my $output = $p->startrule($str) or die $ParseError;
However, fine as it is though hairy, this will not produce any content in $ParseError, which will stay undefined. If I replace the STDERR re-open with an opening of a file, say
open(STDERR, '>', '/tmp/error') or die $!;
I do get the error strings in /tmp/error!
Is the cause of the problem the use of format in Parse::RecDescent, or some other subtle things I'm missing? An issue with buffering? Or is the approach fundamentally flawed? I guess I could open an anonymous temporary file somehow, redirect STDERR to that, then read in the contents of the file, but surely there is a better way. (This is on Perl 5.8.8 on Linux, x86_64, if that's relevant.)
Re: How to grab Parse::RecDescent error output in a variable? by w-ber (Hermit) on Sep 12, 2008 at 08:44 UTC |
Answering my own question, here is one way to do it:
my $ParseErrorFh;
BEGIN {
    open(my $olderr, '>&', STDERR) or die "Cannot dup STDERR: $!";
    close STDERR;
    open(STDERR, '+>', undef) or die "Cannot open anonymous file: $!";
    select STDERR; $| = 1;
    open($ParseErrorFh, '>&STDERR') or die "Cannot dup anonymous file: $!";
    # Cannot use "use" here, since that messes up execution order for some
    # reason (since use implies a BEGIN block itself?).
    require Parse::RecDescent;
    close STDERR;
    open(STDERR, '>&', $olderr) or die "Cannot restore STDERR: $!";
}
sub report_error {
    seek($ParseErrorFh, 0, 0);
    die join '', grep { $_ !~ m/^\s*$/ } <$ParseErrorFh>;
}
sub parse {
    my ($grammar, $str) = @_;
    # To ensure it's not growing indefinitely, as it is package global.
    seek($ParseErrorFh, 0, 0);
    my $p = Parse::RecDescent->new($grammar) or report_error();
    seek($ParseErrorFh, 0, 0);
    # "return EXPR or ..." returns before the "or" branch can run, so test first.
    my $result = $p->startrule($str);
    defined $result or report_error();
    return $result;
}
Of course, this is not thread-safe, and it is also quite ugly. Other solutions?
Re: How to grab Parse::RecDescent error output in a variable? by ikegami (Pope) on Sep 12, 2008 at 10:20 UTC |
However, fine as it is though hairy, this will not produce any content in $ParseError
open '>&' results in a dup system call, so it only works on file handles that have a file descriptor. File handles to scalars aren't real file handles, so the OS doesn't know about them, so the OS can't dup them.
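The point about file descriptors can be checked directly. The following is a minimal sketch, not taken from the thread: the variable names are illustrative, and it assumes a perl built with PerlIO (5.8 or later). It asks fileno whether an in-memory handle has an OS-level descriptor, then attempts the same kind of dup that '>&' performs and simply reports the outcome.

```perl
use strict;
use warnings;

# Open a filehandle that writes into a scalar instead of a file.
open(my $mem, '>', \my $buffer) or die "Cannot open in-memory file: $!";

# fileno reports -1 (or undef on some older perls) when there is no
# OS-level descriptor behind the handle.
my $mem_fd     = fileno($mem);
my $mem_has_fd = (defined $mem_fd && $mem_fd >= 0) ? 1 : 0;

my $err_fd = fileno(STDERR);

print "in-memory handle has a descriptor: ", ($mem_has_fd ? "yes" : "no"), "\n";
print "STDERR descriptor: ", (defined $err_fd ? $err_fd : 'none'), "\n";

# Attempt the same kind of dup that '>&' performs and report what happens.
if (open(my $dup, '>&', $mem)) {
    print "dup of the in-memory handle succeeded\n";
    close $dup;
} else {
    print "dup of the in-memory handle failed: $!\n";
}
```

If the dup fails here, a module that dups STDERR while STDERR points at a scalar would be left without a usable copy, which is consistent with $ParseError staying undefined.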
Since the ERROR filehandle is defined at module use time, it is necessary to do STDERR munging in a BEGIN block
Actually, it occurs at module execution time. The module is only executed the first time the module is required.
Anyway, it would be cleaner to re-open Parse::RecDescent::ERROR.
use Parse::RecDescent qw( );
...
open(local *Parse::RecDescent::ERROR, '>', \my $error);
my $p = Parse::RecDescent->new($grammar) or die "Invalid grammar";
my $output = $p->startrule($str) or die $error;
This method
- provides better encapsulation,
- will allow you to capture error output to a scalar since the dup is taken out of the equation, and
- should be thread safe if you capture to a scalar (since no system file handles are involved).
since use implies a BEGIN block itself?
Did you not read use at that point?
use Parse::RecDescent;
is equivalent to
BEGIN { require Parse::RecDescent; Parse::RecDescent->import(); }
And yes, BEGIN blocks can be nested. A BEGIN is executed as soon as it is fully compiled, so the following prints "abcde":
print("d");
BEGIN {
    print("b");
    BEGIN {
        print("a");
    } # inner BEGIN executed after this line is compiled
    print("c");
} # outer BEGIN executed after this line is compiled
print("e");
# outermost scope executed after this line is compiled
About the use of require in this example... Here is a complete
working example that captures Parse::RecDescent output in a dup'd anonymous
filehandle, which shows that use and require are
not equivalent in this case.
Anyway, it would be cleaner to re-open
Parse::RecDescent::ERROR.
Yes, it definitely would be. This is what I tried first. Here is a complete
example (did you try to run your own example code?):
use strict;
use Parse::RecDescent;
sub parse {
    my ($grammar, $str) = @_;
    open(local *Parse::RecDescent::ERROR, '>', \my $error)
        or die "Cannot open in-memory filehandle: $!";
    local $::RD_ERRORS = 1;
    local $::RD_WARN = 2;
    my $p = Parse::RecDescent->new($grammar)
        or die "Grammar is invalid";
    my $x = $p->start($str);
    defined $x or die $error;
    return $x;
}
print parse('start: /foo/ | <error>', 'fo'), "\n";
Result:
$ perl recdescent3.pl
Undefined format "Parse::RecDescent::ERROR" called at /usr/share/perl5/Parse/RecDescent.pm line 2910.
Clearly format has a side-effect that prevents the use of the
nice solution this time. Besides that, your re-open of ERROR
requires knowledge of the package internals, while redirecting
STDERR requires arguably less knowledge, and certainly not the
name of a private (albeit package global) variable.
BEGIN { require Module; import Module; }
And that is clearly the case this time.
Besides that, your re-open of ERROR requires knowledge of the package internals
So does knowing STDERR is duped at execution time. Besides, the benefits far outweigh the drawbacks. Well, if it had worked.
- Benefits
  - Works with scalar handles.
  - Much cleaner code in the caller.
  - In fact, one can arrange to have no extra code in the caller since it can be placed in the "grammar".
  - Works even if use Parse::RecDescent; is executed twice.
  - Works with threads.
- Drawbacks
  - Minor reliance on stable PRD guts.
  - Doesn't work. Oops!
Clearly format has a side-effect that prevents the use of the nice solution this time.
Ah dang! You could issue the format on the new handle, but that's going pretty far into the innards.
Curious; as it happens I had to do this for a project I was working on some months back, and came to the same conclusion as ikegami. My code reads thusly (copied and pasted):
open( *Parse::RecDescent::ERROR, '>', \(my $parse_error) )
    or croak("Error: unable to redirect STDERR.");
$Parse::RecDescent::skip = ' *\x{0} *';
$::RD_ERRORS++;
$::RD_WARN++;
$::RD_HINT++;
my $parser = Parse::RecDescent->new($grammar) or die("Bad grammar!");
I can assure you that does actually work for me (perl 5.8.8, P::RD 1.94). My best guess is that your use of local on that open call is maybe creating problems?
The dependency on P::RD internals always bothered me as well, but I never got around to figuring out a better way to do this.
Cheers, Tim
Update: Yup, turns out if you remove that local then it works:
use strict;
use Parse::RecDescent;
sub parse {
    my ($grammar, $str) = @_;
    open(*Parse::RecDescent::ERROR, '>', \my $error)
        or die "Cannot open in-memory filehandle: $!";
    local $::RD_ERRORS = 1;
    local $::RD_WARN = 2;
    my $p = Parse::RecDescent->new($grammar)
        or die "Grammar is invalid";
    my $x = $p->start($str);
    defined $x or die "CAPTURED: $error";
    return $x;
}
print parse('start: /foo/ | <error>', 'fo'), "\n";
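One plausible reading of why dropping local helps, offered as a sketch rather than something checked against the Parse::RecDescent source: local on a glob sets aside every slot of that glob for the duration of the scope, and a format lives in one of those slots. The standalone example below uses a hypothetical handle and format named DEMO (not part of Parse::RecDescent) to show write failing with the same kind of message once the glob is localized.

```perl
use strict;
use warnings;

our $msg = 'hello';

# The format DEMO is stored in the *DEMO glob, next to the DEMO filehandle.
format DEMO =
@<<<<<<<<<
$msg
.

# Re-open DEMO onto a scalar without localizing the glob: the format survives.
sub write_without_local {
    open(DEMO, '>', \my $buf) or die "Cannot open in-memory file: $!";
    write(DEMO);
    close(DEMO);
    return $buf;
}

# Localize the glob first: the format is hidden along with the old handle.
sub write_with_local {
    local *DEMO;
    open(DEMO, '>', \my $buf) or die "Cannot open in-memory file: $!";
    my $ok = eval { write(DEMO); 1 };
    return $ok ? $buf : "write failed: $@";
}

my $plain     = write_without_local();
my $localized = write_with_local();
print "without local: $plain";
print "with local: $localized";
```

If this reading is right, re-opening the handle without local keeps the module's format in place, at the cost of not restoring the original handle when the sub returns.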