Synchronizing STDERR and STDOUT

Ovid has asked for the wisdom of the Perl Monks concerning the following question:

I've a process (Test::Builder, if you must know), which sends data to both STDERR and STDOUT. However, I need to read both and here are the requirements:

1. This must be a non-blocking read.
2. The streams must be read in synch.
3. It must be cross-platform.
4. It must run on the oldest version of Perl 5 possible.
5. It must be pure Perl.

Number 2 seems to require that I use a construct like `$someprocess 2>&1`, but that blocks and I don't know if it's portable. I do know that `open FH, "$someprocess 2>&1 |"` is blowing up on Windows with a 'bad file descriptor' error.

As far as I can tell, the only way to reliably solve this problem is some way of telling the source process to send everything to the same filehandle. Is there some other way of handling this without changing the source process?

Update: And I cannot use any non-core modules. There's a hope, however faint, that this work might eventually make its way into the core, hence this and some of the above requirements.

Cheers,
Ovid

New address of my CGI Course.


Replies are listed 'Best First'.
Re: Synchronizing STDERR and STDOUT by nothingmuch (Priest) on Sep 21, 2006 at 11:27 UTC
FWIW, why is TAPx::Parser wrapping around the stream? Why isn't a plumbing loop pulling from the stream and pushing to the parser? Then you don't necessarily need non-blocking reads etc. at the IO level; you could push that down to POE or whatever.
XML::Parser has a non-blocking interface which I've always liked due to its simplicity: you just push strings when you have them, and it generates events. If you put in a partial string, the parser's state machine will simply stay in that state, waiting for more input.
This way you can have e.g. TAPx::Parser::Harness::Win32, Socket, POE, *Whatever, all reusing the parser without needing to model an iterator API around the various platform-specific quirks.
Update: to clarify that last part: you only truly need non-blocking IO if you need to parse multiple streams simultaneously, and as long as the parser has a push API flexible enough to be reentrant (multiple parsers instantiated simultaneously, each with its own state), there's no reason why it can't deliver callbacks only when it's ready.
Update 2: POE::Filter::XML is written over XML::SAX::Expat::Incremental, which is basically a SAX wrapper for the ExpatNB interface. That might be a nice example.
-nuffin zz zZ Z Z #!perl
Re^2: Synchronizing STDERR and STDOUT by Ovid (Cardinal) on Sep 21, 2006 at 13:39 UTC
A couple of comments: this is for TAPx::Parser, and one of the requirements is to have nothing which prevents it from installing in a fresh Perl install (i.e., no external dependencies), which means POE and friends are out of the question. Even if I make them optional, that doesn't solve my root problem :)
As for the plumbing loop, while I do like that idea, this thing is rapidly becoming far more complex than desired. Too many layers of indirection/abstraction are going to make this unmanageable. Already I have one major design flaw because of this problem and it's slowing down development. I might consider this option if I have no choice, but for now, I just want to pull from a stream.
However, comments that others are making are rapidly convincing me that the synchronization issue can't be solved downstream. I do, however, have ideas on how to solve that little nit.
Cheers,
Ovid
Re^3: Synchronizing STDERR and STDOUT by nothingmuch (Priest) on Sep 21, 2006 at 18:37 UTC
I was suggesting you remove a layer of abstraction: instead of keeping the stream loop inside of TAPx::Parser, let it remain outside, so that the parser doesn't have to worry at all about blocking, synchronization and whatnot.
The POE part is just to demonstrate that this approach (push parsing) works well in more situations than stream-based iterators (that is, streams are usable with push parsers, but not vice versa).
-nuffin
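A minimal sketch of the push-parsing idea, assuming a line-oriented protocol. The package name `Sketch::PushParser`, its `new` and `push_chunk` methods, and the `on_line` callback are hypothetical names chosen for illustration; they are not part of TAPx::Parser or XML::Parser. The caller owns the IO loop and hands over whatever bytes it has, and the parser holds on to an incomplete line until the rest arrives.

```perl
use strict;
use warnings;

package Sketch::PushParser;    # hypothetical name, for illustration only

sub new {
    my ( $class, %args ) = @_;
    my $self = { buffer => '', on_line => $args{on_line} };
    return bless $self, $class;
}

# Accept any chunk of input, complete or not, and fire the callback
# once for every full line seen so far. A trailing partial line stays
# in the buffer until a later chunk completes it.
sub push_chunk {
    my ( $self, $chunk ) = @_;
    $self->{buffer} .= $chunk;
    while ( $self->{buffer} =~ s/\A([^\n]*)\n// ) {
        $self->{on_line}->($1);
    }
    return;
}

package main;

my @lines;
my $parser = Sketch::PushParser->new( on_line => sub { push @lines, shift } );

# The plumbing loop could be blocking reads, select, POE, or anything else;
# here two hand-written chunks split a line in the middle.
$parser->push_chunk($_) for "1..2\nok 1\nnot o", "k 2\n";
print "$_\n" for @lines;
```

Because each parser object carries its own buffer, several of them can be fed from different streams in the same loop without sharing state.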
Re^2: Synchronizing STDERR and STDOUT by nicholasrperez (Monk) on Sep 21, 2006 at 13:48 UTC
You get ten points for pimpin' my code ;)
Re: Synchronizing STDERR and STDOUT by shmem (Chancellor) on Sep 21, 2006 at 10:33 UTC
What about doing that before running Test::Builder - `open(STDERR,'>&', STDOUT);` Is that portable?
--shmem _($_=" "x(1<<5)."?\n".q·/)Oo. G°\ / /\_¯/(q / ---------------------------- \__(m.====·.(_("always off the crowd"))."· ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re^2: Synchronizing STDERR and STDOUT by Ovid (Cardinal) on Sep 21, 2006 at 10:43 UTC
That's been suggested, but it doesn't work :( From perlfaq8:
> Note that you cannot simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling the shell to do the redirection. This doesn't work:
>
>     open(STDERR, ">&STDOUT");
>     $alloutput = `cmd args`;  # stderr still escapes
>
> This fails because the open() makes STDERR go to where STDOUT was going at the time of the open(). The backticks then make STDOUT go to a string, but don't change STDERR (which still goes to the old STDOUT).
Cheers,
Ovid
Re^3: Synchronizing STDERR and STDOUT by shmem (Chancellor) on Sep 21, 2006 at 11:27 UTC
That's a specialty of backticks only. With backticks, a new filehandle is allocated into which the STDOUT of the subprocess is diverted. But the STDERR of the subshell goes to your STDOUT.
Yes, the redirect has to be done in the source process, unless you patch your kernel with a MacFilehandle patch (three button -> one button :-) which lumps STDOUT and STDERR together at will.
Within the same perl process, it's all fine:
    #!/usr/bin/perl -w
    use strict;
    # $Id: blorfl.pl,v 0.0 2006/09/21 11:11:11 shmem Exp $
    print "foo";
    warn "warn";
    print "\n";
    __END__

    qwurx [shmem] ~> perl -e 'open(STDERR,">&", STDOUT); do "blorfl.pl"' 1>/dev/null
    qwurx [shmem] ~> perl -e 'open(STDERR,">&", STDOUT); do "blorfl.pl"' 2>/dev/null
    foo
    warn at blorfl.pl line 5.
But a subprocess has two brand new filehandles for STDOUT and STDERR, which happen to be connected to the same filehandle in the parent (which the subshell doesn't know), and the process is free to buffer ad lib. You have to do something with the source process, at least to have it make STDOUT unbuffered, if you want the two streams in synch.
    qwurx [shmem] ~> perl -le 'open(STDERR,">&", STDOUT); system "perl blorfl.pl"' 1>/dev/null
    qwurx [shmem] ~> perl -le 'open(STDERR,">&", STDOUT); system "perl blorfl.pl"' 2>/dev/null
    warn at blorfl.pl line 5.
    foo
While redirection works as expected, note the reverse order of 'warn' and 'foo' due to buffered STDOUT.
Update: BTW, the FAQ entry you quoted should read like this for clarity:
> This fails because the open() makes STDERR go to where STDOUT was going at the time of the open(). The backticks then make the subshell's STDOUT go to a string, but don't change the subshell's STDERR (which still goes to the old STDOUT).
--shmem
Re^3: Synchronizing STDERR and STDOUT by nothingmuch (Priest) on Sep 21, 2006 at 11:10 UTC
You can split the fork and the exec up if open mashes them together too much.
    pipe CHILDREAD, CHILDWRITE or die "pipe: $!";
    defined( my $pid = fork ) or die "fork: $!";
    if ( $pid ) {
        close CHILDWRITE;    # the parent only reads
        # read on CHILDREAD;
    }
    else {
        close CHILDREAD;     # the child only writes
        open STDERR, ">&CHILDWRITE";
        open STDOUT, ">&CHILDWRITE";
        exec( $somecmd ) or die "exec: $!";
    }
This is precisely the type of plumbing that a shell will do when you say 2>&1, except without the unportable syntax ;-)
That said, IPC::Run and friends already abstract all of this out, so there's no need to reinvent the wheel.
-nuffin
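A self-contained sketch of the same pipe/fork/exec plumbing, for readers who want something to run. The helper name `capture_merged` is hypothetical. It assumes a Perl with lexical filehandles and three-argument `open` (roughly 5.8 or later, which may be newer than the thread's "oldest Perl possible" requirement), and it has only been reasoned about for Unix-like systems; behaviour under Windows fork emulation is not something this sketch establishes.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command with its STDOUT and STDERR sent
# down one pipe, and return everything it wrote as a single string.
sub capture_merged {
    my @cmd = @_;

    pipe( my $reader, my $writer ) or die "pipe: $!";
    defined( my $pid = fork ) or die "fork: $!";

    if ( !$pid ) {
        # Child: point both standard handles at the write end, then exec.
        close $reader;
        open STDOUT, '>&', $writer or die "dup STDOUT: $!";
        open STDERR, '>&', $writer or die "dup STDERR: $!";
        close $writer;
        exec @cmd or die "exec @cmd: $!";
    }

    # Parent: close our copy of the write end so that reading hits EOF
    # once the child has exited.
    close $writer;
    my $output = do { local $/; <$reader> };
    close $reader;
    waitpid $pid, 0;
    return $output;
}

# The child sets $| so its own STDOUT is unbuffered; without that the
# order of the two streams is up to the child's buffering.
print capture_merged( $^X, '-e',
    '$| = 1; print "out\n"; warn "err\n"; print "more\n"' );
```

Passing the command as a list keeps the shell out of the picture, so no `2>&1` syntax or platform-specific quoting is involved.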
Re^4: Synchronizing STDERR and STDOUT by Ovid (Cardinal) on Sep 21, 2006 at 11:12 UTC
Re^5: Synchronizing STDERR and STDOUT by nothingmuch (Priest) on Sep 21, 2006 at 11:20 UTC
Re^3: Synchronizing STDERR and STDOUT by xdg (Monsignor) on Sep 21, 2006 at 15:58 UTC
> This fails because the open() makes STDERR go to where STDOUT was going at the time of the open(). The backticks then make STDOUT go to a string, but don't change STDERR (which still goes to the old STDOUT).
So don't use backticks. Redirect STDOUT to a file, redirect STDERR to STDOUT and use `system()` instead.
    use strict;
    use warnings;
    use File::Temp;

    my $temp_stdout = File::Temp->new;
    local *OLDOUT;
    local *OLDERR;
    open( OLDOUT, ">&STDOUT" );
    open( OLDERR, ">&STDERR" );
    open( STDOUT, ">$temp_stdout" );
    open( STDERR, ">&STDOUT" );

    # Funky quoting for Windows. Sigh.
    system('perl -e "print q{to stdout}; warn q{to stderr}; print q{more to stdout}"');

    close(STDOUT);
    open(STDOUT, ">&OLDOUT");
    open(STDERR, ">&OLDERR");

    open CAPTURED, "<$temp_stdout";
    my $capture = do { local $/; <CAPTURED> };
    close CAPTURED;
    print "Got this:\n$capture";
That still doesn't solve the problem of keeping them in sync because the subprocess still has two buffered handles. The fact that they go to the same place doesn't matter. You need to get the child process to turn off buffering.
-xdg
Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
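To make the buffering point concrete, here is a hedged sketch that runs the same child code twice through a temp-file capture like the one above, once as-is and once with `$| = 1` prepended. The helper name `run_merged` is hypothetical, and the sketch assumes three-argument `open` and a default PerlIO build.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical helper: run a one-liner in a child perl with both standard
# streams sent to one temporary file, and return what the file contains.
sub run_merged {
    my ($code) = @_;
    my $temp = File::Temp->new;

    # Save the current handles, then point STDOUT at the file and
    # STDERR at STDOUT, as in the post above.
    open my $oldout, '>&', \*STDOUT or die "save STDOUT: $!";
    open my $olderr, '>&', \*STDERR or die "save STDERR: $!";
    open STDOUT, '>', $temp->filename or die "redirect STDOUT: $!";
    open STDERR, '>&', \*STDOUT or die "redirect STDERR: $!";

    system $^X, '-e', $code;

    open STDOUT, '>&', $oldout or die "restore STDOUT: $!";
    open STDERR, '>&', $olderr or die "restore STDERR: $!";

    open my $in, '<', $temp->filename or die "read capture: $!";
    local $/;
    return scalar <$in>;
}

my $script = 'print "out\n"; warn "err\n"; print "more\n"';
print "child left buffered:\n", run_merged($script);
print "child with \$| = 1:\n",  run_merged( '$| = 1; ' . $script );
```

In the first run the child holds its STDOUT data until exit while STDERR is written immediately, so the `err` line should land first; in the second run the lines should come out in program order. Either way the fix lives in the child, which is the conclusion the thread reaches.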
Re: Synchronizing STDERR and STDOUT by coreolyn (Parson) on Sep 21, 2006 at 13:56 UTC
While there's much that could be cleaned up, and in spite of my initial gut feeling in "Open3 and bad gut feeling", this Open3 module has provided me what you seek in several large enterprise applications without any problems. Here's the code as it's been running in production without change for the last 4 years:
Read more... (9 kB)
coreolyn
Edited by planetscape - added readmore tags
Re^2: Synchronizing STDERR and STDOUT by OfficeLinebacker (Chaplain) on Sep 21, 2006 at 14:27 UTC
coreolyn, sweet stuff. I don't fully understand it yet, but the "Suffering from Buffering" article is proving to be a nice read. Does your program keep the stdout and stderr output in such a way that you can reassemble it in the order it was actually generated?
Terrence
I like computer programming because it's like Legos for the mind.
Re^3: Synchronizing STDERR and STDOUT by coreolyn (Parson) on Sep 21, 2006 at 15:31 UTC
I use two separate logs for stderr and stdout. To be completely honest, I haven't had a situation where I've needed to make sure they are completely in sync, so maybe I was overly smug in my assumption. You should read the comments that are included in the full node of "Open3 and bad gut feeling"; apparently there is a lot of extraneous (read: useless) code. I was never given an opportunity, nor did the need arise, to refactor it.
coreolyn
Re: Synchronizing STDERR and STDOUT by monarch (Priest) on Sep 21, 2006 at 11:10 UTC
What version of Windows are you running on? On Windows XP Professional I can do the following:
Read more... (511 Bytes)
which outputs the following:
Read more... (295 Bytes)
Re^2: Synchronizing STDERR and STDOUT by Ovid (Cardinal) on Sep 21, 2006 at 11:13 UTC
There's no particular version of Windows. This is for TAPx::Parser which, hopefully, will run anywhere that Perl can run. Thus, I can't rely on any particular version of Windows. Hence my need to keep this as portable as possible.
Cheers,
Ovid
Re: Synchronizing STDERR and STDOUT by OfficeLinebacker (Chaplain) on Sep 21, 2006 at 12:30 UTC
Greetings, esteemed monks! Isn't this one of those currently unsolvable problems? I've been trying to separately but synchronously handle STDOUT and STDERR for months now. The closest I've gotten is (big program follows)
Read more... IPC::Run usage (8 kB)
Re: Synchronizing STDERR and STDOUT by ikegami (Patriarch) on Sep 21, 2006 at 15:39 UTC
> some way of telling the source process to send everything to the same filehandle
`open(STDERR,'>&', STDOUT);` creates a new filehandle (so it doesn't help), but `*STDERR = *STDOUT;` makes both STDOUT and STDERR refer to the same filehandle. You don't even need to turn off buffering.
    use IO::Handle ();

    open(STDERR,'>&', STDOUT);
    print("STDOUT = ", fileno(STDOUT), "\n");  # 1
    print("STDERR = ", fileno(STDERR), "\n");  # 2
    print STDOUT 'a';
    print STDERR 'b';
    print STDOUT 'c';
    STDOUT->flush();  # ac
    STDERR->flush();  # b
    print("\n");

    *STDERR = *STDOUT;
    print("STDOUT = ", fileno(STDOUT), "\n");  # 1
    print("STDERR = ", fileno(STDERR), "\n");  # 1
    print STDOUT 'a';
    print STDERR 'b';
    print STDOUT 'c';
    STDOUT->flush();  # abc
    STDERR->flush();
Of course, this doesn't work if you fork. If you can't modify the program, you can replace `perl script.pl` with `perl -e "*STDERR = *STDOUT; do 'script.pl'"`
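A small sketch that checks the glob-aliasing claim by writing to a temporary file instead of the terminal. The helper name `aliased_writes` is hypothetical, and `local` is used only so the alias is undone when the block ends; the sketch assumes three-argument `open` is available.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical demo helper: alias STDERR to STDOUT, write through both
# names, and return what landed in the file plus the two file numbers.
sub aliased_writes {
    my $temp = File::Temp->new;

    open my $oldout, '>&', \*STDOUT or die "save STDOUT: $!";
    open STDOUT, '>', $temp->filename or die "redirect STDOUT: $!";

    my @filenos;
    {
        local *STDERR = *STDOUT;    # one handle, one buffer, two names
        @filenos = ( fileno(STDOUT), fileno(STDERR) );
        print STDOUT 'a';
        print STDERR 'b';
        print STDOUT 'c';
    }

    # Reopening STDOUT closes the temporary file first, which flushes it.
    open STDOUT, '>&', $oldout or die "restore STDOUT: $!";

    open my $in, '<', $temp->filename or die "read capture: $!";
    my $written = do { local $/; <$in> };
    return ( $written, @filenos );
}

my ( $written, $out_fd, $err_fd ) = aliased_writes();
print "wrote '$written'; STDOUT is fd $out_fd, STDERR is fd $err_fd\n";
```

Since both names share a single buffer inside this one process, the three characters reach the file in the order they were printed. As the post notes, this does not carry over to a forked or exec'd child, which gets its own handles.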
