Capturing output from a long-running pipeline

by wanna_code_perl (Friar)
on May 23, 2017 at 16:56 UTC

wanna_code_perl has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I've mostly written a Perl program that runs several external backup and file transfer programs, such as duplicity, scp, rsync, etc. Everything's fine except I'm not sure of the best way to actually execute the commands. I'll show you the relevant subroutine, then I'll get into specifics:

# Run executable @system command, reporting name as $name.  If there
# is an upload limit, we run through trickle(1) to limit bandwidth.
sub _ext_cmd {
    my ($name, @system) = @_;

    if ($o{general}{upload}) {
        @system = ($o{general}{trickle},
                   $o{general}{trickled} ? () : '-s',
                   -u => $o{general}{upload},
                   '|', @system);
    }

    say "Running name=$name, @system";

    # system { $system[0] } @system[1..$#system]; # Obviously nope:
    # the list form of system() never touches a shell, so the '|' is
    # passed along as a literal argument instead of creating a pipe.
    say "\$?=$?";
}
# Excerpt of %o options hash:
%o = (
    general => {
        upload   => '256',           # KBps
        trickle  => '/bin/trickle',
        trickled => undef,           # True if trickled is installed
    },
);

# Example usage:
_ext_cmd(display_name => qw!/bin/duplicity /path/to/src /path/to/dest!);
# ... but you could probably test it just fine with echo or /bin/cat.

A couple of important points:

  1. The commands will often be in a pipeline with the bandwidth-limiting utility trickle(1)
  2. In general, the commands are long-running (hours, even days for first backups) but not indefinite. Several megabytes of output are typically generated.
  3. stdout and stderr need to be logged to file (capturing to a variable might work, although since I'll be keeping the logs for later browsing, capturing gains me nothing except wasted RAM). Exit codes are important, too.
  4. As usual, I'd rather not have to escape spaces and the like... thus passing arguments in an array is preferred.
  5. CPAN modules are fine, if necessary.

Right, that was more than a couple points. I swear I've done this a hundred times before, but for whatever reason my distracted brain doesn't want to put this puzzle together on its own today.

Replies are listed 'Best First'.
Re: Capturing output from a long-running pipeline
by talexb (Chancellor) on May 23, 2017 at 17:56 UTC

    I'm not exactly sure what question you're asking, so I'll barge ahead and make my comment.

    I'm not a big fan of trusting that the caller will redirect stdout and stderr to the appropriate places -- I much prefer to handle that type of thing within the program. That way, I can time-stamp the filenames so that multiple runs of a script don't leave me with just the last run's results, which is what happens when the output and error filenames are hard-coded.

    Once you have the output or error log, you can slice and dice to your heart's content.
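    A minimal sketch of that approach (the file names and strftime format here are just illustrations):

        use POSIX qw(strftime);

        # Time-stamped names, so each run gets its own output and error logs.
        my $stamp = strftime('%Y%m%d-%H%M%S', localtime);

        open STDOUT, '>', "backup-$stamp.out" or die "Can't open stdout log: $!";
        open STDERR, '>', "backup-$stamp.err" or die "Can't open stderr log: $!";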

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

      Thanks very much for the reply.

      I'm not exactly sure what question you're asking, so I'll barge ahead and make my comment.

      I'm asking for the best way to invoke an external command with a complicated command pipeline and arguments (see my enumerated list for details). Sorry if that wasn't clear. It's been a rather long and dark day.

      I much prefer to handle that type of thing within the program.

      This does at least answer part of my question -- I can redirect the standard streams easily enough. Since you suggest going pure-Perl with this bit, do you have a good/easy way to combine stdout and stderr, similar to > file.log 2>&1 in a shell?

        If you are combining stdout and stderr on the command line, then you can do the equivalent thing by opening a single file handle and writing to that during execution. The long-running (only a few minutes) script I'm currently working on just writes log messages to what I'm calling the error log, and a very few status messages really do go to stdout. The stuff to stdout is just to reassure myself that the script's actually doing something, and is throwaway stuff. The error log contains valuable information, and I find myself grepping for various phrases, or for specific policy/coverage combinations in case I want to track how a specific case was handled.
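        In Perl, that single-handle approach might look like this (a minimal sketch; the log name is invented):

            use IO::Handle;    # for autoflush on older perls

            # One handle for both streams: the equivalent of > file.log 2>&1.
            open my $log, '>>', 'run.log' or die "Can't open run.log: $!";
            $log->autoflush(1);

            open STDOUT, '>&', $log or die "Can't dup STDOUT: $!";
            open STDERR, '>&', $log or die "Can't dup STDERR: $!";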

        And from a philosophical point of view, I'm not a fan of combining stdout and stderr into a single stream, because they likely contain different streams of information .. stdout is going to contain boring things like "I'm doing foo now ..", while stderr might contain more important stuff like "Rule 17 broken in record 34567, bf=17.76" .. but again, this is totally up to you as the developer/sysadmin.

        OK, so that might have been a little off-topic .. if you are working on running some commands and collecting the output, I believe qx is the command you want; it runs the command and returns a list of output lines. I would probably log that as

        Running <some command>, output is
        >> Output line 1
        >> More output
        >> final output
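        Something along these lines would produce that (the command here is hypothetical):

            # qx in list context returns one element per output line; the
            # 2>&1 folds stderr into the captured stream (note that this
            # form does go through a shell).
            my @output = qx{/bin/duplicity /path/to/src /path/to/dest 2>&1};
            my $status = $? >> 8;    # the command's exit code

            print "Running duplicity, output is\n";
            print ">> $_" for @output;    # lines keep their trailing newlines
            print "exit status: $status\n";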
        This may or may not be what you want, as you don't get the output until the command has finished. If you'd rather have output back from the command as it runs, then you may have to go to something like IPC::Run. I see there's even something called IPC::Run::Fused, which glues stdout and stderr together.
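        A rough sketch of the IPC::Run route (paths and log name are made up):

            use IPC::Run qw(run);

            open my $log, '>>', 'backup.log' or die "Can't open backup.log: $!";

            # The command is an array ref, so no shell quoting is needed, and
            # the coderefs are called with each chunk of output as it arrives,
            # so the log grows while the command runs, not after it finishes.
            my @cmd = ('/bin/duplicity', '/path/to/src', '/path/to/dest');

            run \@cmd,
                '>',  sub { print {$log} $_[0] },    # stdout chunks
                '2>', sub { print {$log} $_[0] }     # stderr chunks
                or die 'command failed, exit status ', $? >> 8;

            # Pipelines work too, still without a shell:
            #   run \@cmd1, '|', \@cmd2, '>', ... ;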

        Anyway, have a look at those modules -- I used the former many years ago, and it worked brilliantly.

        Alex / talexb / Toronto

        Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.
