Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

system() call mis-directs?

by Clarkman (Novice)
on Mar 02, 2020 at 21:11 UTC ( #11113664=perlquestion: print w/replies, xml ) Need Help??

Clarkman has asked for the wisdom of the Perl Monks concerning the following question:

Hello! We are experiencing some re-direct issues with the system call. Hard thing is, the behavior is not determinate. We run a script thus:
/usr/bin/nohup ./do-upgrades.pl > do-upgrades.log &
Then inside the script we fork off, and use system() to spawn another child process like this:
if(defined ($f) and $f==0) { # I'm a child my $result = system("./upgrade.pl $sensor >> worklog 2>&1"); print "result from upgrade script for $sensor was $result.\n"; exit(0); }
We then waitpid inside the forked process until upgrade.pl exits. Been stable for years. But, as we have drilled down into the logs, we have found that in some cases the upgrade.pl child process starts out writing to its parent process log (do-upgrades.log), and in other instances it writes to worklog correctly as directed. Anyone have ideas how this is happening? THANKS!

Replies are listed 'Best First'.
Re: system() call mis-directs?
by jcb (Priest) on Mar 03, 2020 at 00:47 UTC

    Check the values of $sensor for which the error occurs. Do they contain other bits of shell syntax such as ampersands, semicolons, or a trailing backslash? Any of those could cause the shell to misparse the remainder of the command.

    Try instead: (obviously untested)

    my $result = system(q[./upgrade.pl "].quotemeta($sensor).q[" >>worklo +g 2>&1]);
      quotemeta($sensor)

      Sorry, but quotemeta is not an appropriate shell quoting function - it may work in some cases, but will fail in others. I would strongly recommend e.g. ShellQuote::Any instead, or of course to avoid the shell in the first place.

        ShellQuote::Any

        The code currently looks like this:

        package ShellQuote::Any; use strict; use warnings; our $VERSION = '0.04'; sub import { my $caller = caller; no strict 'refs'; ## no critic *{"${caller}::shell_quote"} = \&shell_quote; } sub shell_quote { my ($cmd, $os) = @_; if ($os) { _require($os); return _is_win32($os) ? _win32_quote($cmd) : _bourne_quote($cm +d); } else { _q()->($cmd); } } my $Q; sub _q { return $Q if $Q; _require($^O); if ( _is_win32($^O) ) { $Q = \&_win32_quote; } else { $Q = \&_bourne_quote; } return $Q; } my %K; sub _require { my ($os) = @_; return $K{$os} if $K{$os}; my $klass = _is_win32($os) ? 'Win32/ShellQuote.pm' : 'String/Shell +Quote.pm'; $K{$os} = $klass; require "$klass"; ## no critic } sub _win32_quote { my ($cmd) = @_; Win32::ShellQuote::cmd_escape(join ' ', @$cmd); } sub _bourne_quote { my ($cmd) = @_; String::ShellQuote::shell_quote(@$cmd); } sub _is_win32 { my ($os) = @_; return $os =~ m!^(?:MS)?Win(?:32)?$!i ? 1 : 0; } 1;

        Just from looking at the names of the functions, it has only two modes of operation: Either Windows, or Bourne Shell. No code for handling cygwin, OS/2 or DOS, so those three will be treated as Bourne Shell, which is plain wrong.

        On Windows, you simply can not win the quoting game. Any code pretending to be able to do so is broken.

        Assuming that all other systems use a bourne-compatible shell as default shell is at least optimistic. Severals systems have a korn shell as default, others even use a csh. See Various system shells.

        Looking at the documentation of String::ShellQuote, handling the non-Windows part, the problem is obvious without even reading the source:

        BUGS

        Only Bourne shell quoting is supported. I'd like to add other shells (particularly cmd.exe), but I'm not familiar with them. It would be a big help if somebody supplied the details.

        It seems nobody cared during the last 10 years, and the author either had no time to RTFM, or had no idea where to search for shell man pages.

        So, with ShellQuote::Any, you are essentially lost if the current default shell is no bourne shell. I did not bother to read the quoting code in String::ShellQuote. As is, it is broken anyway, and with it ShellQuote::Any. Yes, it may work on many systems in their default configuration, but it won't do its job on all systems. It does not even prevent stupid things from happening on unknown systems.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        I currently believe that quotemeta will work within the context of a double-quoted argument with the Bourne shell. I was concerned mostly with ensuring that any quotes embedded in $sensor would not terminate the quoted argument early (I suspect that some of the $sensor values contain characters that separate commands) and quotemeta seems to be a tool usable for this special case. If I am wrong, please provide a counter-example where quotemeta fails to cause a string to be passed exactly as written as an argument. That code was written off the top of my head and not tested so I could be very wrong here...

        In this case, it is wrong anyway because $sensor consists of multiple arguments that the shell is expected to split, and I did immediately suggest finding a solution that does not involve using the shell after an example input was provided.

      Thank you both!

      @sundialsvc4, thank you but logs have been checked and re-checked.

      @jcb, you may have nailed it.

      $sensors looks like this: ABC-00-DEF-1234 --sitename=Charon .

      The double hyphen is suspicious as is the '='

      Will test out tomorrow and post results. Thank you !

        Your sample does not contain any troublesome characters, but the shell splits on unquoted whitespace, which my earlier suggestion would prevent by wrapping $sensor in shell quotes before passing it to the shell. My immediate suspicion is that you may have some "odd" site names. In particular, an ampersand is both reasonable in human-readable names and special to the shell.

        The best way to fix this would be to get the shell out of the picture and use the list form of system. You will also need to use open or sysopen to rearrange your filehandles so that the second child will inherit STDOUT and STDERR opened for append on the worklog file, while the first child keeps its original STDOUT somewhere to report the result from the second child.

      A reply falls below the community's threshold of quality. You may see it by logging in.
      Thanks a tremendous amount everyone. We figured it out, rather we re-factored some 6 year old code. This was the process tree showing PID and PPID at the beginning of every line:
      1234 1 do-upgrades.pl. <--- daemon 2345 1234 do-upgrades.pl <--- forked process with system() call 2346 2345 upgrade-sensor.pl. <--- worker bee
      Every time the daemon while(1) looped forever, it stopped over at waitpid:
      # read everything from the pipe. my $pipename = "sensor-fifo"; while (1) { chomp(my $sensor = `cat $pipename`); # Something in pipe; see if it is real or just empty string # [Omitted more sophisticated tests here] if ($sensor) { print "upgrade requested for $sensor, forking...\n"; my $f=fork; if(defined ($f) and $f==0) { # I'm a child # Here you can the shell meta-characters my $result = system("${perlhome}/upgrade-sensor.pl $sensor > +> ${workpath}/worklog 2>&1"); # System is a blocking call, meaning that the # intermediate process does not exit. print "result for $sensor was $result.\n"; exit(0); } elsif (defined ($f)==0) { # Fork failed print "Fork failed for $sensor!\n"; } # else I'm the parent } print "waiting on child $$ to complete $sensor.\n"; 1 while waitpid(-1, WNOHANG)>0; # reaps children print "Finished waiting.\n"; }
      The meaning of the intermediate process was not clear. Upon completion of the system() call, it prints and exits. Its sole purpose seems to allow catching the $result into the daemon's log, but we don't care. So we switched over to exec():
      # read everything from the pipe. my $pipename = "sensor-fifo"; while (1) { chomp(my $sensor = `cat $pipename`); my @args = split / /,$sensor; # Something in pipe; see if it is real or just empty string # [Omitted more sophisticated tests here] if ($sensor) { print "upgrade requested for $sensor, forking...\n"; my $f=fork; if(defined ($f) and $f==0) { # I'm a child exec ("${perlhome}/upgrade-sensor.pl", @args) } elsif (defined ($f)==0) { # Fork failed print "Fork failed for $sensor!\n"; } # else I'm the parent } print "waiting on child $$ to complete $sensor.\n"; 1 while waitpid(-1, WNOHANG)>0; # reaps children print "Finished waiting.\n"; }
      Now the process tree looks like this:
      1234 1 do-upgrades.pl. <--- daemon 2346 1234 upgrade-sensor.pl. <--- worker bee
      Much simpler. From running the daemon on the command line, we now saw that the output of the fork/exec'd upgrade-sensor.pl process was writing into the daemon's log. More specifically, the daemon's log created thus, when the daemon was launched:
      /usr/bin/nohup ${RFS_HOME}/do-upgrades.pl > ${WORK_PATH}/do.log &
      (Still need to write an init.d/ controller). What we did to make it work (we think) was to kill off dup handles, and re-open. We added the two close commands at the head of the upgrade-sensor.pl script, and now logs are properly separated. Love inheriting code!
      close (STDERR); close (STDOUT); my $logabspath = $logpath . "/" . $logfile; open (STDOUT, "| tee -ai $logabspath"); open (STDERR, "| tee -ai $logabspath"); print "\n\n----------------------------------------------\n";
      What we don't know is whether this is correct, or not. It works on two hosts the same way:
      $ uname -a Linux toolchain 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 +UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
      $ uname -a Linux oldtoolchain 2.6.32-754.17.1.el6.x86_64 #1 SMP Tue Jul 2 12:42:4 +8 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
      so that's good. But these messages are not coming from our code, rather the first issues upon calling fork. The second issues from waitpid.
      Child forked (pid=14825), 1 processes running now at ./do-upgrades.pl +line 49.
      Unknow process 0 has exited, ignoring it at ./do-upgrades.pl line 84
      Yes, it is ->Unknow<-. Thanks again - Clarkman
        OK, the previously unknown elephant in the room is Proc:Queue. It overlays fork and waited, and has those print outs in it.

        You are running some tee processes that you should not need with that. Try something more like: (untested)

        # no need to close STDERR or STDOUT first; reopening will do that impl +icitly open STDERR, '>>', $logpath.'/'.$logfile or die "redirect STDERR >> $logpath/$logfile: $!"; $| = 1; open STDOUT, '>>&=', \*STDERR or die "redirect STDOUT -> STDERR: $!"; $| = 1; print "\n\n", '-' x 45, "\n";
Re: system() call mis-directs?
by haukex (Bishop) on Mar 03, 2020 at 07:42 UTC
    Then inside the script we fork off ... We then waitpid inside the forked process

    This sounds a lot like what system already does. Why do this twice?

    we have found that in some cases the upgrade.pl child process starts out writing to its parent process log (do-upgrades.log), and in other instances it writes to worklog correctly as directed

    Obviously, if you could narrow this down in which cases this happens (as others have pointed out, e.g. for what values of $sensor) more and make it more reproducible, that would be the best thing.

    system("./upgrade.pl $sensor >> worklog 2>&1")

    This kind of thing may work when $sensor is tightly controlled (see my node here!) and the system it is being run on never changes (The problem of "the" default shell), but I would still strongly recommend against it. Also, what is the first line of upgrade.pl? If it's #!/usr/bin/env perl, its execution is dependent on PATH. Anyway, if upgrade.pl is a short-lived process, then I'd recommend IPC::Run3 instead; if it is a longer process that needs to run in the background while your script continues, then IPC::Run. Both of these modules allow to avoid the shell and can still redirect STDOUT and STDERR.

A reply falls below the community's threshold of quality. You may see it by logging in.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://11113664]
Front-paged by haukex
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (7)
As of 2020-07-02 12:16 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?