PerlMonks  

system() implementation on Windows (again)

by Anonymous Monk
on Aug 18, 2011 at 12:05 UTC ( #920941=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I noticed a strange behaviour of the system() function on Windows, and I would like to understand how _exactly_ system() is implemented on Windows...

According to Perl's documentation, system() can be used in two ways. You can pass it a string as a _single_ argument, which contains the external program to execute plus all needed arguments. E.g.:

system("my_program arg1 arg2 arg3");

Or you can pass it a list, of which the first element is taken as the program name to execute, and the remaining elements are passed as separate arguments to the program.

system("my_program", "arg1", "arg2", "arg3");

According to the docs, the difference between the first usage and the second usage is that when passing a single string, perl first checks if the string contains any shell metacharacters, and if it does, system() calls the shell and passes it the whole string as the command. In the second usage, system() never calls any OS shell, but directly executes the program and passes the list elements as separate arguments, EXACTLY as they are. Regardless of whether the list elements contain spaces or special characters or anything else, each of them is passed to the external program as one and exactly one argument.
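A minimal sketch of the list-form behaviour described above, assuming a Unix-like system where list-form pipe open is available. The helper name child_args is hypothetical (it is not from the thread); it starts a child perl without a shell and reports the arguments the child actually received.

```perl
use strict;
use warnings;

# Hypothetical helper: run a child perl in list form (no shell involved)
# and return the arguments exactly as the child saw them.
sub child_args {
    my @args = @_;
    # The child prints its arguments separated by NUL bytes.
    # '--' stops the child perl from treating later arguments as switches.
    my @cmd = ($^X, '-e', 'print join("\0", @ARGV)', '--', @args);
    open(my $fh, '-|', @cmd) or die "cannot start child: $!";
    local $/;                 # slurp mode: read all output at once
    my $out = <$fh>;
    close $fh;
    return split /\0/, $out;
}

my @seen = child_args('/w sample text > xyz', 'ARG2');
print scalar(@seen), " argument(s) received\n";
print "[$_]\n" for @seen;
```

On a Unix-like system this should report two arguments, with the spaces and the ">" left intact inside the first one. The rest of the thread is about why the equivalent call on Windows does not give the same guarantee.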

Now... This is true, and works perfectly well on Unix. But not on Windows. As an example, let's use a simple executable, that (I think) is available on every windows machine: msg.exe. It can take some options and arguments, but basically it presents a message box with some text to the user. In the simplest form, you can manually execute this on the command line:

msg username "some text"

where "username" is your Windows user name, and it will display a message box with the text "some text". (Btw. the command works even without the double quotes around "some text".) But now to the problem with perl's system() function. My goal is to execute an external program, and be able to pass it my arguments EXACTLY as they are, with no interpretation whatsoever. Since I know that the first usage of system() - i.e. giving it the full command with arguments as one string - might involve the local OS shell, I decided not to use it, and to use the list form instead. So first check this sample code:

my $txt = 'sample text with some / special /w characters > xyz';
system('msg', 'username', $txt);

This works OK. The whole string in $txt is displayed in the message box, and it looks like system() is working correctly. But this code:

my $txt = '/w sample text with some / special /w characters > xyz';
system('msg', 'username', $txt);

does not work anymore! The only difference is the /w at the beginning of the message text. But it shows that something strange is happening behind the scenes when system() runs. /w is an option for msg.exe which tells it to wait until the user clicks OK. And in fact when I execute the perl script, I can see two things. First, the message box does not display the whole text that is in $txt; it only starts with the word "sample". And second, the command line prompt does not return immediately; it only comes back when I click on OK. (This is because the script waits for system() to complete, system() waits for msg.exe to complete, and msg.exe waits for my click.)

Sooo... Reading the perl docs, I would expect that system() executes my program (msg.exe), and passes it "username" as the first argument, and the complete $txt content (whatever characters it might contain!) as the second argument. Instead, it looks like system() still builds one long command first, using the list elements provided, and then somehow executes all of it (although I do not see any additional shell, i.e. cmd.exe, running in the task manager):

msg username /w sample text with some / special /w characters > xyz

This is in my opinion incorrect! And the big question is: What does system() actually do behind the scenes, because what I ultimately need, is the possibility to pass _arbitrary_ arguments to my external program, even with special characters etc.

BTW. I tried ActivePerl and Strawberry perl, and they both have the same issue.

If there are any monks out there, who know how this stuff is exactly implemented on Windows, I would greatly appreciate an answer.

Re: system() implementation on Windows (again)
by muba (Priest) on Aug 18, 2011 at 12:27 UTC

    Disclaimer: I don't have enough inside knowledge to answer your real question, nor do I have anything to prove my following notion.

    What could also be the case is that msg.exe does its own command line argument parsing. On my machine (Windows XP, Dutch) I get this:

    C:\>msg /?
    Een bericht naar de gebruiker sturen.

    MSG {gebruikersnaam | sessiennaam | sessie-ID | @bestandsnaam | *}
        [/SERVER:servernaam] [/TIME:seconds] [/V] [/W] [bericht]

    Liberally translated, this says that msg.exe takes the following arguments:

    1. Either a user name, a session name, a session ID, a "@" + file name, or a "*"
    2. Optional /SERVER:, /TIME:, /V, and /W switches
    3. Optionally a message
    I suspect that what happens here is that msg.exe looks at all arguments it gets, sees that there is a "/w" before the message [1], and waits for user response.

    [1] Or, more precisely, that there is something that could be interpreted as a "/w" before the message, even though that wasn't the intended meaning.

      OK, I understand what you're saying. And it is an interesting observation, which I hadn't taken into account so far. On the other hand, I observed this behavior with other executables as well. In fact, my actual problem is related to another (specific) program, and I used msg.exe just to have something simple to demonstrate the issue in a post in this forum. In particular, I noticed that double-quote characters inside one of the passed arguments can break the whole execution. Which makes me think that there is "something" interpreting the arguments, instead of purely passing the argument string "as is" to the executable, at the desired argument position. Anyway, thanks for your feedback!

        Which makes me think, that there is "something" interpreting the arguments, instead of purely passing the argument string "as is" to the executable, at the desired argument position.

        Yeah, it's called

        $ perl -V:sh
        sh='cmd /x /c';
Re: system() implementation on Windows (again) <Nit>
by ww (Bishop) on Aug 18, 2011 at 12:42 UTC
    Interesting question. ++

    Minor, OT Nit: "As an example, let's use a simple executable, that (I think) is available on every windows machine: msg.exe."

    Actually, msg.exe was not part of Win2k. It was in an early (initial?) release of XP.

      It's not on my Vista machine either.

        Microsoft Windows [Version 6.0.6002]
        Copyright (c) 2006 Microsoft Corporation. All rights reserved.

        C:\Users\wfsp>msg
        Send a message to a user.

        MSG {username | sessionname | sessionid | @filename | *}
            [/SERVER:servername] [/TIME:seconds] [/V] [/W] [message]

          username            Identifies the specified username.
          sessionname         The name of the session.
          sessionid           The ID of the session.
          @filename           Identifies a file containing a list of usernames,
                              sessionnames, and sessionids to send the message to.
          *                   Send message to all sessions on specified server.
          /SERVER:servername  server to contact (default is current).
          /TIME:seconds       Time delay to wait for receiver to acknowledge msg.
          /V                  Display information about actions being performed.
          /W                  Wait for response from user, useful with /V.
          message             Message to send. If none specified, prompts for it
                              or reads from stdin.

        update: I dread to think what you've done to your vista box :-)
Re: system() implementation on Windows (again)
by moritz (Cardinal) on Aug 18, 2011 at 14:53 UTC

    I recommend to create a small Perl script:

    # file joined-echo.pl
    print join('||', @ARGV), "\n";

    And use system($^X, 'joined-echo.pl', @rest_of_your_args) for testing - that way you know exactly that the command you're testing doesn't do any magical argument processing on its own.

      Yes, tried that. And guess what... I get arguments split (by spaces) before the executable specified in the system() call is run.

      So if this is the content of my "main" perl script:

      my $txt = '/w my > text >= [with] \ symbols " /wabc/def (xyz)';
      system($^X, 'joined-echo.pl', $txt, 'ARG2');

      ...Then this is the output:

      /w||my||>||text||>=||[with]||\||symbols|| /wabc/def (xyz) ARG2

      As one can see, the perl called by system() does not get 2 arguments, as I would expect, but many more. Mhmm.

        See "exec always invokes the shell? win32"; the situation is brainfuck-cubed.

        The way I get around it is to always pretend that I'm typing in cmd.exe:

        #!/usr/bin/perl --
        use strict;
        use warnings;

        if( @ARGV ){
            print join "\n", map({"( $_ )"} @ARGV), "\n";
        } else {
            my $txt = '/w my > text >= [with] \ symbols " /wabc/def (xyz)';
            my( @args ) = win32_quote( 'perl', __FILE__, $txt, 'ARG2', );
            print "YOU CAN TYPE THIS AT THE cmd.exe PROMPT\n @args\n\n";
            ## command.com doesn't like it (it wants "perl" to be perl)
            ## i don't know what powershell does :)
            system @args;
        }

        sub win32_quote {
            my( @args ) = @_;
            s~
                ( [%><|&^"] )
            ~
                {
                    '%' => '^%',
                    '>' => '^>',
                    '<' => '^<',
                    '"' => '\\"',
                    '&' => '^&',
                    '|' => '^|',
                }->{$1}
            ~gex for @args;
            $_=qq["$_"] for @args;
            return @args;
        }
        __END__
        D:\>perl win32.quote.pl
        YOU CAN TYPE THIS AT THE cmd.exe PROMPT
         "perl" "win32.quote.pl" "/w my ^> text ^>= [with] \ symbols \" /wabc/def (xyz)" "ARG2"

        ( /w my ^> text ^>= [with] \ symbols " /wabc/def (xyz) )
        ( ARG2 )

        D:\>"perl" "win32.quote.pl" "/w my ^> text ^>= [with] \ symbols \" /wabc/def (xyz)" "ARG2"
        ( /w my ^> text ^>= [with] \ symbols " /wabc/def (xyz) )
        ( ARG2 )
Re: system() implementation on Windows (again)
by ikegami (Pope) on Aug 18, 2011 at 18:15 UTC

    Glancing at your post, I see

    my $txt = '/w sample text with some / special /w characters > xyz';
    system('msg', 'username', $txt);

    Is that the one that's giving you problems? That's supposed to be equivalent to

    msg username "/w sample text with some / special /w characters > xyz"

    Which is obviously incorrect. Does that answer your question? I gotta run without looking at your post in detail at this time.

Re: system() implementation on Windows (again)
by Anonymous Monk on Aug 19, 2011 at 14:44 UTC
    OK, guys. We can close the discussion right here. I have the answer.

    And the sad truth is there is no way to do it on Windows, due to how CreateProcess works.

    Many Thanks to all who contributed. Especially this post: http://www.perlmonks.org/?node_id=887534 to which someone has pointed, was extremely helpful.

    Thank you!

Re: system() implementation on Windows (again)
by Haarg (Chaplain) on Aug 21, 2011 at 10:43 UTC

    There are two problems here, and other nodes have covered parts of them.

    First is that argument lists are always passed as a single string in Windows, as opposed to arrays on other systems. This is less of a problem than it appears, because 95% of programs use the same rules for parsing that string into an array. Roughly speaking, the rules are that arguments can be quoted with double quotes, and backslashes can escape any character.
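The parsing rules mentioned above can be sketched in Perl. This is a simplified illustration, not the actual C runtime code: the function name split_command_line is hypothetical, and it ignores the special treatment of the program name and of a doubled quote ("") inside a quoted region, which vary between runtime versions.

```perl
use strict;
use warnings;

# Hypothetical helper: split one Windows command-line string into arguments
# using the common convention (double quotes group, backslashes escape quotes).
sub split_command_line {
    my ($line) = @_;
    my @args;
    my ($cur, $in_arg, $in_quote) = ('', 0, 0);
    my @ch = split //, $line;
    my $i  = 0;
    while ($i < @ch) {
        my $c = $ch[$i];
        if ($c eq '\\') {
            # Count the run of backslashes, then see what follows it.
            my $n = 0;
            while ($i < @ch && $ch[$i] eq '\\') { $n++; $i++ }
            if ($i < @ch && $ch[$i] eq '"') {
                $cur .= '\\' x int($n / 2);         # 2n or 2n+1 -> n backslashes
                if ($n % 2) { $cur .= '"'; $i++ }   # odd count: literal quote
                # even count: the quote is handled on the next iteration
            }
            else {
                $cur .= '\\' x $n;                  # not before a quote: literal
            }
            $in_arg = 1;
            next;
        }
        if ($c eq '"') {
            $in_quote = !$in_quote;                 # quote toggles grouping
            $in_arg   = 1;
        }
        elsif (!$in_quote && ($c eq ' ' || $c eq "\t")) {
            if ($in_arg) { push @args, $cur; ($cur, $in_arg) = ('', 0) }
        }
        else {
            $cur .= $c;
            $in_arg = 1;
        }
        $i++;
    }
    push @args, $cur if $in_arg;
    return @args;
}

print "[$_]\n" for split_command_line('msg username "/w sample \\"text\\" > xyz"');
```

With the sample string, the sketch yields three arguments: msg, username, and /w sample "text" > xyz. Because each program performs this split itself, a program is free to use different rules, which is why quoting that works for one executable may not work for another.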

    The second issue is that cmd.exe uses different quoting rules than the normal parsing routine. It uses caret as the escape character instead of backslash.

    The result of this is that you can't create a string that will be treated the same for both of these cases. This means you have to quote strings differently depending on if they have shell meta-characters or not. And there isn't any good way to check that without reimplementing the code to detect them that exists inside perl. So here is a routine that will quote arguments correctly to use with system:

    sub quote_list {
        my (@args) = @_;

        my $args = join ' ', map { quote_literal($_) } @args;

        if (_has_shell_metachars($args)) {
            # cmd.exe treats quotes differently from normal argument parsing.
            # just escape everything using ^.
            $args =~ s/([()%!^"<>&|])/^$1/g;
        }
        return $args;
    }

    sub quote_literal {
        my ($text) = @_;

        # basic argument quoting. uses backslashes and quotes to escape
        # everything.
        if ($text ne '' && $text !~ /[ \t\n\v"]/) {
            # no quoting needed
        }
        else {
            my @text = split '', $text;
            $text = q{"};
            for (my $i = 0; ; $i++) {
                my $bs_count = 0;
                while ( $i < @text && $text[$i] eq "\\" ) {
                    $i++;
                    $bs_count++;
                }
                if ($i > $#text) {
                    $text .= "\\" x ($bs_count * 2);
                    last;
                }
                elsif ($text[$i] eq q{"}) {
                    $text .= "\\" x ($bs_count * 2 + 1);
                }
                else {
                    $text .= "\\" x $bs_count;
                }
                $text .= $text[$i];
            }
            $text .= q{"};
        }

        return $text;
    }

    # direct port of code from win32.c
    sub _has_shell_metachars {
        my $string = shift;
        my $inquote = 0;
        my $quote = '';

        my @string = split '', $string;
        for my $char (@string) {
            if ($char eq q{%}) {
                return 1;
            }
            elsif ($char eq q{'} || $char eq q{"}) {
                if ($inquote) {
                    if ($char eq $quote) {
                        $inquote = 0;
                        $quote = '';
                    }
                }
                else {
                    $quote = $char;
                    $inquote++;
                }
            }
            elsif ($char eq q{<} || $char eq q{>} || $char eq q{|}) {
                if ( ! $inquote) {
                    return 1;
                }
            }
        }
        return;
    }

    Most of this is taken from the article Everyone quotes command line arguments the wrong way. There are probably some things Perl could do better for this.
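To make the backslash rules concrete, here is a compact regex-based sketch of the same basic argument quoting, with a few sample inputs. The name quote_arg is hypothetical, and the cmd.exe caret step is deliberately left out, so this only covers the case where no shell is involved.

```perl
use strict;
use warnings;

# Hypothetical helper: quote one argument for the conventional Windows
# command-line parser (no cmd.exe caret escaping is applied here).
sub quote_arg {
    my ($s) = @_;
    # Nothing to do for non-empty text without whitespace or double quotes.
    return $s if $s ne '' && $s !~ /[ \t\n\x0B"]/;
    $s =~ s/(\\*)"/$1$1\\"/g;   # double backslashes before a quote, escape the quote
    $s =~ s/(\\+)\z/$1$1/;      # double trailing backslashes before the closing quote
    return qq{"$s"};
}

for my $arg ('plain', 'two words', 'say "hi"', 'C:\\dir name\\', '') {
    printf "%-16s => %s\n", "[$arg]", quote_arg($arg);
}
```

The trailing-backslash case is the one most often missed: without doubling, the final backslash would escape the closing quote and the argument would run on into the next one.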
      Well... Thanks for providing detailed workarounds. But that's exactly my point: apparently the same system() call cannot be used on both platforms, Windows and Unix.

      And that's my problem. I have a piece of perl code which needs to execute an external program, and I want/need to make it portable across both platforms. It is not really satisfactory for me to have to introduce an additional 30 lines of code for the Windows part, just to catch all the possible quirks that can arise from quotes or other special characters that might be contained in the arguments to my external program.

      I had hoped that the system() call in the list form would do what the documentation says: pass the arguments _directly_ to the executable. But thanks to the discussion in this thread, I understand now that it's simply not possible on Windows. And I have to live with that (and obviously implement one workaround or another for this). Thanks again.

