Don't forget that you need to protect some special characters from shell interpretation whenever you cannot guarantee your file naming 100%. This includes files created by nice little programming or command-line errors, starting with simple oversights such as a command creating a file named "-" instead of treating it as the intended /dev/stdout alias within a pipe.
The special characters to protect include quotes, dollar signs, backticks, backslashes, whitespace and semicolons. Just placing double quotes around a string only takes care of whitespace, semicolons and single quotes.
$file='a" b ; echo "$c '; $output=`cp "$file" newfile 2>&1`; # will fail despite the double quotes
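To see why the double quotes do not help, it can be useful to print the command string that Perl hands to the shell. The following is only a small sketch; the variable name $cmd is made up for illustration, and nothing is actually executed:

```perl
use strict;
use warnings;

# Same hostile file name as in the example above.
my $file = 'a" b ; echo "$c ';

# Perl interpolates $file first; the shell only ever sees the result.
my $cmd = qq{cp "$file" newfile 2>&1};

print "$cmd\n";
# prints: cp "a" b ; echo "$c " newfile 2>&1
```

The embedded double quote closes the quoting early, so the shell parses two commands: cp "a" b, followed by an echo that the file name smuggled in.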
You can, of course, ignore this issue. That is safe to do IFF you have 100% control of your filenames and know that they contain spaces at worst: you trust all your code, all the code is error-free, there's no bitrot in DRAM or on disk, etc. pp. But in reality none of these preconditions holds, so we had better have quality, well-tested backups. Or we just use ...
Some reasonably safe work-arounds on Unix (and most of cygwin) are, in my subjective order of preference:
- place the file names in shell (environment) variables using %ENV:
@ENV{qw(f1 f2)}=($file,'newfile'); $output=`cp "\$f1" "\$f2" 2>&1`;
- use the list form of exec/system to run the command directly
without going through the shell.
For `` aka qx!!, use the 4+ argument form of open(FH,"-|","cmd","arg1",...).
- if unavoidable: quote or escape offending
characters for use in either a bare-word, single-quoted or double-quoted shell string.
First check CPAN, as there might be an OS- and shell-agnostic
module for this. If you need to do it yourself, consider e.g.
s/["`\\\$]/\\$&/g (escaping for double quotes) or
s/'/'"'"'/g (shell string concatenation for single quotes; this is more
or less what String::ShellQuote does, just w/o adding single quotes
around the string). Another example module using such regexes is ARGV::readonly,
which treats the secure-filename issue when using the
magic-and-intentionally-broken <>. Both
work on Unix/Bourne shell, and neither seems to be very OS-agnostic.
- The above is valid for Unix and most of Cygwin. On Windows, you also need to account for whatever quoting and escaping cmd.exe supports.
- In cases 2 and 3, the variables are expanded by Perl. Case 1 leaves
variable expansion to the invoked shell (which sees e.g. "$f1" in the example).
Case 3 applies the full set of shell interpolations (thus the need
to escape all characters that might be special to the shell).
Case 2 bypasses the shell and invokes the command directly.
- You may also need to prefix a leading - with ./, otherwise a command may
interpret the filename as a command option. If the command supports the special
end-of-options marker --, use that instead.
- quoting-s/hell: The line-noise generated in case 3 is what you normally see in shell scripts.
Not to mention magic nested double-quote shell quoting as in echo "$(echo "$i")" or echo "`echo "1 + 2" 3; echo x`". Perl's ability to switch quoting delimiters with q and qq offers a welcome escape to readability and sanity.
Albeit with one little ...
- ... Perlish Caveat: You also need to use shell-safe quoting for some 2- and 3-argument forms of open (e.g. open(FH,"-|",$cmd); open(FH,$cmd . "|")). This also affects the 2-argument open implied by the magic <> using @ARGV as filenames (consider this unsafe clone of the cat command: perl -0ne: *)
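The three work-arounds can be sketched roughly as follows. This is an illustration under assumptions, not a vetted implementation: the helper names capture and sh_quote are made up here, a Unix system with /bin/sh and an external printf command is assumed, and printf '%s' stands in for a real command such as cp so that the sketch has no side effects.

```perl
use strict;
use warnings;

my $file = 'a" b ; echo "$c ';   # hostile example name from above

# Case 1: pass the name through the environment. Perl does not
# interpolate \$f1; the shell expands "$f1" and does not re-parse it.
my $out1 = do {
    local @ENV{qw(f1)} = ($file);        # restored when the block ends
    `printf '%s' "\$f1"`;
};

# Case 2: list form of the pipe open; no shell is involved at all.
# (capture is a hypothetical helper name.)
sub capture {
    my @cmd = @_;
    open(my $fh, '-|', @cmd) or die "cannot run $cmd[0]: $!";
    local $/;                            # slurp all output at once
    my $out = <$fh>;
    close $fh;
    return $out;
}
my $out2 = capture('printf', '%s', $file);

# Case 3: quote by hand for a single-quoted Bourne shell string.
# (sh_quote is a hypothetical helper name; prefer a CPAN module.)
sub sh_quote {
    my ($s) = @_;
    $s =~ s/'/'"'"'/g;                   # close quote, "'", reopen quote
    return "'$s'";
}
my $cmd  = "printf '%s' " . sh_quote($file);
my $out3 = `$cmd`;

print "all three round-trip the name intact\n"
    if $out1 eq $file && $out2 eq $file && $out3 eq $file;
```

With a real command such as cp you would still need to guard against names starting with -, e.g. by passing -- before the file arguments where the command supports it.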
Related issues to keep in mind:
- ANSI or Terminal Control sequences / breaking line-based reporting:
The securely quoted/escaped filenames still make a mess when used e.g. in
die("cannot open $file\n") in two regards: 1) they're not really readable & you cannot paste them as-is due to quoting/escaping. 2) filenames (both raw & secured) are not safe for direct use on stdout, stderr and /dev/tty (consider escape sequences that break both the reporting format and your xterm/vt100 terminal settings; depending on the available terminal capabilities, this ranges from more-or-less killing the terminal, through exploitable character injection into the input stream, to command execution at worst). Logging might also profit from filtering at least newlines embedded in filenames (replace them with e.g. '?'). So you had better use a second regex/function to smash non-printable chars in e.g. error reporting, die, carp, ...
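Such a second function could look roughly like this. It is a minimal sketch with a made-up name (printable_name), and it assumes that replacing everything outside printable ASCII with '?' is acceptable; that also mangles legitimate UTF-8 names, so a real implementation may want to be more selective.

```perl
use strict;
use warnings;

# Hypothetical helper: make a file name safe for display only.
# The result is NOT suitable for opening the file again.
sub printable_name {
    my ($name) = @_;
    (my $safe = $name) =~ s/[^\x20-\x7e]/?/g;   # ESC, newline, etc. become '?'
    return $safe;
}

my $file = "evil\e[2Jname\nsecond line";
print "cannot open ", printable_name($file), "\n";
# prints: cannot open evil?[2Jname?second line
```

Keep the raw name for the actual open/unlink/rename calls and apply the filter only at the point where the name is written to a terminal or log.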
Some of the App:: or Logging:: modules & frameworks might offer some support for these concerns as well. If so: Which of these do you trust and suggest for both security as well as cutting down on boilerplate code?