Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

At work, we currently run a motley mix of Perl, Unix shell and Windows .BAT scripts. I'm putting together a case that Perl should almost always be preferred to Unix shell (and DOS batch). Here's why.

Writing portable shell scripts is hard. Very hard. After all, you must run a variety of external commands to get the job done. Is each external command available on each of our platforms? And does it support the same command line options? Is the behaviour the same on each platform? (echo, awk/nawk/mawk/gawk, grep/egrep, find/xargs, and ps are some example commands that spring to mind whose behaviour varies between different Unix flavours). Many of these portability concerns disappear when you write a Perl script because most of the work is done, not by external commands, but by Perl internal functions and modules. It's easier to port a shell than a shell script. :-) Note that we port the same Perl version to all our Unix and Windows boxes, and so don't experience any nasty Perl version portability issues.

Shell scripts tend to be fragile. Running commands and scraping their output is an inherently fragile technique, breaking when the format of the command output changes. Moreover, unlike Perl, shell scripts are not compiled when the script loads, so syntax errors may well be lurking down unexercised code paths: Perl catches syntax errors at compile time; shell catches them at run time. Notice that syntax errors can appear later when the script is presented with different data: test $fred -eq 42, for example, fails with a syntax error at run time if $fred contains whitespace or is empty (BTW, to avoid that, this code should be written as test "$fred" -eq 42). Finally, shell scripts can easily break if the global environment changes, especially the PATH environment variable. Even if the value of PATH itself doesn't change, a change to any file on the PATH has the potential to break a shell script (I remember one such case where GNU tar was mistakenly installed in a directory ahead of the system tar on the PATH on one of our AIX boxes).

Shell scripts, being interpreted and often having to create new processes to run external commands to do work, tend to be slow.

Shell scripts tend to be insecure. Running an external command in a global environment is inherently less secure than calling an internal function. Which is why most Unices do not allow a shell script to be setuid. And shell doesn't have an equivalent of Perl's taint mode, to handle untrusted, potentially malicious, script data.

Error handling and reporting tends to be more robust in Perl. For example, $! set by a failing Perl built-in function provides more reliable and specific error detail than $? set by a failing external command (especially when the external command is undisciplined in setting $? and in writing useful, detailed and regular error messages to stderr). Moreover, the ease of using the common Perl "or die" idiom and the newer autodie pragma tends to make Perl scripts more robust in practice; for example, chdir($mydir) or die "error: chdir '$mydir': $!" is commonly seen in Perl scripts, yet I've rarely seen shell scripts similarly check cd somedir and fail fast if the cd fails.

As a programming language, shell is primitive. Shell does not have namespaces, modules, objects, inheritance, exception handling, complex data structures and other language facilities and tools (e.g. Perl::Critic, Perl::Tidy, Devel::Cover, Devel::NYTProf, Pod::Coverage, ...) needed to support programming in the large.

Finally, shell has fewer reusable libraries available. Shell has nothing comparable to Perl's CPAN.

Even if the script is small, don't write it in shell unless you're really sure that it will remain small. Small scripts have a way of growing into larger ones, and you don't want the unproductive chore of converting thousands of lines of working shell script to Perl, risking breaking a functioning system in the process. Avoid that future pain by writing it in Perl to begin with.

Notice that the above is arguing against Unix shell scripts, not specifically for Perl. Indeed, the same essential arguments apply equally to Python and Ruby as they do to Perl. However, I view Perl, Python and Ruby as essentially equivalent and, given our existing significant investment in Perl, don't see a strong business case for switching languages. I see an even weaker business case for using more than one of Perl/Python/Ruby because that dilutes the company's already overstretched skill base.

I'm further trying to come up with a checklist of when it is ok for work folks to write their scripts in Unix shell:

  • If the script must run in a Unix environment in which Perl is not available.
  • If you "know" the script will always remain very simple and short, less than around 20 lines of code.
  • If process startup time is significant. Note that shell has a lower startup cost than Perl.
  • If you're sure the script will "never" need to be ported to Windows.

I might add that I personally enjoy Unix shell scripting and have written a lot of Unix shell scripts over the years. It's just that I feel the argument for Perl over shell is overwhelming. Feedback on the above is welcome.


References Added Later

Updated 23-feb-2008: Improved wording. Mentioned Perl's taint mode. Clarified Perl (compiled) v shell (interpreted). 26-feb-2008: Added References section. 30-sep-2008: Added libraries (CPAN) paragraph. 4-oct-2010: Added error handling paragraph. 2017: Added "References Added Later" section.

In reply to Unix shell versus Perl by eyepopslikeamosquito

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and all is quiet...

    How do I use this? | Other CB clients
    Other Users?
    Others rifling through the Monastery: (6)
    As of 2018-06-24 20:50 GMT
    Find Nodes?
      Voting Booth?
      Should cpanminus be part of the standard Perl release?

      Results (126 votes). Check out past polls.