Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid


by root (Scribe)
on Dec 23, 1999 at 00:53 UTC ( #1254=perlfunc: print w/replies, xml ) Need Help??


See the current Perl documentation for Text::ParseWords.

Here is our local, out-dated (pre-5.6) version:

Text::ParseWords - parse text into an array of tokens or array of arrays

  use Text::ParseWords;
  @lists = &nested_quotewords($delim, $keep, @lines);
  @words = &quotewords($delim, $keep, @lines);
  @words = &shellwords(@lines);
  @words = &parse_line($delim, $keep, $line);
  @words = &old_shellw

The &nested_quotewords() and &quotewords() functions accept a delimiter (which can be a regular expression) and a list of lines and then breaks those lines up into a list of words ignoring delimiters that appear inside quotes. &quotewords() returns all of the tokens in a single long list, while &nested_quotewords() returns a list of token lists corresponding to the elements of @lines. &parse_line() does tokenizing on a single string. The &*quotewords() functions simply call &parse_lines(), so if you're only splitting one line you can call &parse_lines() directly and save a function call.

The $keep argument is a boolean flag. If true, then the tokens are split on the specified delimiter, but all other characters (quotes, backslashes, etc.) are kept in the tokens. If $keep is false then the &*quotewords() functions remove all quotes and backslashes that are not themselves backslash-escaped or inside of single quotes (i.e., &quotewords() tries to interpret these characters just like the Bourne shell). NB: these semantics are significantly different from the original version of this module shipped with Perl 5.000 through 5.004. As an additional feature, $keep may be the keyword ``delimiters'' which causes the functions to preserve the delimiters in each string as tokens in the token lists, in addition to preserving quote and backslash characters.

&shellwords() is written as a special case of &quotewords(), and it does token parsing with whitespace as a delimiter-- similar to most Unix shells.


The sample program:

  use Text::ParseWords;
  @words = &quotewords('\s+', 0, q{this   is "a test" of\ quotewords \"for you});
  $i = 0;
  foreach (@words) {
      print "$i: <$_>\n";


  0: <this>
  1: <is>
  2: <a test>
  3: <of quotewords>
  4: <"for>
  5: <you>


  1. a simple word
  2. multiple spaces are skipped because of our $delim
  3. use of quotes to include a space in a word
  4. use of a backslash to include a space in a word
  5. use of a backslash to remove the special meaning of a double-quote
  6. another simple word (note the lack of effect of the backslashed double-quote)

Replacing &quotewords('\s+', 0, q{this is...}) with &shellwords(q{this is...}) is a simpler way to accomplish the same thing.


Maintainer is Hal Pomeranz <>, 1994-1997 (Original author unknown). Much of the code for &parse_line() (including the primary regexp) from Joerk Behrends <>

Examples section another documentation provided by John Heidemann <johnh@ISI.EDU>

Bug reports, patches, and nagging provided by lots of folks-- thanks everybody! Special thanks to Michael Schwern <> for assuring me that a &nested_quotewords() would be useful, and to Jeff Friedl <> for telling me not to worry about error-checking (sort of-- you had to be there).

Log In?

What's my password?
Create A New User
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (4)
As of 2017-06-28 09:50 GMT
Find Nodes?
    Voting Booth?
    How many monitors do you use while coding?

    Results (631 votes). Check out past polls.