http://www.perlmonks.org?node_id=287545


in reply to Tutorial suggestion: split and join

It is not only the intermediate elements. The effect of a delimiter is also felt on the empty elements at the beginning and the end of the source string.

Consider the following examples:

#!/usr/bin/perl -w
use strict;

my $line = "  Bart  Lisa Maggie Marge Homer  ";
# notice the leading and trailing spaces
my @simpsons;
for ( " ", '\s', '\s+' ) {
    print "delimiter /$_/\n";
    @simpsons = split ( /$_/, $line );
    print map {"<$_>"} @simpsons;
    print $/;
}
print "delimiter ' '\n";
@simpsons = split ( ' ', $line );
print map {"<$_>"} @simpsons;
print $/;

__END__
delimiter / /
<><><Bart><><Lisa><Maggie><Marge><Homer>
delimiter /\s/
<><><Bart><><Lisa><Maggie><Marge><Homer>
delimiter /\s+/
<><Bart><Lisa><Maggie><Marge><Homer>
delimiter ' '
<Bart><Lisa><Maggie><Marge><Homer>

If you want to split a string on whitespace and you don't want the empty elements, the best choice is a simple quoted space (not a regex) as the delimiter, as the last example shows.
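The listing above never shows empty elements at the end, because split drops trailing empty fields unless it is given a negative LIMIT. Here is a minimal sketch of that effect, assuming a short test string of my own; the show helper and the variable names are hypothetical and only serve the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): wrap each field in <>
# so that empty fields stay visible.
sub show { return join '', map { "<$_>" } @_ }

# Two leading, two inner and two trailing spaces (an assumed test string).
my $line = "  a  b  ";

# Default LIMIT: leading empty fields are kept, trailing ones are dropped.
my $default = show( split( / /, $line ) );        # <><><a><><b>

# Negative LIMIT: trailing empty fields are kept as well.
my $keep_all = show( split( / /, $line, -1 ) );   # <><><a><><b><><>

# The ' ' special case skips leading whitespace even with a negative LIMIT;
# a single empty field remains after the trailing whitespace.
my $awk_like = show( split( ' ', $line, -1 ) );   # <a><b><>

print "$_\n" for $default, $keep_all, $awk_like;
```

So the trailing spaces do produce empty fields; split just throws them away by default.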

From perldoc -f split:

As a special case, specifying a PATTERN of space ("' '") will split on white space just as "split" with no arguments does. Thus, "split(' ')" can be used to emulate awk's default behavior, whereas "split(/ /)" will give you as many null initial fields as there are leading spaces. A "split" on "/\s+/" is like a "split(' ')" except that any leading whitespace produces a null first field. A "split" with no arguments really does a "split(' ', $_)" internally.
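The three claims in that passage can be checked side by side. This is only a sketch: the wrapper subs, the show helper and the sample string are hypothetical names chosen for the illustration, not anything from perldoc.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: wrap each field in <> so that empty fields stay visible.
sub show { return join '', map { "<$_>" } @_ }

# Hypothetical wrappers around the three forms discussed in the quote.
sub split_bare  { local $_ = shift; return split; }             # no arguments
sub split_space { my $s = shift;    return split ' ', $s; }     # awk-like
sub split_regex { my $s = shift;    return split /\s+/, $s; }   # plain regex

# Assumed sample with leading and trailing whitespace.
my $s = "  a b c ";

print "bare split : ", show( split_bare($s) ),  "\n";   # <a><b><c>
print "split ' '  : ", show( split_space($s) ), "\n";   # <a><b><c>
print "split /\\s+/: ", show( split_regex($s) ), "\n";   # <><a><b><c>
```

The first two lines of output match, and only the /\s+/ form shows the null first field that the documentation mentions.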

Update: If you want to document the above behavior, you can use B::Deparse.

perl -MO=Deparse -e '$_=" a b c ";print map {"<$_>"} split'
$_ = ' a b c ';
print map({"<$_>";} split(" ", $_, 0));

However, this works in Perl 5.8.0 but not in 5.6.1 (in 5.6.1 the output of the one-liner is correct, but the deparsed code is not). Apparently, there was a bug that was recently fixed. Thanks to diotalevi for his useful analysis in this matter.

 _  _ _  _  
(_|| | |(_|><
 _|