$ perl -le 'print join "|", split /(\W)/, "Hello, World! ?";'
Hello|,|| |World|!|| ||?
$ perl -le 'print join "|", split /\W/, "Hello, World! ?";'
Hello||World
Think of it like a really simple CSV file*. Your separator character is \W. To make it easier to think about, replace all \W with a comma, and your input string is "Hello,,World,,,".
split /,/, "Hello,,World,,," gives you the list "Hello", "", "World" (trailing empty fields are stripped as documented).
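What you're asking split to do when you say split /(,)/, "Hello,,World,,," is keep the separator character in the list of return values. As a side effect, the empty fields between separators are no longer trailing (each one is followed by a captured separator), so they aren't stripped; only the final empty field after the last comma is. Hence: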
$ perl -e 'print join("|",split(/(\W)/,"Hello,,World,,,")),"\n";'
Hello|,||,|World|,||,||,
* Just for the sake of discussion, we all know we should be using Text::CSV instead of split ;-)
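As a small sketch of the "trailing empty fields are stripped" point above: split takes an optional third argument, LIMIT, and a negative LIMIT keeps the trailing empty fields. The variable names here are only for illustration.

```perl
use strict;
use warnings;

my $str = "Hello,,World,,,";

# Default behaviour: trailing empty fields are dropped.
my @default = split /,/, $str;

# Negative LIMIT: trailing empty fields are kept.
my @all = split /,/, $str, -1;

print join("|", @default), "\n";   # Hello||World
print join("|", @all), "\n";       # Hello||World|||
print scalar(@default), " vs ", scalar(@all), " fields\n";   # 3 vs 6 fields
```

So if the stripped fields matter to you, the -1 form is probably what you want; the default is just a convenience.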
"Is there an explanation, why split() behaves like this": Yes, it's documented that split behaves that way. split cuts a string apart into pieces, and it returns a piece even if there is nothing between two separators.
Read perldoc -f split and consider this:
use Data::Dump qw/ dd /;
dd( split /\D/, q/12Q34/ );
dd( split /\D/, q/12ab34/ );
dd( split /(\D)/, q/12ab34/ );
__END__
(12, 34)
(12, "", 34)
(12, "a", "", "b", 34)
Q is the non-digit between 12 and 34, so it is cut out
the empty string "" is what sits between the non-digits a and b
the same empty string "" sits between a and b, but a and b are preserved (captured), not discarded
split cuts a string apart, discarding the cut pieces unless you (keep) them
What is your expected output? Is it what you're showing in your second and third example?
Regarding grep {$_} ...: that will also filter out the string "0", since that's false in Perl. You may find grep {length} ... better.
However, personally, I like your third example best, because I feel like it expresses best what you want the output to be, but of course There's More Than One Way To Do It :-)
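To make the grep {$_} versus grep {length} remark above concrete, here is a minimal sketch. The input string is made up for illustration; it just needs to contain a "0" field and some empty ones.

```perl
use strict;
use warnings;

# Hypothetical input: "0" is a real field, the empty strings are not.
my @fields = split /,/, "0,,1,,a";   # ("0", "", "1", "", "a")

my @truthy   = grep { $_ } @fields;       # drops "" and also "0"
my @nonempty = grep { length } @fields;   # drops only ""

print join("|", @truthy), "\n";     # 1|a
print join("|", @nonempty), "\n";   # 0|1|a
```

The difference only shows up when "0" is a legitimate value in your data, which is exactly the case where it tends to bite.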