Odd workings of split

by Ralesk (Pilgrim)
on Mar 29, 2012 at 12:33 UTC (#962366)
Ralesk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I've run into a strange little issue. I'm communicating with another piece of software (Linphone, actually) whose command protocol is line-based. To make sure there are no newlines in any of the commands, I decided to split and take the first (0th) element and just go with it, discarding the rest, instead of using a regex to do it.

Later on, I figured I'd allow more arguments to the method, joining them with spaces, so I can just pass a list when calling it (and so I don't have to concatenate variables into the command string all the time).

Long story short, here's some code:

sub cmd($@) {
    my $self = shift;
    my @cmd = @_;

    ## Make sure none of them contain newlines
    for (0 .. $#cmd) {
        $cmd[$_] = [split "\n", $cmd[$_]]->[0];
    }

    ## Join words with spaces and send them to linphonec
    print {$self->{Writer}} join(' ', @cmd) . "\n";
}

Calling it with things like $linphone->cmd("command"), $linphone->cmd("legacy command string I forgot to update\n") or $linphone->cmd("command", $argument1, $argument2) appears to work fine, as split shall return whatever is before the newline and all arguments are happy.

Calling it with $linphone->cmd('') however returns undef, which I feel is a very nasty thing for split to do.

Not sure if I actually have questions; this is probably a known caveat of split, but of course if there's a way to coerce it to do the right thing, I'm all ears. Until then, I'll just // '' things or something like that.

Replies are listed 'Best First'.
Re: Odd workings of split
by moritz (Cardinal) on Mar 29, 2012 at 12:49 UTC

    A tiny correction: split doesn't return undef, it returns the empty list. Accessing [ (empty list) ]->[0] is what produces the undef.

    The documentation of split also mentions this case:

    Note that splitting an EXPR that evaluates to the empty string always returns the empty list, regardless of the LIMIT specified.
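
A minimal sketch of the distinction drawn above, using nothing beyond core Perl. It counts the fields split returns for an empty string and then reads element 0 of the resulting anonymous array; the variable names are only for illustration.

```perl
use strict;
use warnings;

# split on an empty string yields an empty list, not a list holding ''.
my @parts = split "\n", '';
my $count = scalar @parts;    # number of fields returned

# Wrapping that empty list in [ ] and reading ->[0] asks for a missing element.
my $first = [ split "\n", '' ]->[0];

# For comparison, a non-empty string behaves as expected.
my $normal = [ split "\n", "command\nrest" ]->[0];

printf "fields: %d, first defined: %s, normal: %s\n",
    $count, (defined $first ? 'yes' : 'no'), $normal;
```

So the undef comes from the array lookup, not from split itself.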

    The problem is really that split is used for too many things, and it tries to cater to all needs.

    A different way to approach this is not to ask what to split on, but ask what you want to extract. For example you can write

    $cmd[0] = ($cmd[0] =~ /.*/g)[0]

    where .* matches every character up to (but excluding) the first newline.

    Or explicitly state what you want removed:

    $cmd[0] =~ s/\n.*//s;
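
As a rough side-by-side of the two regex approaches above, here is a small sketch. The helper names first_line_match and first_line_subst are hypothetical, chosen only for this example; both are meant to keep everything before the first newline and to return '' (not undef) for an empty string.

```perl
use strict;
use warnings;

# Hypothetical helper: extract what we want to keep.
sub first_line_match {
    my ($text) = @_;
    # In list context /.*/g returns every match; the first one is the
    # text before any newline (possibly the empty string).
    return ($text =~ /.*/g)[0];
}

# Hypothetical helper: remove what we do not want.
sub first_line_subst {
    my ($text) = @_;
    # Delete from the first newline to the end; /s lets . match newlines.
    $text =~ s/\n.*//s;
    return $text;
}
```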

    In Perl 6, split is less magical, and you are encouraged to write patterns that match what you're after (and not patterns that match the separator, so that still works):


    I'm not sure what else to write, since you didn't really have a question :-)

      A tiny correction: split doesn't return undef, it returns the empty list. Accessing [ (empty list) ]->[0] is what produces the undef.

      Ahhhhhhhhhhhhhh, that makes sense! Thanks!

Re: Odd workings of split
by tobyink (Abbot) on Mar 29, 2012 at 13:42 UTC

    Your question has already been answered, but I'd like to point out that this:

    $cmd[$_] = [split "\n", $cmd[$_]]->[0];

    Might be better as:

    $cmd[$_] = [split "\n", $cmd[$_], 2]->[0];

    Setting the third parameter to "2" indicates that you don't want split to return any more than 2 items. So if your string is, say, 8 lines long, split will return a list containing two strings: the first line, then all the other lines. This saves Perl a bit of work splitting up a portion of the string that you're uninterested in.
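
To make the effect of the limit concrete, here is a short sketch with an eight-line string; the sample text is made up for the example.

```perl
use strict;
use warnings;

# Build a sample string of eight lines: "line 1\nline 2\n...line 8".
my $text = join "\n", map { "line $_" } 1 .. 8;

my @all = split "\n", $text;       # no limit: one field per line
my @two = split "\n", $text, 2;    # limit 2: first line, then the remainder

printf "without limit: %d fields, with limit: %d fields\n",
    scalar @all, scalar @two;
```

With the limit, the second field still contains the remaining seven lines joined by newlines; only the first field has been separated out.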

    Alternatively, if you use list assignment, like this:

    ($cmd[$_]) = split "\n", $cmd[$_];

    ... then Perl is actually smart enough to figure out the ",2" itself. Take a peek at:

    perl -MO=Deparse -e'($cmd[$_]) = split "\n", $cmd[$_];'
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

      Yeah, had the limit argument in mind, but forgot to type it in. Good call.

      And it's nice to know Perl is smart enough to optimise things out like that!

Re: Odd workings of split
by morgon (Curate) on Mar 29, 2012 at 12:50 UTC
    From "perldoc -f split":
    Note that splitting an EXPR that evaluates to the empty string always returns the empty list.
    So it's not split that returns an undef, but the first element of the empty list that split returns (which is what you want to use) is undef.

    I guess if you want to pass on an empty string (as opposed to undef) in such cases you would have to add a clause such as

    $cmd[$_] //= "";
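
A minimal sketch of that clause inside the original loop, assuming Perl 5.10 or later (where the defined-or operator is available). The sample arguments are made up for illustration.

```perl
use strict;
use warnings;

# Sample arguments: a plain word, an empty string, and one with a newline.
my @cmd = ('answer', '', "dial 123\nignored");

for my $i (0 .. $#cmd) {
    $cmd[$i] = [ split "\n", $cmd[$i], 2 ]->[0];
    $cmd[$i] //= '';    # replace undef (from the empty list) with ''
}

print join('|', @cmd), "\n";
```
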
Re: Odd workings of split
by JavaFan (Canon) on Mar 29, 2012 at 12:58 UTC
    Another workaround, specially geared towards the oddity that split returns an empty list:
    $cmd[$_] = [split("\n", $cmd[$_]), ""]->[0];
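
The same idea as a small sketch: because an empty string is appended after split's results, element 0 always exists. The helper name first_field is hypothetical. Unlike //=, this form does not need Perl 5.10.

```perl
use strict;
use warnings;

# Hypothetical helper name, for illustration only.
sub first_field {
    my ($text) = @_;
    # The trailing '' guarantees an element 0 even when split returns nothing.
    # The parentheses around split's arguments keep '' from being read as LIMIT.
    return [ split("\n", $text), '' ]->[0];
}
```
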
Re: Odd workings of split
by Jenda (Abbot) on Mar 30, 2012 at 09:30 UTC

    One more, kinda unrelated, point. In Perl, unlike say in C#, the foreach loop variable is an alias to the list element and MAY be modified. So there is no reason to bother with the indices:

    for (@cmd) { $_ = [split "\n", $_]->[0]; }

    Next thing ... you don't have to create an anonymous reference and then dereference it to get the first element from a list returned by a subroutine. A pair of parentheses is enough:

    for (@cmd) { $_ = (split "\n", $_)[0]; }

    And last there really is no reason to avoid regexps. They tend to make the code simpler.

    for (@cmd) { s/\n.*$//s; }
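
Putting these points together, one possible shape for the command builder is sketched below. build_command is a hypothetical stand-in for the body of the original method: it returns the line instead of printing it to a filehandle, so it can be tried in isolation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the method body; returns the line to send.
sub build_command {
    my @cmd = @_;    # copy, so the caller's arguments are left untouched

    # $_ aliases each element of @cmd, so the substitution edits the
    # array in place and cuts each word at its first newline.
    s/\n.*//s for @cmd;

    return join(' ', @cmd) . "\n";
}

print build_command('command', 'arg1', "arg2\nextra");
```

An empty-string argument stays an empty string here, since the substitution simply does not match.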

    Enoch was right!
    Enjoy the last years of Rome.

      Nice summary, thanks :)

      Re your 2nd code snippet, I swear I've run into Perl (5.10) complaining about )[ before, but I have no idea anymore what I did to make it not like it. Your code does work though... Much to my surprise really.

        The important thing is that the () is around the whole subroutine call, not just around the parameters. split("\n", $cmd[$_])[0] would not work.
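
A small sketch of that difference. The failing form is compiled through a string eval so that the syntax error can be observed without stopping the script; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $line = "first\nsecond";

# A list slice needs parentheses around the whole expression being sliced.
my $ok      = (split "\n", $line)[0];
my $also_ok = (split("\n", $line))[0];

# Parentheses around the arguments only: this does not compile.
my $compiled = eval q{ my $bad = split("\n", $line)[0]; 1 };
my $error    = $@;

print "slice forms give '$ok' and '$also_ok'\n";
print "argument-only parentheses failed to compile\n" unless $compiled;
```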


Re: Odd workings of split
by Anonymous Monk on Mar 29, 2012 at 13:03 UTC
    It seems odd that, if what you want to do is "remove newlines," you do not use the obvious regex to do that.   If you want to "take the command up to the first newline, if any," split will work fine in a short subroutine, with an if-statement to supply an empty string for undef.   Two (gasp!) lines of code ... problem solved.

      Sometimes I'm in a regexy mood and would do a lot of things with regexes; I was feeling oddly anti-regex when I wrote this, hence my preference for split's 0th result. It's also an... obvious way, I guess :)

Re: Odd workings of split
by flexvault (Monsignor) on Mar 29, 2012 at 12:48 UTC


    Some things jump out, but take a look at 'quotemeta'. It might help!

    "Well done is better than well said." - Benjamin Franklin

      Care to explain how quotemeta is going to help? Or are you just spouting the first thing that comes to mind, to have a first post?


        Originally, I did look at the post quickly, and you are correct that 'quotemeta' would not help. Since there was no output shown, I thought incorrectly that maybe the code was getting into a regex problem with special characters.

        After your response, I took the time to download the code onto a Samba share, log in to that machine, and then ftp the 12 lines of code to the machine that I was working on, so that I could see what was going on. I added some code to call the sub and then realized he was calling the sub with an empty string (''), but it still worked for me (it returned ''). I couldn't get the 'split' to return 'undef', but that could be OS or Perl version related.

        But I didn't notice I was first, which would be very rare for me anyway!

        So is it somehow better to be first?

