http://www.perlmonks.org?node_id=11118420

bojinlund has asked for the wisdom of the Perl Monks concerning the following question:

    use strict;
    use warnings;
    use 5.010;

    # from documentation:
    # ... and each of these:
    print join(':', split(//, 'abc', 3)), "\n";
    print join(':', split(//, 'abc', 4)), "\n";
    # produces the output a:b:c .

    say $^V;
    say $^O;

    __DATA__
    output:
    a:b:c
    a:b:c:
    v5.30.1

The second call produces an extra empty trailing field. I am using Windows 10.

Replies are listed 'Best First'.
Re: Function Split, bug or error in the documentation?
by haukex (Archbishop) on Jun 24, 2020 at 13:08 UTC

    AFAICT with a quick check, Perl has always behaved as your code shows. It appears this documentation bug was introduced with a rewrite of the split manpage for 5.16 in bd46758. The previous version did not have this issue, it said:

    Empty trailing fields, on the other hand, are produced when there is a match at the end of the string (and when LIMIT is given and is not 0), regardless of the length of the match. For example:
        print join(':', split(//, 'hi there!', -1)), "\n";
        print join(':', split(/\W/, 'hi there!', -1)), "\n";
    produce the output 'h:i: :t:h:e:r:e:!:' and 'hi:there:', respectively, both with an empty trailing field.

    You should definitely open a bug report against perlfunc.pod.
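The LIMIT behaviour discussed above can be checked with a small script. This is only a sketch: the helper name `fields` is made up for illustration, and the expected values in the comments are the ones reported in this thread and in the older documentation text quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: join fields with ':' so that an empty trailing
# field becomes visible as a trailing ':'.
sub fields { return join ':', @_ }

my $no_limit = fields( split( //,   'abc' ) );            # LIMIT omitted
my $limit_3  = fields( split( //,   'abc', 3 ) );         # at most 3 fields
my $limit_4  = fields( split( //,   'abc', 4 ) );         # room for one more field
my $negative = fields( split( //,   'abc', -1 ) );        # negative LIMIT
my $nonword  = fields( split( /\W/, 'hi there!', -1 ) );  # match at end of string

print "$no_limit\n";    # a:b:c
print "$limit_3\n";     # a:b:c
print "$limit_4\n";     # a:b:c:
print "$negative\n";    # a:b:c:
print "$nonword\n";     # hi:there:
```

With LIMIT omitted, empty trailing fields are stripped. With a LIMIT of 3, the string is used up after three fields, so there is no empty trailing field to produce. A LIMIT of 4 or a negative LIMIT keeps it.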

Re: Function Split, bug or error in the documentation?
by soonix (Canon) on Jun 24, 2020 at 12:54 UTC
    Same here (v5.26.1-1, Linux). I assume it is documented in this part of split:
    An empty trailing field, on the other hand, is produced when there is a match at the end of EXPR, regardless of the length of the match (of course, unless a non-zero LIMIT is given explicitly, such fields are removed, as in the last example).
Re: Function Split, bug or error in the documentation?
by davies (Prior) on Jun 24, 2020 at 15:30 UTC
Re: Function Split, bug or error in the documentation?
by bojinlund (Monsignor) on Jul 04, 2020 at 11:15 UTC

    I am trying to understand split and its documentation. To do this, I have implemented split using m{} and the variables @- and @+. In doing so, I have found some problems.

    This script shows some of them:

        use strict;
        use warnings;
        use 5.010;
        use Data::Dump qw(dump dd ddx);

        sub split_e { }

        split_e( // );
        # Warning: Use of uninitialized value $_ in pattern match (m//) at pm_1.pl ...

        my @rv = split( //, '' );
        # Warning: none

        my $str = '1-10,20';
        my $pat = '(-)|(,)';

        @rv = split( $pat, $str );
        warn dump @rv;    # (1, "-", undef, 10, undef , ",", 20)

        @rv = $str =~ m{$pat}g;
        warn dump @rv;    # ("-", undef, undef, ",")

        while ( my $rv = $str =~ m{$pat}gc ) {
            for my $ix ( 0 .. 99 ) {
                if ( defined $-[$ix] ) {
                    say sprintf '$ix= %d (%d,%d)<%s>', $ix, $-[$ix], $+[$ix],
                      substr $str, $-[$ix], $+[$ix] - $-[$ix];
                }
            }
        }

        say "Strawberry Perl $^V";
        say $^O;

        __DATA__
        output:
        $ix= 0 (1,2)<->
        $ix= 1 (1,2)<->
        $ix= 0 (4,5)<,>
        $ix= 2 (4,5)<,>
        Strawberry Perl v5.30.1
        MSWin32

    Questions

    Magic in split

    The call split_e( // ); gives the warning: Use of uninitialized value $_ in pattern match (m//) at pm_1.pl ... but split( //, '' ); does not.

    Is there some magic in split, indicated by the ‘/’s in “/PATTERN/” in the documentation?

    Bug in the variables @- and @+

    I find that the results from split, from m{}, and from using $-[$ix] and $+[$ix] are inconsistent.

    Split gives (1, "-", undef, 10, undef , ",", 20).

    m{} gives ("-", undef, undef, ",").

    Using $-[$ix] and $+[$ix] does not include the undefs.

    Is this a bug in the last case?
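One possible reading of the difference, going by the perlvar description of @- and @+: $-[n] is undef for a group that did not take part in the match, so the `defined $-[$ix]` test in the loop above skips exactly those groups. The sketch below rebuilds the capture list from the offsets and keeps the undefs. It assumes that `$#+` gives the number of groups in the last successful match, as perlvar describes; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $str = '1-10,20';
my $pat = qr/(-)|(,)/;

# Collect one list of captures per match, built only from @- and @+.
my @rows;
while ( $str =~ /$pat/g ) {
    my @caps;
    for my $ix ( 1 .. $#+ ) {    # every group in the pattern
        push @caps, defined $-[$ix]
            ? substr( $str, $-[$ix], $+[$ix] - $-[$ix] )
            : undef;             # group did not participate in this match
    }
    push @rows, \@caps;
}

# Print each match's captures, showing undef explicitly.
for my $caps (@rows) {
    print join( ', ', map { defined $_ ? qq{"$_"} : 'undef' } @$caps ), "\n";
}
# "-", undef
# undef, ","
```

Read together, the two rows give the same list as the m{$pat}g call in list context: ("-", undef, undef, ",").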

      split_e( // ), short for split_e( $_ =~ m// ), performs a match operation. You want split_e( qr// ).

      split is an operator. And like all operators, it has full control over the syntax of its operands. split /.../, ... is functionally equivalent to split qr/.../, .... Keep in mind that split predates qr//.
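The difference between a bare pattern and qr// as a subroutine argument can be sketched as follows. The body of `split_e` here is a made-up stand-in for the empty stub in the post: it only reports what kind of argument it received.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the split_e stub: report the argument type.
sub split_e {
    my ($pattern) = @_;
    return ref($pattern) eq 'Regexp' ? 'compiled regex' : 'match result';
}

$_ = 'a,b,c';    # a bare /,/ in an argument list matches against $_

my $bare = split_e( /,/ );      # the match runs first; its result is passed
my $qr   = split_e( qr/,/ );    # the pattern itself is passed

print "$bare\n";    # match result
print "$qr\n";      # compiled regex

# split accepts either form as its first operand.
my $re     = qr/,/;
my @from_m = split /,/, 'a,b,c';
my @from_q = split $re, 'a,b,c';

print join( ':', @from_m ), "\n";    # a:b:c
print join( ':', @from_q ), "\n";    # a:b:c
```

An ordinary subroutine never sees the pattern when it is written as /.../, only the outcome of matching it against $_, which is why the warning about an uninitialized $_ appears for split_e( // ) but not for split.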