http://www.perlmonks.org?node_id=206001

Lars Povlsen has asked for the wisdom of the Perl Monks concerning the following question:

Grrrretings! In a script of mine, I've used
split(/[\s,]+/, $x)
to split input lines consisting of word lists like
aaa bbb cc
aa, bb, cc
Unfortunately, this now breaks when running on RH8.0 (perl 5.8.0), producing a "Split loop" error.

I have tracked the error down: it is provoked by any regexp containing a character class ([]) when used with split.

I am somewhat baffled, but if I am doing something wrong, please let me know...

PS: I have a workaround, but I am more interested in why the split is in err...!

Replies are listed 'Best First'.
Re: Split loop error with perl 5.8
by grinder (Bishop) on Oct 17, 2002 at 13:23 UTC

    Can you provide the exact code that produces the error? For example, the following code...

    #! /usr/bin/perl -w
    use strict;

    my $x1 = 'aaa bbb cc';
    my $x2 = 'aa, bb, cc';
    print join( '-', split(/[\s,]+/, $x1)), "\n";
    print join( '-', split(/[\s,]+/, $x2)), "\n";

    ...produces the following output:...

    aaa-bbb-cc
    aa-bb-cc

    So you must be doing something else (wrong)?

    Oops, forgot to add, this is with 5.8.0 (albeit not redhat), to wit:

    % perl -v
    This is perl, v5.8.0 built for i386-freebsd-64int

    Maybe they have applied some funky patches?


    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

      I just had the same problem...

      [root@gw /]# uname -a
      Linux gw 2.4.22 #8 SMP Wed Sep 24 13:53:53 EDT 2003 i686 i686 i386 GNU/Linux
      [root@gw /]# perl -v
      This is perl, v5.8.0 built for i386-linux-thread-multi
      Copyright 1987-2002, Larry Wall
      [root@gw /]# cat test
      #!/usr/bin/perl
      $file = "/proc/net/snmp";
      $ABC = 0;
      while ( $i ne 0 ) {
          &GET;
          $RATE = $FIELDS[3] - $ABC;
          $ABC = "$FIELDS[3]";
          print "$RATE\n";
      }
      sub GET {
          open (FILE, $file);
          @LINES = <FILE>;
          close (FILE);
          $LINE = $LINES[1];
          @FIELDS = split(/ /, $LINE);
          sleep 1;
      }

      *NOTE - I had to remove the \s. It looks like the perl 5.8.0 that ships with Red Hat 8.0 does not like splitting on a space using the \s RE.

      janitored by ybiC: add balanced <code> tags around code block, as per Monastery convention

Re: Split loop error with perl 5.8
by giulienk (Curate) on Oct 17, 2002 at 13:34 UTC
    It works for me exactly as for grinder. My version:
    julio@debian:~/perl$ perl  -v
    
    This is perl, v5.8.0 built for i386-linux-thread-multi
    


    $|=$_="1g2i1u1l2i4e2n0k",map{print"\7",chop;select$,,$,,$,,$_/7}m{..}g

Re: Split loop error with perl 5.8
by argathin (Acolyte) on Oct 17, 2002 at 14:55 UTC
    This is a very long shot, but here we go anyway: One thing that immediately crossed my mind is the fact that Red Hat changed the default locale to UTF8, I think. I wouldn't know whether this can/should/would have any influence on the problem you mentioned, but maybe it's something to check.

    Cheerio,
    Thomas
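
One way to check the locale hypothesis is to look at the locale-related environment variables the script actually sees. The following is a minimal sketch, not code from the thread; the helper name utf8_locale_vars is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given an environment hash, return the names of the
# locale variables whose value mentions UTF-8 (e.g. "en_US.UTF-8").
sub utf8_locale_vars {
    my %env = @_;
    return grep { defined $env{$_} && $env{$_} =~ /utf-?8/i }
           qw(LC_ALL LC_CTYPE LANG);
}

my @hits = utf8_locale_vars(%ENV);
print @hits
    ? "UTF-8 locale set via: @hits\n"
    : "No UTF-8 locale found in LC_ALL, LC_CTYPE or LANG\n";
```

If the script reports a UTF-8 locale on the machine where split fails and none on a machine where it works, that would support the locale explanation.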

Re: Split loop error with perl 5.8
by brianarn (Chaplain) on Oct 18, 2002 at 00:02 UTC
    I'm going to stick up for Red Hat here - I just installed 8.0, and during my config, I couldn't help but read PM. Probably dangerous as root, but hey. Here's a copy and paste, using grinder's example.
    [root@localhost root]# cat split.pl
    #! /usr/bin/perl -w
    use strict;

    my $x1 = 'aaa bbb cc';
    my $x2 = 'aa, bb, cc';
    print join( '-', split(/[\s,]+/, $x1)), "\n";
    print join( '-', split(/[\s,]+/, $x2)), "\n";
    [root@localhost root]# ./split.pl
    aaa-bbb-cc
    aa-bb-cc
    [root@localhost root]# perl -v
    This is perl, v5.8.0 built for i386-linux-thread-multi
    Seems to have worked just fine for me, so now I think it's a context thing, something about how you're using it. :) I'm using the Perl available via RPM, I haven't downloaded a snippet of source yet except for grinder's code. hehe.

    ~Brian
Re: Split loop error with perl 5.8
by Dog and Pony (Priest) on Oct 17, 2002 at 19:07 UTC
    Since I built it anyway today, and to confirm the results of others, I tried grinder's test too, on this:
    This is perl, v5.8.0 built for MSWin32-x86-multi-thread
    - it works just as expected. Sure looks like a RH issue.
    You have moved into a dark place.
    It is pitch black. You are likely to be eaten by a grue.
Re: Split loop error with perl 5.8
by mojotoad (Monsignor) on Oct 11, 2004 at 21:14 UTC
    argathin has it right -- it's Red Hat's use of utf-8 by default. As you've noted, 'use bytes' fixes this within your script.

    Another way around it is to unset the LANG environment variable (but for your script, you'd have to reinvoke with the new ENV setting). This is in case you're relying on external programs, such as the sort command, which sometimes break under the utf-8 setting.

    Cheers,
    Matt

    P.S. By 'reinvoke', I mean something along the lines of this (ugly):

    #!/usr/bin/perl
    delete $ENV{LANG};
    exec($0, @ARGV);
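
Taken literally, the snippet above would re-exec itself on every start, because the new process runs the same two lines again. A guarded variant might look like the sketch below. It is an assumption about the intent, not the poster's code, and it only handles LANG; if LC_ALL or LC_CTYPE is set, it would probably need the same treatment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Re-exec only while LANG is still set. The second run finds it unset and
# falls through to the rest of the script, so there is no loop.
if (exists $ENV{LANG}) {
    delete $ENV{LANG};
    # $^X is the path of the running perl binary, $0 the script itself.
    exec($^X, $0, @ARGV) or die "Cannot re-exec $0: $!";
}

print "Running without LANG\n";
```

Invoking the script through $^X rather than relying on $0 being executable is a design choice made here for the sketch.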
Re: Split loop error with perl 5.8
by damianhammontree (Initiate) on Oct 11, 2004 at 20:46 UTC
    I think argathin is right, here. This has been annoying me for the better part of an hour, and I just added the 'use bytes;' pragma (to get rid of the UTF-8 default), and the "split loop" error magically vanished. YMMV. :^) -DH
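
Since the bytes pragma is lexically scoped, it can be limited to the block that does the split instead of the whole script. A minimal sketch, assuming the split from the original question; the function name split_words is hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the split from the original question.
sub split_words {
    my ($line) = @_;
    use bytes;    # in effect only until the end of this sub's block
    return split /[\s,]+/, $line;
}

print join('-', split_words('aaa bbb cc')), "\n";
print join('-', split_words('aa, bb, cc')), "\n";
```

Whether this avoids the error on an affected Red Hat 8.0 install is not something the sketch can show; it only illustrates the scoping.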
Re: Split loop error with perl 5.8
by ysth (Canon) on Oct 11, 2004 at 22:53 UTC