http://www.perlmonks.org?node_id=115547

r.joseph has asked for the wisdom of the Perl Monks concerning the following question:

Suppose the string:
$a = '059057034037063107105003046039039036107063035046';
and I want to split every 3 digits to get an array that contains 059, 057, 034, .... Shouldn't this:
my @nums = split(/(\d{3})/,$a);
Work just fine? For some reason, it definitely doesn't! I can't figure out why... Thanks for the help!

r. j o s e p h
"Violence is a last resort of the incompetent" - Salvor Hardin, Foundation by Isaac Asimov

Replies are listed 'Best First'.
Re: Splitting every 3 digits?
by gbarr (Monk) on Sep 29, 2001 at 04:32 UTC
    Split will return the text between each match, and if the pattern contains parens, as yours does, it will also return the contents of the parens. So your code will return a list where every other element is a group of three digits and the elements in between are empty strings. The exception is at the end: if the length of the input string is not a multiple of three characters, the leftover one or two digits show up as a final non-empty element.
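
    To make that concrete, here is a minimal sketch of what split hands back when the pattern has capturing parens. The shortened sample string and the names $str and @parts are assumptions for illustration, not part of the original question.

```perl
use strict;
use warnings;

# Shortened sample input (assumed for illustration).
my $str = '059057034';

# Because of the capturing parens, split returns each three-digit
# "separator" as well as the (empty) text between separators.
my @parts = split /(\d{3})/, $str;

# Bracket each element so the empty strings are visible.
print join(',', map { "[$_]" } @parts), "\n";
```

    This should print [],[059],[],[057],[],[034]: the digit groups are there, but interleaved with empty strings.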

    What you want is

    my @nums = $a =~ /\d{3}/g;

    But with this, if the length of the input string is not a multiple of three characters then the last one or two digits would be lost. If you want the last element of @nums to contain one, two or three digits, so that all of the digits in the input appear in @nums, then use

    my @nums = $a =~ /\d{1,3}/g;
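
    A small sketch of the difference between the two patterns, using an assumed seven-digit sample string (the names $str, @exact and @partial are made up for the example):

```perl
use strict;
use warnings;

my $str = '1234567';   # seven digits: not a multiple of three (assumed sample)

my @exact   = $str =~ /\d{3}/g;     # only complete groups of three
my @partial = $str =~ /\d{1,3}/g;   # greedy, so a short group can only appear last

print "@exact\n";     # 123 456
print "@partial\n";   # 123 456 7
```

    Because {1,3} is greedy, every group except possibly the last still gets three digits.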

Re: Splitting every 3 digits?
by wog (Curate) on Sep 29, 2001 at 04:31 UTC
    The pattern that you give to split is supposed to match what goes between each item you want to extract. Because there's no good way of matching what goes between each thing you want to extract from that string, split is probably not a good choice here. One straightforward way of doing this would be to use substr: (update: erroneous >= removed.)

    my @nums; push @nums, substr($a, 0, 3, "") while length($a);

    Note that this does not check that you are only extracting digits, but I assume that you don't expect non-digits in your case (update: if you do, you can use the regex below with \d, or check beforehand that the string contains no non-digits). You can also use a regular expression with the /g modifier for this task. In list context, /g makes the match return the captured text for every match found in the string (or everything the regex matches, if there are no explicit capturing parentheses):

    my @nums = $a =~ /(...)/g;

    (Note that split returns the contents of any grouping parentheses in the regex provided to it, in addition to the things between what the regex matches, which explains what your code puts in @nums.)

    (update: fixed phrasing to talk about not expecting non-digits, rather than the... horror I had before.)
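
    One thing to keep in mind with the substr approach above: the four-argument form removes each chunk from $a, so $a is empty when the loop finishes. If the original string is still needed, a sketch along these lines walks it by offset instead (the names $str and $pos and the sample value are assumptions for illustration):

```perl
use strict;
use warnings;

my $str = '0590570340';   # assumed sample, ten characters
my @nums;

# Walk the string by offset instead of consuming it with 4-arg substr.
for (my $pos = 0; $pos < length($str); $pos += 3) {
    push @nums, substr($str, $pos, 3);   # the final chunk may be shorter than three
}

print "@nums\n";   # 059 057 034 0
print "$str\n";    # the original string is unchanged
```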

Re: Splitting every 3 digits?
by hopes (Friar) on Sep 29, 2001 at 04:27 UTC
    You can use the 'g' modifier on the regexp, and then it can return an array:
    my $a = '059057034037063107105003046039039036107063035046';
    my @nums = split /(\d{3})/g,$a;

    Hope this helps
    Hopes
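    Update: Yes, I didn't notice it.
    My first script returns empty values too, because of the split.

    If you want to use split, you can try filtering out the empty strings (testing length rather than truth, so that a leftover "0" is not dropped):
    my @nums = grep { length } split /(\d{3})/, $a;
    but it is better without split:
    my @nums = $a =~ /\d{3}/g;
    Regards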
      How about this?
      my $var = "1112223334445556667";
      $var =~ s/(\d{3})/$1\_/g;
      my @nums = split(/_/, $var);
Re: Splitting every 3 digits?
by stefp (Vicar) on Sep 29, 2001 at 05:34 UTC
    A related problem that may well be the underlying motivation of the question: writing an integer in blocks of three digits separated by underscores. The custom is to start from the end, because the goal is to make thousands, millions and so on visible. The easy way:
    $_ = 5544443333111;
    $_ = reverse;
    s/(\d{3})/$1_/g;
    $_ = reverse;
    s/^_//;
    print;
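
    For comparison, a sketch of another common way to do the same grouping without reversing the string: keep peeling the last three digits off the leading run of digits until nothing more matches. The variable name $n is an assumption, and the number is kept as a string here.

```perl
use strict;
use warnings;

my $n = '5544443333111';   # same digits as the example above, kept as a string

# Repeatedly split the leading run of digits before its last three digits.
# ${1} is written with braces so the following underscore is clearly literal.
1 while $n =~ s/^(\d+)(\d{3})/${1}_$2/;

print "$n\n";   # 5_544_443_333_111
```

    Each pass inserts one underscore, working from the right, so the loop runs once per separator.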

    -- stefp

Re: Splitting every 3 digits?
by t0j0 (Novice) on Sep 30, 2001 at 19:31 UTC
    How about this:
    my @nums = ();
    my $a = '1234567890';
    while ($a =~ /(\d{1,3})/g) {
        push(@nums, $1);
    }
Re: Splitting every 3 digits?
by sifukurt (Hermit) on Oct 01, 2001 at 17:21 UTC
    The only thing you'll have to be wary of with some of the previous examples is if your string, $a, contains a number of digits that isn't evenly divisible by 3, your array won't contain the last 1 or 2 digits. If that is what you want, then you needn't worry about this. If, however, you want the last element in your array to contain the last digits even if the length of your string isn't evenly divisible by 3, you'll want to do something like this:
    $a = '12345678901';
    while ( $a =~ /(\d{3})/g ) {
        push ( @nums, $1 );
        $last = $';    # text remaining after the most recent match
    }
    if ( length($a) % 3 ) { push ( @nums, $last ); }
    In the above example, @nums will contain:
    123
    456
    789
    01

    There may be a more elegant way of doing this, but it works. Hope this helps.
    ___________________
    Kurt
Re: Splitting every 3 digits?
by broquaint (Abbot) on Oct 01, 2001 at 17:54 UTC
    Why split, when you can match
    my @nums = ("123456789" =~ /(\d{3})/g);
    print "$_\n" for @nums;
    output:
    123
    456
    789
    I've found this to be yet another handy feature of regexp matching, and great for the likes of map and foreach.
    HTH

    broquaint

Re: Splitting every 3 digits?
by Sigmund (Pilgrim) on Oct 02, 2001 at 13:22 UTC
    Hi,
    Why don't you just unpack???

    #!/usr/bin/perl
    $a = '059057034037063107105003046039039036107063035046';
    $lag = "a3" x ((length $a)/3);
    @nums = unpack $lag, $a;
    foreach (@nums) { print; print "\n"; }
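
    The template built above only covers complete groups, so any leftover one or two characters are dropped. A possible variant, assuming a Perl recent enough to support groups in unpack templates (5.8 or later, as far as I know), lets the template repeat itself; the sample string and the name $str are made up for the example:

```perl
use strict;
use warnings;

my $str = '05905703403';   # assumed sample, eleven characters

# "(a3)*" repeats the a3 field until the input is exhausted,
# so a final short chunk is kept rather than dropped.
my @nums = unpack '(a3)*', $str;

print "$_\n" for @nums;
```

    With this input the last element is the two-character remainder 03.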


    SiG

    perl -le 's ssSss.s sSsSiss.s s$sSss.s s.$s\107ss.print'