Split on every second character

by gri6507 (Deacon)
on Feb 12, 2010 at 22:56 UTC
gri6507 has asked for the wisdom of the Perl Monks concerning the following question:

I have what I thought would be a very simple problem, but it has me baffled. I have data of the form
"010203040506"
which I would like to split into the form
"01 02 03 04 05 06"
The code that I wrote

use strict;
use warnings;
use Data::Dumper;

my @list = split(/(..)/, '010203040506');
print Dumper(\@list);
print join(' ', @list);

does the job, but produces empty elements in the list, which results in the undesired output of
" 01  02  03  04  05  06"
which contains unwanted spaces. I understand why this happens: when splitting on any two characters, the first two characters match, so they become the separator (element 1), which separates an empty string (element 0) from the rest of the string (the remaining elements). However, I don't want the extra spaces in the resulting output. How can I get rid of them?
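To make the empty elements visible, here is a minimal sketch that prints every element returned by split in brackets. It reuses the string from the question; nothing else is assumed.

```perl
use strict;
use warnings;

# With a capturing group in the pattern, split also returns the captured
# separators, placed between the (here empty) fields they separate.
my @list = split /(..)/, '010203040506';

# Bracket each element so the empty strings show up.
print join(',', map { "[$_]" } @list), "\n";
# [],[01],[],[02],[],[03],[],[04],[],[05],[],[06]
```

Every pair is a separator and every field between two separators is empty, which is where the doubled spaces in the joined output come from.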

Re: Split on every second character
by BrowserUk (Pope) on Feb 12, 2010

    For anything that doesn't require conditional matching, I prefer unpack:

    print for unpack '(A2)*', '010203040506';;
    01
    02
    03
    04
    05
    06
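As a standalone script, the unpack approach might look like the sketch below. The helper name pairs_unpack is hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: break a string into two-character chunks.
sub pairs_unpack {
    my ($str) = @_;
    # (A2)* repeats the "two characters" template until the input runs out.
    my @chunks = unpack '(A2)*', $str;
    return @chunks;
}

print join(' ', pairs_unpack('010203040506')), "\n";   # 01 02 03 04 05 06
print join(' ', pairs_unpack('01020')), "\n";          # 01 02 0
```

Note that the A format strips trailing spaces and nulls from each chunk; if the data can contain those, a2 keeps them.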

Re: Split on every second character
by linuxer (Curate) on Feb 12, 2010

    If you are sure that the string is even sized, you can use a simple regex like this:

    use strict;
    use warnings;

    my $string = "0102030405";
    # will miss the last character when string is odd sized
    my @elements = $string =~ m/(..)/g;
    print "@elements\n";
    Update: modified code

      And if you are unsure but still want the last character of an odd-length string, you can use:

      my @list = $str =~ /(..?)/g;

      I didn't think of using the match operator for this. Thanks!
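The difference between the two patterns only shows up on odd-length input. A small sketch, using a made-up five-character string:

```perl
use strict;
use warnings;

my $odd = '01020';   # five characters, so one is left over

my @strict  = $odd =~ /(..)/g;    # complete pairs only
my @lenient = $odd =~ /(..?)/g;   # complete pairs, then a final single character

print "@strict\n";    # 01 02
print "@lenient\n";   # 01 02 0
```

Without the /s modifier, . does not match a newline, so data containing newlines would need /s on both patterns.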
Re: Split on every second character
by ikegami (Pope) on Feb 12, 2010

    When you use split, the pattern must match what separates what you want. In this case, the separator is the empty string between a character at an odd position and a character at an even position. That's not exactly straightforward to match, but it's possible.

    $ perl -E'say for split /(?!^|\z)(?(?{ pos()%2 })(?!))/, "0102030405"'
    01
    02
    03
    04
    05

    Here, it's simpler just to match what you want to grab rather than what separates the pieces, so just use m//g:

    $ perl -E'say for "0102030405" =~ /(..?)/sg'
    01
    02
    03
    04
    05
      No wonder getting the split() operator to work correctly was difficult. Thank you!
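For anyone trying to read the split pattern above, here is the same regex spread out with /x. The comments are my own reading of it, not part of the original reply.

```perl
use strict;
use warnings;

my @pairs = split /
    (?! ^ | \z )            # do not split at the start or the end of the string
    (?(?{ pos() % 2 })      # condition: is the current position odd?
        (?!)                #   yes: fail, so no split happens here
    )                       #   no: match the empty string, so split here
/x, '0102030405';

print "@pairs\n";   # 01 02 03 04 05
```

The pattern never consumes any characters; it only decides at which positions an empty separator is allowed, which is why nothing has to be captured or filtered afterwards.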
Re: Split on every second character
by rubasov (Friar) on Feb 13, 2010
    Or still using split, you can just drop every second element from the result list:
    $ perl -le '@l = grep {$i++ % 2} split /(..)/, "010203040506"; print join "|", @l'
    But the solutions provided by other monks are definitely nicer.

      I agree that some variation of an approach using m/..?/g or unpack() is better, but a more concise split() solution would be:

      >perl -wMstrict -le "print for grep length, split /(..)/, '0001020304101112';"
      00
      01
      02
      03
      04
      10
      11
      12
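The two split-based filters in this sub-thread behave differently on odd-length input. A short sketch, with a made-up five-character string, shows the difference:

```perl
use strict;
use warnings;

# For odd-length input the leftover character arrives as a final field.
my @fields = split /(..)/, '01020';     # ('', '01', '', '02', '0')

# Filter 1: keep every non-empty element.
my @by_length = grep { length } @fields;

# Filter 2: keep the elements at odd indexes, i.e. the captured pairs.
my $i = 0;
my @by_index = grep { $i++ % 2 } @fields;

print "@by_length\n";   # 01 02 0
print "@by_index\n";    # 01 02
```

So grep length keeps a trailing single character, while the index-based filter silently drops it. Which one is right depends on whether odd-length input should be preserved.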

