
Split a string based on change of character

by eyepopslikeamosquito (Chancellor)
on Jul 28, 2007 at 06:51 UTC (#629259=perlquestion)
eyepopslikeamosquito has asked for the wisdom of the Perl Monks concerning the following question:

For a string 'ABBBCC', I want to produce a list ('A', 'BBB', 'CC'). That is, break the string into pieces based on change of character.

My Perl is getting a bit rusty and I found myself struggling today with this simple problem. Though I've found a solution, shown below, there is probably a better way to do it, hence my question. Apologies if this is a FAQ.

use strict;
use warnings;

my $str = 'AAABBCCCC';
# For str 'AAABBCCCC', I want to produce a list ('AAA', 'BB', 'CCCC').
# This works ... but is there a better way to do it?
my $i = 0; # $i is used to filter out the captured $1 fields
my @x = grep { ++$i % 2 } split(/(?<=(.))(?!\1)/, $str);
for my $e (@x) { print "e='$e'\n" }

Re: Split a string based on change of character (also)
by tye (Cardinal) on Jul 28, 2007 at 07:41 UTC
    my $str= "AAABBCCCC";
    my @x;
    push @x, $1 while $str =~ /((.)\2*)/g;

    - tye        

Re: Split a string based on change of character
by moritz (Cardinal) on Jul 28, 2007 at 08:04 UTC
    I don't know if you can use split here, because the pattern must not consume any characters to achieve what you want.

    I've tried this one: m/((?<=.))(?!\1)/, which should mean "a position that is preceded by a character and followed by a different character", but it doesn't work.

    Can anybody tell me why this doesn't match the string 'aaabbbc'?

    Update: Zoffix told me on IRC that the right pattern would be m/(?<=(.))(?!\1)/ (because the assertion is zero-width, capturing the assertion itself yields an empty string, so (?!\1) can never succeed). However, that doesn't work well in split either, because split also returns the captured character:

    $ perl -MData::Dumper -wle '$Data::Dumper::Indent=0; $_="aaabbc"; print Dumper([split m/(?<=(.))(?!\1)/]);'
    $VAR1 = ['aaa','a','bb','b','c','c'];

    This way you would have to discard every second item of the returned list, which is not pretty either ;-)
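To illustrate the "discard every second item" step, here is a minimal sketch that keeps only the even-indexed fields with an array slice instead of a counter. The variable names @parts and @runs are just illustrative, and it assumes the split behaviour shown in the Data::Dumper output above.

```perl
use strict;
use warnings;

my $str = 'aaabbc';

# split returns each run followed by the character captured by (.)
my @parts = split /(?<=(.))(?!\1)/, $str;

# Keep the fields at even indices (the runs), drop the captured characters
my @runs = @parts[ grep { $_ % 2 == 0 } 0 .. $#parts ];

print "$_\n" for @runs;
```

The slice avoids the extra state variable, at the cost of holding the full split result in @parts first.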

      You can use split:
      my $string= 'AAABBBCCCDD';
      my $i=0;
      my @words= grep $i=!$i,split /(?<=(.))(?!\1)/,$string;
      print join "\n",@words,'';

      s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
      +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
        I never thought of grep $i=!$i, before.
        I usually use grep $i^=$i,

        Yours is easier to understand, though.

        Update: Yeah, I mean grep $i^=1,.
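For anyone comparing the toggles mentioned here, a small sketch of how each one behaves inside grep. The sample list is the split output from earlier in the thread, and the array names are made up for illustration.

```perl
use strict;
use warnings;

my @items = qw(aaa a bb b c c);

# Logical negation: $i alternates true, false, true, ...
my $i = 0;
my @not_toggle = grep { $i = !$i } @items;

# XOR with 1: $i alternates 1, 0, 1, ...
$i = 0;
my @xor_one = grep { $i ^= 1 } @items;

# XOR with itself: $i is always 0, so nothing is kept
$i = 0;
my @xor_self = grep { $i ^= $i } @items;

print "not:      @not_toggle\n";
print "xor 1:    @xor_one\n";
print "xor self: @xor_self\n";
```

Both $i=!$i and $i^=1 keep the first, third, fifth, ... element, which is why the correction in the update matters: $i^=$i keeps nothing.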

      <Zoffix> eval: $_='zyxxaabbbcccccc'; push @a, $1 while s/((.)\2*)//;[@a]
      <_ZofBot> Zoffix: ['z','y','xx','aa','bbb','cccccc']

      20070730 Janitored by Corion: Added code tags, as per Writeup Formatting Tips

        Or even:
        $_='zyxxaabbbcccccc'; push @a, $1 while /((.)\2*)/g;
        which doesn't destroy the original data.
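The non-destructive /g loop can be wrapped up for reuse. This is only a sketch: the name split_runs is hypothetical, and the /s modifier is an addition not in the thread, used so that runs of newlines are also matched by the dot.

```perl
use strict;
use warnings;

# Hypothetical helper: return the runs of identical characters in a string
sub split_runs {
    my ($str) = @_;
    my @runs;
    # $1 is the whole run, $2 its first character; /g walks the string
    # without modifying it, /s lets . match a newline too
    push @runs, $1 while $str =~ /((.)\2*)/sg;
    return @runs;
}

my $original = 'zyxxaabbbcccccc';
my @runs = split_runs($original);

print join(',', @runs), "\n";
print "$original\n";   # the input string is left intact
```

Unlike the s/// variant, nothing is removed from the input, so the original string is still available afterwards.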
