PerlMonks  

Regexp: can I do it in one go?

by moxliukas (Curate)
on Aug 22, 2002 at 11:13 UTC ( #191980=perlquestion )
moxliukas has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I have been writing a regexp that would transform this:

$s = 'aaaabababbbbaaaccccbbbbbbaadddd';

into

$s = '4ababa4b3a4c6b2a4d';

Basically it is something similar to a mathematical series test (ummm... not sure if this is the correct translation from Lithuanian) where consecutive occurrences of the same character are counted (except that no number is inserted if there is only one character).

I have been trying to come up with a regexp that would do this transformation and I got to the point where everything works:

$s = 'aaaabababbbbaaaccccbbbbbbaadddd';
$s =~ s"($_{2,})"length($1).$_"ge for ('a'..'d');
print $s;

However I am not very happy with the for loop. I wonder if the same can be achieved in one regexp, without the need to scan the line for each character. Can character classes be somehow involved in the regexp to avoid looping?

Thanks for any help in advance.

Re: Regexp: can I do it in one go?
by Arien (Pilgrim) on Aug 22, 2002 at 11:31 UTC

    What you want to do is globally match something, including possible repetitions of it, and replace what you've found with that something followed by the length of your match:

    $s =~ s/((.)\2*)/$2 . length $1/eg;

    — Arien

    Edit: It seems I misread the output you want. To have only sequences of two or more repeated letters replaced, change the star to a plus sign. (And after some sleep...) Also, you'd want to swap length $1 and $2 to have the length precede the letter.
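For reference, here is a minimal sketch of the substitution with both changes from the edit applied (plus instead of star, length before the letter). The sub name compress_runs is only a hypothetical wrapper for illustration and is not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse runs of two or more identical
# characters into "<count><char>", leaving single characters alone.
sub compress_runs {
    my ($s) = @_;
    # (.) captures one character as $2; \2+ demands at least one repeat,
    # so $1 always holds a run of two or more identical characters.
    $s =~ s/((.)\2+)/length($1) . $2/eg;
    return $s;
}

print compress_runs('aaaabababbbbaaaccccbbbbbbaadddd'), "\n";
# prints 4ababa4b3a4c6b2a4d
```

Note that the dot does not match a newline unless the /s modifier is added, so runs of newlines would be left untouched by this version.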

Re: Regexp: can I do it in one go?
by jmcnamara (Monsignor) on Aug 22, 2002 at 11:34 UTC

    You can use a backreference to obtain a single regex:
    #!/usr/bin/perl -wl
    use strict;
    my $s = 'aaaabababbbbaaaccccbbbbbbaadddd';
    print $s;
    $s =~ s/((.)\2+)/length($1) . $2/eg;
    print $s;
    __END__
    Prints:
    aaaabababbbbaaaccccbbbbbbaadddd
    4ababa4b3a4c6b2a4d

    --
    John.
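On the original question of whether character classes can be involved: one possible variation, sketched here as an assumption rather than something posted in the thread, is to swap the dot for a class so that only the letters a to d (the range used in the original for loop) are counted. The sub name compress_letters is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical variant: only runs of the letters a..d are compressed;
# any other repeated character (spaces, digits, other letters) is kept as-is.
sub compress_letters {
    my ($s) = @_;
    # ([a-d]) restricts the captured character; \2+ still requires
    # at least one repeat of that same character.
    $s =~ s/(([a-d])\2+)/length($1) . $2/eg;
    return $s;
}

print compress_letters('aaaabababbbbaaaccccbbbbbbaadddd'), "\n";
# prints 4ababa4b3a4c6b2a4d
print compress_letters('aa  xx11'), "\n";
# prints 2a  xx11
```

Restricting the class may matter if the input can already contain digits, since a run such as 11 would otherwise be rewritten as 21 and the result would become ambiguous.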

      Thanks a lot. I can't believe that I didn't think about it this way ;)

      Thank you again

Node Type: perlquestion [id://191980]
Approved by simon.proctor