
Re: How can I count characters between two same substrings in a string where the substring is repeated more than 5 times?

by tobyink (Canon)
on Dec 09, 2011 at 08:15 UTC ( [id://942593]=note )

in reply to How can I count characters between two same substrings in a string where the substring is repeated more than 5 times?

Easiest way would be to split the string, and use length on each portion:

use 5.010;
my $string = "abcFOOdefghFOOiFOOjklmFOOnopqrFOOstuvFOOwxyz";
say "Lengths are:";
say $_ for map { sprintf("%d ('%s')", length, $_) } split /FOO/, $string;

Replies are listed 'Best First'.
Re^2: How can I count characters between two same substrings in a string where the substring is repeated more than 5 times?
by ww (Archbishop) on Dec 09, 2011 at 11:19 UTC
    Elegant, but it deviates from the OP's spec, in that the character counts before the first "FOO" (or "cat") and after the last one are reported in the output.

    At risk of being bitten again by my blindspots, I see no way to force split alone (or even with a LIMIT) to do the job by itself. OTOH, adding a simple regex to remove the chars before the first delimiter and after the last would work nicely.
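    As a rough illustration of the "trim first, then split" idea, here is a minimal sketch. The helper name lengths_between is hypothetical, and it assumes the delimiter is a literal string (hence quotemeta) that occurs at least twice:

```perl
use strict;
use warnings;

# Hypothetical helper: lengths of the pieces strictly between delimiters.
sub lengths_between {
    my ($str, $delim) = @_;
    my $d = quotemeta $delim;    # assumption: literal delimiter, not a pattern

    # Greedy .* captures everything from just after the first delimiter
    # to just before the last one. Fewer than two delimiters: nothing.
    return () unless $str =~ /$d(.*)$d/s;
    my $inner = $1;

    # LIMIT of -1 keeps empty trailing fields (adjacent delimiters).
    return map { length } split /$d/, $inner, -1;
}

my @lengths = lengths_between(
    "abcFOOdefghFOOiFOOjklmFOOnopqrFOOstuvFOOwxyz", "FOO");
print "@lengths\n";    # prints "5 1 4 5 4"
```

    The regex does the trimming, so split only ever sees the text between the outermost delimiters.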

      I'd use split. One of the points of the OP's spec is "repeated more than 5 times". That is a lot easier to check with split than to cram it all into a regexp.


      my @chunks = split /PAT/, $str;
      shift @chunks;                      # text before the first PAT ('' if $str starts with PAT)
      pop @chunks unless $str =~ /PAT$/;  # split already drops an empty trailing field
      if (@chunks > 4) {
          say "Lengths: @{[map {length} @chunks]}";
      }

        Since split discards an empty trailing string (but not a leading one), one option would be to add a character to the end that can't be part of the delimiter. Then you can split and take everything except the first and last elements:

        #!/usr/bin/perl
        use Modern::Perl;
        my $a = 'catonecattwocatthreecatfourcat';
        $a .= ' ';
        my @w = split /cat/, $a;
        say "$_ -> " . length($_) for @w[1..@w-2];

        one -> 3
        two -> 3
        three -> 5
        four -> 4

        Aaron B.
        My Woefully Neglected Blog, where I occasionally mention Perl.

      Yes, I deviated deliberately for simplicity. If you want to ignore the leading and trailing sections, just shift the first element off the list you get back from split, then pop the last element off as well.
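      For what it's worth, the shift-and-pop variant might look roughly like this. It is only a sketch, and it assumes the string neither starts nor ends with the delimiter. If the string ended with FOO, split would already have dropped the empty trailing field, and the pop would then remove a real piece.

```perl
use 5.010;
use strict;
use warnings;

my $string = "abcFOOdefghFOOiFOOjklmFOOnopqrFOOstuvFOOwxyz";

my @parts = split /FOO/, $string;
shift @parts;    # drop the text before the first FOO ("abc")
pop @parts;      # drop the text after the last FOO ("wxyz")

# Same report format as the original snippet, minus the two end pieces.
my @report = map { sprintf("%d ('%s')", length($_), $_) } @parts;
say "Lengths are:";
say for @report;
```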
        That's all well and good, tobyink, but coding your proposed solution increases the complexity far beyond anything readily identifiable as "simple."

        To avoid complexity, one must know in advance that the delimiter follows something else at the start of the string (or is followed by something at the end)... in which case, counting with a pencil, paper and strike-thrus may be just as effective...

        If the topology of the string is unknown, then you must deal with the permutations...

        • Delimiter first in the string, and last
        • Delimiter NOT first in string, but is last
        • Delimiter first in string, but NOT last
        • Delimiter neither first nor last
        • Delimiter occurs an odd number of times in string where it is either first or last
        • ... (I'm sure I've missed some)
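        One possible way to treat the first four cases above uniformly is to skip split and walk the string with a global match, capturing only text that has a delimiter on both sides. This is a sketch, not a tested solution for every topology. The helper name gaps is hypothetical, and the delimiter is assumed to be a literal string.

```perl
use strict;
use warnings;

# Hypothetical helper: the pieces of text that sit between two delimiters.
sub gaps {
    my ($str, $delim) = @_;
    my $d = quotemeta $delim;    # assumption: literal delimiter
    my @gaps;
    # Consume one delimiter, capture lazily, and require (without
    # consuming) another delimiter right after the captured text.
    while ($str =~ /$d(.*?)(?=$d)/gs) {
        push @gaps, $1;
    }
    return @gaps;
}

for my $s ('catonecattwocat',       # delimiter first and last
           'xxcatonecattwocat',     # not first, but last
           'catonecattwocatyy',     # first, but not last
           'xxcatonecattwocatyy') { # neither first nor last
    my @len = map { length } gaps($s, 'cat');
    print "$s: @len\n";             # each line ends in "3 3"
}
```

        The lookahead leaves the closing delimiter in place, so it can serve as the opening delimiter of the next gap.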
