http://www.perlmonks.org?node_id=1015172

newbie1991 has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to separate repeated substrings from a really long string. I cannot use fixed offsets because the pieces are all of variable length. I want to split the string into substrings using particular delimiters. For example, in the sample below, I want the part from > to the first \n after the > in one string, and the part from that \n to the next > in another string. Using split seems far too cumbersome and would need too many statements. The substr function requires a length or an offset, but those are variable. Is there a shorter way?

"> abcd1234
abcd abcd 
>xyz123 
xyz"

Re: Separating substrings from a main string
by blue_cowdawg (Monsignor) on Jan 24, 2013 at 14:19 UTC

    my $string=qq(> abcd1234
    abcd abcd 
    >xyz123 
    xyz);
    $string =~s/[\s\n]+//g;        # squash out the whitespace and newlines
    my @f=split(/[\>]/,$string);   # CAVEAT: just wrote this off the top of my head...
    Give that a whirl.


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg

      This works, but only for the first instance. As in, @f is >abcd1234, but the correct output should be >abcd1234 >xyz123. Would it be more convenient if $string were an array?

        I gave you a template. If you have other requirements please engage your creativity and modify the template to suit you. Here's one way:

        $ cat splitter.pl
        use strict;
        use Data::Dumper;
        my $foo=qq(> abcd1234
        abcd abcd 
        >xyz123 
        xyz);
        $foo =~ s/\n//g;
        my @f = split(/[\>]/,$foo);
        $_ = '>' . $_ foreach @f;
        print Dumper(\@f);
        $ perl splitter.pl
        $VAR1 = [
                  '>',
                  '> abcd1234abcd abcd ',
                  '>xyz123 xyz'
                ];


        Peter L. Berghold -- Unix Professional
        Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
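
The stray '>' element in the output above comes from the empty field that split produces before the first delimiter. A minimal sketch of one way around it, assuming the sample has a newline after each line as the question describes: split on a zero-width lookahead so each '>' stays attached to its record. The variable names and the explicit "\n" escapes are illustrative assumptions, not part of the replies above.

```perl
use strict;
use warnings;

# Assumed input: the sample from the question, newlines written out explicitly.
my $string = "> abcd1234\nabcd abcd\n>xyz123\nxyz\n";

# Split *before* every '>' (zero-width lookahead). The delimiter is not
# consumed, and a zero-width match at the start yields no empty leading field.
my @records = split /(?=>)/, $string;

# The header of each record is everything up to its first newline.
my @headers = map { (split /\n/, $_, 2)[0] } @records;

print "$_\n" for @headers;
```

With this input, @records holds two elements, one per '>' block, so there is no need to glue the '>' back on afterwards.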
Re: Separating substrings from a main string
by Athanasius (Archbishop) on Jan 24, 2013 at 14:25 UTC

    Another approach (using a regex, and assuming you want to keep the whitespace):

    #! perl
    use strict;
    use warnings;

    my $string = <<END;
    > abcd1234
    abcd abcd 
    >xyz123 
    xyz
    END

    if ($string =~ />(.*?)\n(.*?)>/s)
    {
        print "First substring is '$1'\n";
        print "Second substring is '$2'\n";
    }

    Output:

    0:23 >perl 497_SoPW.pl
    First substring is ' abcd1234'
    Second substring is 'abcd abcd 
    '
    0:23 >

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,
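
The regex in this reply stops after the first '>' block. A hedged sketch of how the same idea might be extended to every block, using m//g in a while loop: it assumes that no body text contains a '>' character, and the names @pairs, $header and $body are made up for illustration.

```perl
use strict;
use warnings;

# Assumed input: the sample from the question, newlines written out explicitly.
my $string = "> abcd1234\nabcd abcd\n>xyz123\nxyz\n";

my @pairs;
# In scalar context, /g resumes where the previous match ended.
# [^\n]* captures the rest of the '>' line (the header);
# [^>]*  captures the body, up to the next '>' or the end of the string.
while ($string =~ />([^\n]*)\n([^>]*)/g) {
    my ($header, $body) = ($1, $2);
    chomp $body;                      # drop the newline before the next '>'
    push @pairs, [$header, $body];
    print "header='$header' body='$body'\n";
}
```

Each element of @pairs is a two-element array reference, so the headers and bodies stay matched up even though their lengths vary.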

Re: Separating substrings from a main string
by Kenosis (Priest) on Jan 24, 2013 at 18:37 UTC

    Were you looking for something like the following?

    use strict;
    use warnings;
    use Data::Dumper;

    my $string = "> abcd1234
    abcd abcd 
    >xyz123 
    xyz";

    my @substrings = $string =~ /(>.+)/g;
    print Dumper \@substrings;

    Output:

    $VAR1 = [
              '> abcd1234',
              '>xyz123 '
            ];

    In your original posting, you say you "...want the parts from > to the first \n after the > in one string, and \n to the next > in another string." Yet that would yield:

    $VAR2 = [
              '> abcd1234',
              'abcd abcd'
            ];

    However, in a reply you say, "...the correct output should be >abcd1234 >xyz123...", and, assuming these are two different strings, this is the output from the above script.
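
Since the thread leaves open which of the two readings is wanted, here is a rough sketch that collects both at once by walking the string line by line: lines starting with '>' go into one array, and the lines that follow them go into a parallel array. It assumes every record starts with '>' at the beginning of a line; @headers and @bodies are hypothetical names.

```perl
use strict;
use warnings;

# Assumed input: the sample from the question, newlines written out explicitly.
my $string = "> abcd1234\nabcd abcd\n>xyz123\nxyz\n";

my (@headers, @bodies);
for my $line (split /\n/, $string) {
    if ($line =~ /^>/) {
        push @headers, $line;    # a new record starts on this line
        push @bodies,  '';       # its body is empty until more lines arrive
    }
    elsif (@bodies) {
        # Append to the current body; multi-line bodies are joined with "\n".
        $bodies[-1] .= ($bodies[-1] eq '' ? '' : "\n") . $line;
    }
}

for my $i (0 .. $#headers) {
    print "$headers[$i] => $bodies[$i]\n";
}
```

Printing @headers alone gives the ">abcd1234 >xyz123" style of result from the follow-up, while @bodies holds the text between each header line and the next '>'.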