PerlMonks  

splitting a string on pre-defined tags

by rajaman (Novice)
on May 22, 2018 at 20:03 UTC (#1215064=perlquestion)
rajaman has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

$string = q{C*ID1*Mac*C release for EA's D*ID1*Spore1 game*D; D*ID1*Spore 1*D is better than D*ID2*Spore 2 game*D.};

I am trying to split the above string and save the result in an array in such a way that each tagged segment is a separate array element. For example, 'C*ID1*Mac*C', 'D*ID1*Spore1 game*D', 'D*ID1*Spore 1*D', and 'D*ID2*Spore 2 game*D' should become four separate array elements.

The usual splitting on space or '*' does not work, e.g.:

@fields = split(/\s/, $string);

Please suggest.

Thank you.

Replies are listed 'Best First'.
Re: splitting a string on pre-defined tags
by tybalt89 (Priest) on May 22, 2018 at 20:18 UTC
    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1215064
    use strict;
    use warnings;
    use Data::Dumper;

    my $string = 'C*ID1*Mac*C release for EA\'s D*ID1*Spore1 game*D; D*ID1*Spore 1*D is better than D*ID2*Spore 2 game*D.';
    my @fields;
    push @fields, $& while $string =~ /\b([A-Z])\*.*?\*\1\b/g;
    print Dumper \@fields;

    Outputs:

    $VAR1 = [
              'C*ID1*Mac*C',
              'D*ID1*Spore1 game*D',
              'D*ID1*Spore 1*D',
              'D*ID2*Spore 2 game*D'
            ];

      You beat me to it!

      use strict;
      use warnings;
      use 5.10.0;

      my $string = "C*ID1*Mac*C release for EA's D*ID1*Spore1 game*D; D*ID1*Spore 1*D is better than D*ID2*Spore 2 game*D.";
      my @fields;
      while ($string =~ /([A-Z])\*   # A single capital letter followed by star
                         ID\d+\*     # String 'ID' followed by a number and a star
                         .*?         # Anything
                         \*\1        # A star followed by the original single capital letter
                        /gx) {
          push @fields, $&;
      }

      Jim

Re: splitting a string on pre-defined tags
by Corion (Pope) on May 22, 2018 at 20:20 UTC

    Usually it's easier to match what you want to keep instead of splitting on the stuff you don't want:

    #!perl -w
    use strict;
    use Data::Dumper;

    my $string = q{C*ID1*Mac*C release for EA's D*ID1*Spore1 game*D; D*ID1*Spore 1*D is better than D*ID2*Spore 2 game*D.};
    my @sections;
    while ( $string =~ m!(([CD])\*(.*?)\*\2)!g ) {
        push @sections, $1;
    };
    print Dumper \@sections;
    __END__
    $VAR1 = [
              'C*ID1*Mac*C',
              'D*ID1*Spore1 game*D',
              'D*ID1*Spore 1*D',
              'D*ID2*Spore 2 game*D'
            ];

    The regular expression looks for a C or D followed by a *, then moves forward one character at a time (the non-greedy .*?) until it finds a * followed by the same letter it matched at the start.
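The same matching idea can also pull each segment apart into its pieces. The following is only a sketch: it assumes every segment has the shape letter*ID<number>*text*letter, as in the sample string, and the hash keys tag, id and text are names made up for this illustration.

```perl
use strict;
use warnings;

my $string = q{C*ID1*Mac*C release for EA's D*ID1*Spore1 game*D; D*ID1*Spore 1*D is better than D*ID2*Spore 2 game*D.};

my @sections;
while ( $string =~ m{
        ([CD])      # group 1: opening tag letter
        \*
        (ID\d+)     # group 2: identifier (assumed to be ID plus digits)
        \*
        (.*?)       # group 3: tagged text, non-greedy
        \*
        \1          # closing letter must equal the opening letter
    }gx ) {
    # Hypothetical record layout, chosen only for this example.
    push @sections, { tag => $1, id => $2, text => $3 };
}

printf "%s %s '%s'\n", @{$_}{qw(tag id text)} for @sections;
```

For the sample string this should print one line per segment, for example C ID1 'Mac' for the first one. The non-greedy .*? is what stops each match at the nearest closing *C or *D instead of running on to the last one in the string.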

Re: splitting a string on pre-defined tags
by tybalt89 (Priest) on May 22, 2018 at 22:10 UTC

    General rule of thumb:

    If you know what you don't want, use split.

    If you know what you want, use a regex match.
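A small sketch of that rule of thumb, side by side. The semicolon-separated list is invented for the illustration; the second string is the one from the question.

```perl
use strict;
use warnings;

# Known separator (what you do not want): split discards it and keeps the rest.
my @parts = split /\s*;\s*/, 'alpha; beta ;gamma';

# Known shape of the wanted pieces: a global match in list context
# returns the captured text of every match.
my $string = q{C*ID1*Mac*C release for EA's D*ID1*Spore1 game*D; D*ID1*Spore 1*D is better than D*ID2*Spore 2 game*D.};
my @ids = $string =~ /\*(ID\d+)\*/g;

print "parts: @parts\n";
print "ids:   @ids\n";
```

In the original question the text between the tagged segments (" release for EA's ", "; ", and so on) has no fixed pattern, which is why describing the segments themselves is the easier route there.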
