PerlMonks  

Split using multiple conditions

by juo (Curate)
on Jun 11, 2005 at 11:24 UTC (#465789=perlquestion)
juo has asked for the wisdom of the Perl Monks concerning the following question:

I have been trying to split a line using multiple conditions but have failed to do so. Does anybody have an idea?

FDR [62.10060.051-F] [62.10051.381] 0 1 0

For example, I want to split the above on spaces, but where there are brackets it should take the whole bracketed string and ignore any spaces inside the brackets. In total I want six fields, and I would like to do this in a single split.

# This only works up to the first bracket
my @feeder_line = split /\s+\[/;

Re: Split using multiple conditions
by bart (Canon) on Jun 11, 2005 at 11:31 UTC
    That's an official FAQ: perlfaq 4: How can I split a [character] delimited string except when inside [character]?

    Personally, I'd be inclined to use the dual approach: match the stuff between brackets, or nonspaces.

    $_ = 'FDR [62.10060.051-F] [62.10051.381] [this includes spaces!] 0 1 0';
    @parts = /\[.*?\]|[^\[\]\ ]+/g;
    $\ = "\n";
    print for @parts;

    Yes it can be that compact. Result:

    FDR
    [62.10060.051-F]
    [62.10051.381]
    [this includes spaces!]
    0
    1
    0

    A limitation is that this approach can't easily treat each single space as a separator, so two consecutive spaces never produce an empty string as a field.
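To illustrate that limitation, here is a minimal sketch. The sub names match_fields and split_fields are made up for this example. The first wraps the global match shown above. The second is an alternative that splits on each single space with a negative lookahead; it assumes the brackets are balanced and never nested. Only the second reports the empty field between two consecutive spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: the global match from above, wrapped in a sub.
# Call it in list context to get the fields back.
sub match_fields {
    my ($line) = @_;
    return $line =~ /\[.*?\]|[^\[\]\ ]+/g;
}

# Hypothetical helper: split on every single space that is not inside [...].
# The lookahead rejects a space that is followed by "]" with no bracket in
# between, i.e. a space inside a bracket pair.
# Assumes balanced, non-nested brackets.
sub split_fields {
    my ($line) = @_;
    return split / (?![^\[\]]*\])/, $line;
}

my $line = 'FDR  [this includes spaces!] 0';    # note the two spaces after FDR

print join('|', match_fields($line)), "\n";     # three fields
print join('|', split_fields($line)), "\n";     # four fields, the second one empty
```

On the original poster's sample line, which has single spaces only, both subs return the same six fields.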

Re: Split using multiple conditions
by mda2 (Hermit) on Jun 11, 2005 at 15:09 UTC
    bart gave a great response! But to address your original question: your split regex needs a quantifier:
    $_ = 'FDR [62.10060.051-F] [62.10051.381] 0 1 0';
    @f1 = split /\s+\[/;        #>> split only \s+ AND [ ...
    @f2 = split /\s+\[?/;       #>> split \s+ OR \s+[ ...
    @f3 = split /\]?\s+\[?/;    #>> split parts, without []...
    print join(" + ", @f1), "\n";
    print join(" + ", @f2), "\n";
    print join(" + ", @f3), "\n";
    __END__
    FDR + 62.10060.051-F] + 62.10051.381] 0 1 0
    FDR + 62.10060.051-F] + 62.10051.381] + 0 + 1 + 0
    FDR + 62.10060.051-F + 62.10051.381 + 0 + 1 + 0

    --
    Marco Antonio
    Rio-PM

Re: Split using multiple conditions
by ikegami (Pope) on Jun 11, 2005 at 16:15 UTC

    You can use a single expression like bart showed, but I find the following easier to understand (and maintain):

    # Separate the fields.
    my @feeder_line = split /\s+/;

    # Clean up the data:
    # Remove the brackets from the 2nd and 3rd fields.
    foreach (@feeder_line[1, 2]) {
        s/^\[//;
        s/\]$//;
    }
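
If the bracketed fields are not always the 2nd and 3rd, the same clean-up can be applied to every field. This is a minimal sketch, assuming (as the code above does) that no field contains whitespace. The sub name fields_without_brackets is made up for illustration, and it uses split ' ' rather than split /\s+/ so that leading whitespace does not produce an empty first field.

```perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace, then strip one leading "["
# and one trailing "]" from each field that has them.
sub fields_without_brackets {
    my ($line) = @_;
    my @fields = split ' ', $line;
    for (@fields) {
        s/^\[//;
        s/\]$//;
    }
    return @fields;
}

my @feeder_line = fields_without_brackets('FDR [62.10060.051-F] [62.10051.381] 0 1 0');
print scalar(@feeder_line), " fields: @feeder_line\n";
# prints: 6 fields: FDR 62.10060.051-F 62.10051.381 0 1 0
```
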
Re: Split using multiple conditions
by dws (Chancellor) on Jun 11, 2005 at 21:29 UTC

    Another alternative is to remove the brackets first, then split.

    my ($nobrackets = $_) =~ s/(\[|\])//g;
    my @feeder_line = split ' ', $nobrackets;

      Unfortunately, this doesn't do quite what the original poster asked for. Consider bart's code snippet above and plug it into yours:

      $_ = 'FDR [62.10060.051-F] [62.10051.381] [this includes spaces!] 0 1 0';
      ($nobrackets = $_) =~ s/(\[|\])//g;
      @feeder_line = split ' ', $nobrackets;
      $\ = "\n";
      print for @feeder_line;
      __END__
      FDR
      62.10060.051-F
      62.10051.381
      this
      includes
      spaces!
      0
      1
      0

      N.B.: I have removed the "my" declarations because my ($nobrackets = $_) ... results in the error message: Can't use global $_ in "my" at - line 1, near "= $_". The correct syntax is (my $nobrackets = $_) ...
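
For reference, a minimal sketch of the corrected declaration in a form that runs under strict. The sub name split_without_brackets is made up for this example, and the line is passed as an argument instead of being read from $_. As noted above, a bracketed field that contains spaces still ends up as several fields.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the line, delete every bracket, then split
# on whitespace.
sub split_without_brackets {
    my ($line) = @_;
    # The "my" goes inside the parentheses, so the substitution is
    # applied to the freshly declared copy.
    (my $nobrackets = $line) =~ s/[\[\]]//g;
    return split ' ', $nobrackets;
}

my @feeder_line = split_without_brackets('FDR [62.10060.051-F] [62.10051.381] 0 1 0');
print "$_\n" for @feeder_line;    # six lines, one field per line
```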
