
Nested grouping or capturing inside capturing

by eddor1614 (Beadle)
on Nov 22, 2011 at 19:13 UTC
eddor1614 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks. I'm looking for an elegant solution to an insignificant issue, already resolved :)
I have my data like this:

abc def data 123
ghi jkl "data with spaces" 456

and I need to print something like this:

abc-def-data
ghi-jkl-data with spaces

So spaces delimit the fields, but the third field can contain spaces if it is surrounded by double quotes.
Right now I'm doing this:

print "$1-$2-".($4 || $3) if /(\w+)\s(\w+)\s("([^"]+)"|(\w+))/;

I'm sure there is a better way. I need $3 to hold the third field, without the quotes. Does anyone know how to do it?

Replies are listed 'Best First'.
Re: Nested grouping or capturing inside capturing
by ww (Bishop) on Nov 22, 2011 at 19:47 UTC
    This is another case (IMO) where two statements would be both
    1. more elegant (because more readable & maintainable)
    2. and...less subject to brain-cramp, whilst originally writing it

    Why not use split as the first step, then apply s/"// to each element of the resulting array as the second step, before printing? Yes, you can argue that this adds two steps and thus slows the operation (and you would not be UNjustified in making that argument), but IMO the clarity is worth the cost.

      That was exactly what I did a few months ago. And today I spent a few minutes trying to find out why I had put s/"//g in there:

      foreach (<DATA>) {
          ($a,$b,$c) = ($3,$2,$1) if /^(\w+)\s(\S+)\s("([^"]+)"|(\w+))/;
          $a =~ s/"//g;
          . .

      I thought that if I need to match 3 fields, there should be a regexp that sets $1, $2 and $3. Just as a mental exercise. Thanks for your answer.
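For comparison, the two-step idea discussed above (capture first, strip the quotes afterwards) can be sketched as a small runnable script. This is only an illustration, not the code from the original program: the helper name format_record is hypothetical, and the sample records are assumed to arrive one per line.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "f1-f2-f3" for a record,
# or nothing (undef in scalar context) if the record does not match.
sub format_record {
    my ($line) = @_;

    # Step 1: capture the third field, quotes and all.
    my ($first, $second, $third) = $line =~ /^(\w+)\s(\w+)\s("[^"]+"|\w+)/
        or return;

    # Step 2: strip the double quotes, if there are any.
    $third =~ s/"//g;

    return "$first-$second-$third";
}

print format_record($_), "\n"
    for 'abc def data 123', 'ghi jkl "data with spaces" 456';
```

Keeping the quote removal as a separate, commented statement is what makes the later "why is s/"//g here?" question easy to answer.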

Re: Nested grouping or capturing inside capturing
by Not_a_Number (Parson) on Nov 22, 2011 at 20:26 UTC

    What if "data with spaces" is not necessarily the third (or nth) field?

    To make your code more robust, try Text::CSV:

    use Text::CSV;
    use feature 'say';    # needed for say()

    my $csv = Text::CSV->new( { sep_char => ' ' } );
    my $fh  = *DATA;    # Replace *DATA with your input file name

    while ( my $row = $csv->getline( $fh ) ) {
        say join '-', @$row;
    }

    __DATA__
    abc def data 123
    ghi jkl "data with spaces" 456
    "oh dear" "more data with spaces"

Re: Nested grouping or capturing inside capturing
by hbm (Hermit) on Nov 22, 2011 at 20:16 UTC

    $+ helps a little:

    #print "$1-$2-".($4 || $3) if /(\w+)\s(\w+)\s("([^"]+)"|(\w+))/;
    print "$1-$2-$+" if /(\w+)\s(\w+)\s("([^"]+)"|(\w+))/;

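A minimal sketch of why $+ works here: it holds the text of the highest-numbered capture group that actually took part in the match, so it is $4 for a quoted field and $5 for a bare one. The helper name join_fields and the third test record are assumptions added for illustration.

```perl
use strict;
use warnings;

# Same pattern as in the thread, compiled once.
my $re = qr/(\w+)\s(\w+)\s("([^"]+)"|(\w+))/;

# Hypothetical helper: joins the first three fields with dashes.
sub join_fields {
    my ($line) = @_;
    return unless $line =~ $re;
    # $+ is $4 when the quoted branch matched, $5 otherwise.
    return "$1-$2-$+";
}

print join_fields($_), "\n"
    for 'abc def data 123',
        'ghi jkl "data with spaces" 456',
        'a b "0" 1';
```

One side effect worth noting: with ($4 || $3), a quoted field containing just 0 is false, so the expression falls back to $3 and the quotes come back. $+ does not have that problem.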
Re: Nested grouping or capturing inside capturing
by remiah (Hermit) on Nov 23, 2011 at 13:40 UTC
    Named captures save you the ($4 || $3), but I wonder whether this is elegant or not.
    print "$1-$2-$+{third}\n"
        if m/ (\w+)\s
              (\w+)\s
              (?: "(?<third>[^"]+)" | (?<third>\w+))
            /x;
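The same idea can be taken one step further by naming all three fields, so no numbered variables are needed. This is a sketch that assumes Perl 5.10 or later (for named captures). The names first, second and named_fields are made up for the example. When a name is used for more than one group, %+ returns the leftmost group of that name that is defined.

```perl
use strict;
use warnings;

my $re = qr/
    (?<first>\w+)  \s
    (?<second>\w+) \s
    (?: "(?<third>[^"]+)"    # quoted: capture the text between the quotes
      | (?<third>\w+)        # bare word
    )
/x;

# Hypothetical helper: joins the three named fields with dashes.
sub named_fields {
    my ($line) = @_;
    return unless $line =~ $re;
    return "$+{first}-$+{second}-$+{third}";
}

print named_fields($_), "\n"
    for 'abc def data 123', 'ghi jkl "data with spaces" 456';
```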
