Nested grouping or capturing inside capturing

by eddor1614 (Beadle)
on Nov 22, 2011 at 19:13 UTC (#939519)
eddor1614 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks. I'm looking for an elegant solution to an insignificant issue, already resolved :)
I have my data like this:

abc def data 123
ghi jkl "data with spaces" 456

and I need to print something like this:

abc-def-data
ghi-jkl-data with spaces

So spaces delimit the fields, but the third field can contain spaces if it's surrounded by double quotes.
Right now I'm doing this:

print "$1-$2-".($4 || $3) if /(\w+)\s(\w+)\s("([^"]+)"|(\w+))/;

I'm sure there must be a better way. I need $3 to hold the third field, without the quotes. Does anyone know how to do it?
Thanks.

Re: Nested grouping or capturing inside capturing
by ww (Bishop) on Nov 22, 2011 at 19:47 UTC
    This is another case (IMO) where two statements would be both
    1. more elegant (because more readable & maintainable)
    2. and...less subject to brain-cramp, whilst originally writing it

    Why not use split as the first step, then apply s/"// to each element of the resulting array as the second step, prior to printing? Yes, you can argue that this adds two steps (and you would not be UNjustified in making that argument) and thus slows the operation but, IMO, the clarity is worth the cost.
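A minimal sketch of the two-step idea, under one loudly stated assumption: a plain split on whitespace would cut "data with spaces" into three pieces, so step one here collects tokens with a global match instead of split. The helper name fields_of is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the fields of one line, quotes removed.
sub fields_of {
    my ($line) = @_;

    # Step 1: tokenise. Each token is either a double-quoted run
    # or a run of non-space characters.
    my @fields = $line =~ /"[^"]*"|\S+/g;

    # Step 2: strip the double quotes from every token.
    s/"//g for @fields;

    return @fields;
}

for my $line ('abc def data 123', 'ghi jkl "data with spaces" 456') {
    my @f = fields_of($line);
    print join('-', @f[0 .. 2]), "\n";
}
# prints:
# abc-def-data
# ghi-jkl-data with spaces
```

The two steps stay separate, so the reason for the s/"//g is visible right next to the tokeniser that makes it necessary.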

      That was exactly what I did a few months ago. And today I spent a few minutes trying to find out why I had put in the s/"//g:

      foreach (<DATA>) {
          ($a,$b,$c) = ($3,$2,$1) if /^(\w+)\s(\S+)\s("([^"]+)"|(\w+))/;
          $a =~ s/"//g;
          . .

      I thought that if I need to match 3 fields, there should be a regexp that sets $1, $2 and $3. Just as a mental exercise. Thanks for your answer.
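For the mental exercise of getting exactly $1, $2 and $3, one option that may fit is the branch-reset group (?|...), available from Perl 5.10. Inside it, each alternative restarts capture numbering, so both the quoted and the unquoted branch fill group 3. This is only a sketch; the helper name first_three is hypothetical.

```perl
use strict;
use warnings;
use 5.010;    # the branch-reset group (?|...) needs Perl 5.10 or later

# Hypothetical helper: returns ($1, $2, $3) for one line,
# or an empty list when the line does not match.
sub first_three {
    my ($line) = @_;

    # Both alternatives inside (?|...) capture into group 3,
    # and the quotes sit outside the capturing parentheses.
    return $line =~ /^(\w+)\s(\w+)\s(?|"([^"]+)"|(\w+))/;
}

for my $line ('abc def data 123', 'ghi jkl "data with spaces" 456') {
    my @f = first_three($line);
    print join('-', @f), "\n" if @f;
}
# prints:
# abc-def-data
# ghi-jkl-data with spaces
```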

Re: Nested grouping or capturing inside capturing
by hbm (Hermit) on Nov 22, 2011 at 20:16 UTC

    $+ helps a little:

    #print "$1-$2-".($4 || $3) if /(\w+)\s(\w+)\s("([^"]+)"|(\w+))/;
    print "$1-$2-$+" if /(\w+)\s(\w+)\s("([^"]+)"|(\w+))/;
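To see why $+ works here: it holds the text of the highest-numbered capture group that actually took part in the match. With a quoted field that is group 4, with a bare word it is group 5. A small sketch to check both cases (the helper name join_with_plus is made up):

```perl
use strict;
use warnings;

# Hypothetical helper: builds "field1-field2-field3" using $+.
sub join_with_plus {
    my ($line) = @_;
    return unless $line =~ /(\w+)\s(\w+)\s("([^"]+)"|(\w+))/;

    # Quoted field: groups 3 and 4 matched, so $+ is $4 (no quotes).
    # Bare word:    groups 3 and 5 matched, so $+ is $5.
    return "$1-$2-$+";
}

print join_with_plus('abc def data 123'), "\n";
print join_with_plus('ghi jkl "data with spaces" 456'), "\n";
# prints:
# abc-def-data
# ghi-jkl-data with spaces
```

This relies on the inner groups being the last ones in the pattern; adding another capture group after the alternation would change what $+ returns.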
Re: Nested grouping or capturing inside capturing
by Not_a_Number (Parson) on Nov 22, 2011 at 20:26 UTC

    What if "data with spaces" is not necessarily the third (or nth) field?

    To make your code more robust, try Text::CSV:

    use feature 'say';    # say is not enabled by default
    use Text::CSV;

    my $csv = Text::CSV->new( { sep_char => ' ' } );
    my $fh = *DATA;    # Replace *DATA with a filehandle opened on your input file

    while ( my $row = $csv->getline( $fh ) ) {
        say join '-', @$row;
    }

    __DATA__
    abc def data 123
    ghi jkl "data with spaces" 456
    "oh dear" "more data with spaces"
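If installing a CPAN module is not an option, a similar effect can probably be had with Text::ParseWords, which ships with Perl. This swaps in a different module from the one suggested above, so treat it as a sketch: quotewords also gives single quotes and backslashes special meaning, which may or may not suit your data. The helper name split_quoted is hypothetical.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical helper: splits one (already chomped) line on whitespace,
# keeping quoted runs together and dropping the quotes.
sub split_quoted {
    my ($line) = @_;

    # Arguments: delimiter pattern, keep-quotes flag (0 = drop them), line.
    return quotewords('\s+', 0, $line);
}

for my $line (
    'abc def data 123',
    'ghi jkl "data with spaces" 456',
    '"oh dear" "more data with spaces"',
) {
    print join('-', split_quoted($line)), "\n";
}
# prints:
# abc-def-data-123
# ghi-jkl-data with spaces-456
# oh dear-more data with spaces
```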
Re: Nested grouping or capturing inside capturing
by remiah (Hermit) on Nov 23, 2011 at 13:40 UTC
    A named capture saves you the ($4 || $3), but I wonder whether this is elegant or not.
    print "$1-$2-$+{third}\n"
        if m/ (\w+)\s
              (\w+)\s
              (?: "(?<third>[^"]+)" | (?<third>\w+) )
            /x;
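Taking the named-capture idea one step further, all three fields can be named, so the numbered variables drop out entirely. When a name is used more than once, $+{name} gives the leftmost group of that name that is defined, which is what makes the repeated third work. A sketch, assuming Perl 5.10 or later; parse_named and the names first and second are made up for illustration.

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+ need Perl 5.10 or later

# Hypothetical helper: returns a hash reference with the keys
# first, second and third, or nothing when the line does not match.
sub parse_named {
    my ($line) = @_;
    return unless $line =~ m/
        (?<first>\w+)  \s
        (?<second>\w+) \s
        (?: "(?<third>[^"]+)" | (?<third>\w+) )
    /x;

    # Copy %+ right away: the next successful match would overwrite it.
    my %fields = %+;
    return \%fields;
}

for my $line ('abc def data 123', 'ghi jkl "data with spaces" 456') {
    my $f = parse_named($line) or next;
    print "$f->{first}-$f->{second}-$f->{third}\n";
}
# prints:
# abc-def-data
# ghi-jkl-data with spaces
```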
