PerlMonks  

Re^2: How to get ($1, $2, ...)?

by ferreira (Chaplain)
on Feb 16, 2007 at 16:15 UTC ( #600469=note )


in reply to Re: How to get ($1, $2, ...)?
in thread How to get ($1, $2, ...)?

That won't do. I am interested in the order of the regexes and in resuming from where the previous one left off. If one uses /$re/, the search is reset to the start of the string each time. With /$re/gc, on the other hand, I can write code that looks for things such as /Title: (.*?)$/, /Author: (.*?)$/, and /Publisher: (.*?)$/, but does not accept them if they come out of order (like "Publisher... Title... Author...").

I have been thinking that I should have phrased this question differently, asking directly for a way to get ($1, $2, ...) in a generic manner and then showing the code for sub _groups. The background that inspired me to formulate the problem could be added as a complement, without obscuring what I was looking for.
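The ordered matching described above, together with a generic way to collect ($1, $2, ...), might be sketched as follows. This is only an illustration under stated assumptions: the names captured_groups and match_in_order are hypothetical (the original _groups sub is not shown in this node), and the three field regexes are made-up examples anchored with ^ and /m.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild ($1, $2, ...) for the last successful match
# against $str from the offset arrays @- and @+ (undef for unused groups).
sub captured_groups {
    my ($str) = @_;
    return map {
        defined $-[$_] ? substr($str, $-[$_], $+[$_] - $-[$_]) : undef
    } 1 .. $#+;
}

# Hypothetical driver: apply the regexes in order, each one resuming at
# pos() where the previous one stopped. Returns one array ref of captured
# groups per regex, or an empty list if any regex fails to match.
sub match_in_order {
    my ($text, @regexes) = @_;
    my @found;
    pos($text) = 0;
    for my $re (@regexes) {
        return unless $text =~ /$re/gc;    # scalar context: one match per call
        push @found, [ captured_groups($text) ];
    }
    return @found;
}

# Illustrative field regexes (assumed, not taken from the original post).
my @fields = (
    qr/^Title: (.*?)$/m,
    qr/^Author: (.*?)$/m,
    qr/^Publisher: (.*?)$/m,
);

my @in_order = match_in_order("Title: T\nAuthor: A\nPublisher: P\n", @fields);
my @shuffled = match_in_order("Publisher: P\nTitle: T\nAuthor: A\n", @fields);
printf "in order: %d fields, shuffled: %d fields\n",
    scalar @in_order, scalar @shuffled;
```

Because each match starts at the position left by the previous one, the shuffled input fails at the Publisher step: by then the search has already moved past that line.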

Replies are listed 'Best First'.
Re^3: How to get ($1, $2, ...)?
by varian (Chaplain) on Feb 16, 2007 at 16:35 UTC
    If you really need to hang on to the //gmc regex construct, then you could opt to combine the regexes as alternatives in a single pattern. Afterwards, split the flat list of captures per regex, based on each field's position within the group.
    Update:
    - note that the order of the alternatives determines which one is tried first at each position (and that's what you wanted, right?)
    - since the combined regex is a single expression, your program scans the text only once -> performance gain

    See below for an example to get the idea.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $text = <<TEXT;
    Title: The Moor's Last Sigh, Author: Salman Rushdie
    Title: The God of Small Things, Author: Arundhati Roy
    Title: A very special title, Author: varianf varians
    TEXT

    # /m is set on each qr// because a compiled regex keeps its own flags
    # when interpolated; the outer /m would not reach the $ inside it.
    my $re  = qr/Title: (.*?), Author: (\w+) (\w+)$/m;   # 3 groups here
    my $re2 = qr/Title: (.*?special.*?), Author: (\w+) (\w+)$/m;

    # Each match yields 6 fields: 3 for $re2, then 3 for $re (undef if unused).
    my (@MatchAll) = ($text =~ /$re2|$re/mgc);
    my (@Match1, @Match2);
    for (my $i = 0; $i < @MatchAll; $i = $i + 6) {
        # array slices (@MatchAll[...]) copy all three fields of a group
        defined $MatchAll[$i]   && push @Match2, @MatchAll[$i .. $i+2];
        defined $MatchAll[$i+3] && push @Match1, @MatchAll[$i+3 .. $i+5];
    }
    print Dumper(\@Match2), Dumper(\@Match1);

    Output:

    $ perl reg.pl
    $VAR1 = [
              'A very special title',
              'varianf',
              'varians'
            ];
    $VAR1 = [
              'The Moor\'s Last Sigh',
              'Salman',
              'Rushdie',
              'The God of Small Things',
              'Arundhati',
              'Roy'
            ];
    P.S.: I hardcoded the boundaries of the captured fields to keep the code short here. Naturally, this part could/should be coded more flexibly if you deal with a lot of regexes.
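One way the hardcoded boundaries might be made flexible is to count the capture groups of each regex and slice the flat result accordingly. The sketch below is an assumption-laden illustration, not varian's code: group_count and match_alternatives are hypothetical names, every regex is assumed to have at least one capture group, and a regex counts as "the one that matched" when any of its groups is defined.

```perl
use strict;
use warnings;

# Count the capture groups of a compiled regex. Matching the empty string
# against "$re|" always succeeds (the empty alternative matches), and $#+
# then holds the number of groups in the pattern.
sub group_count {
    my ($re) = @_;
    "" =~ /$re|/ or die "empty alternative should always match";
    return $#+;
}

# Combine the regexes as alternatives, match once over the text, and return
# one array ref per regex holding the group lists of its matches.
sub match_alternatives {
    my ($text, @regexes) = @_;
    my @counts = map { group_count($_) } @regexes;
    my $total  = 0;
    $total += $_ for @counts;

    my $combined = join '|', @regexes;    # qr objects keep their own flags
    my @flat     = $text =~ /$combined/g;

    my @per_regex = map { [] } @regexes;
    while (my @row = splice @flat, 0, $total) {
        my $offset = 0;
        for my $i (0 .. $#regexes) {
            my @groups = @row[ $offset .. $offset + $counts[$i] - 1 ];
            push @{ $per_regex[$i] }, [@groups] if grep { defined } @groups;
            $offset += $counts[$i];
        }
    }
    return @per_regex;
}

my $books = "Title: The Moor's Last Sigh, Author: Salman Rushdie\n"
          . "Title: The God of Small Things, Author: Arundhati Roy\n"
          . "Title: A very special title, Author: varianf varians\n";

my ($special, $plain) = match_alternatives(
    $books,
    qr/Title: (.*?special.*?), Author: (\w+) (\w+)$/m,
    qr/Title: (.*?), Author: (\w+) (\w+)$/m,
);
printf "special: %d, plain: %d\n", scalar @$special, scalar @$plain;
```

Adding a further regex then only means appending it to the argument list; the offsets follow from the group counts.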
Re^3: How to get ($1, $2, ...)?
by eric256 (Parson) on Feb 16, 2007 at 16:35 UTC

    Since he is comparing line by line instead of the whole document at once, it doesn't matter that the next regex starts at the beginning even if the last one matched. I know that often my problem isn't getting Perl to do what I want; it is thinking I want Perl to do one thing when really there is a better solution. That's why it is good to describe your actual problem: someone might see a solution you are missing, or at the very least the insight into the problem will allow people to agree that you are doing it the best way. Either way, you get good information!


    ___________
    Eric Hodges
Re^3: How to get ($1, $2, ...)?
by Anno (Deacon) on Feb 16, 2007 at 16:40 UTC
    I'd try a combination of m//g in scalar context and using the \G marker. If necessary, you can control where it matches by setting pos().

    Sorry for not presenting a coded solution, I don't understand your problem well enough to give one.

    Anno
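The combination of m//g in scalar context, \G, and pos() that Anno mentions could look roughly like the sketch below. It is a guess at the intent rather than a solution from the thread: parse_record, the field names, and the one-field-per-line layout are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical parser: each field must begin exactly where the previous
# match ended (\G), so fields that are out of order or separated by other
# text are rejected. $start optionally sets where matching begins, via pos().
sub parse_record {
    my ($text, $start) = @_;
    my %record;
    pos($text) = defined $start ? $start : 0;
    for my $field (qw(Title Author Publisher)) {
        # scalar-context m//g advances pos(); /c keeps pos() on failure
        $text =~ /\G\Q$field\E: (.*)\n/gc or return;
        $record{$field} = $1;
    }
    return \%record;
}

my $good    = parse_record("Title: T\nAuthor: A\nPublisher: P\n");
my $bad     = parse_record("Publisher: P\nTitle: T\nAuthor: A\n");
my $skipped = parse_record("junk\nTitle: T\nAuthor: A\nPublisher: P\n", 5);
print "good record has publisher $good->{Publisher}\n" if $good;
```

Compared with plain /gc matching, \G is stricter: nothing may sit between one field and the next, which may or may not be what the original problem calls for.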
