Re: How to get ($1, $2, ...)?

Perhaps put your regexes in ~~a loop~~ an array?

#!/usr/bin/perl 

use strict;
use warnings;
use Data::Dumper;

my @res = (
  qr/Title: (.*?), Author: (\w+) (\w+)$/,
  qr/Title: (.*?), Author: (\w+) (\w+) Publisher: (\w+)$/,
  qr/Title: (.*?), Author: (\w+) (\w+) Publisher: (\w+) Year: (\w+)$/,
);

my @answers;
while (my $line = <DATA>){
  for my $re (@res){
    my @results;
    if (@results = $line =~ /$re/){
      push @answers, [@results];
    }
  }
}

print Dumper \@answers;

__DATA__
Title: The Moor's Last Sigh, Author: Salman Rushdie
Title: The God of Small Things, Author: Arundhati Roy
Title: one, Author: two three Publisher: four
Title: five, Author: six seven Publisher: eight Year: nine

output:

$VAR1 = [
          [
            'The Moor\'s Last Sigh',
            'Salman',
            'Rushdie'
          ],
          [
            'The God of Small Things',
            'Arundhati',
            'Roy'
          ],
          [
            'one',
            'two',
            'three',
            'four'
          ],
          [
            'five',
            'six',
            'seven',
            'eight',
            'nine'
          ]
        ];

updated:
tinkered with the format of the output
update 2:
forgot to update the code. :-( Thanks to Tanktalus for spotting it.


Re^2: How to get ($1, $2, ...)? by Anno (Deacon) on Feb 16, 2007 at 16:28 UTC
Reproducing a bit of your code:

    my @answers;
    while (my $line = <DATA>){
      for my $re (@res){
        my @results;
        if (@results = $line =~ /$re/){
          push @answers, ["@results"];

Why the quotes around @results? They weren't in the version that produced the output you're showing.

        }
      }
    }

You're also making an unnecessary copy of the array @results. Its scope is the loop body, so you have a new one each time through. Just take the reference:

    # ...
      for my $re (@res){
        my @results;
        push @answers, \ @results if @results = $line =~ $re;
      }
    # ...

Anno
Re^3: How to get ($1, $2, ...)? by Anno (Deacon) on Feb 16, 2007 at 21:47 UTC
Oh, or even

    # ...
    push @answers, grep @$_, map [ $line =~ $_], @res;
    # ...

instead of the for loop over `@res`. I realize I'm expanding on a non-solution to the original question. It's art for art's sake, if that's allowed.

Anno
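For anyone who wants to run the map/grep variant on its own, here is a minimal sketch. The two patterns are taken from the parent post; the sample lines in `@lines` are made up for illustration. It relies on a failed match returning the empty list in list context, so `grep` can drop the empty array refs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Patterns from the parent post; each one is anchored at the end of the line.
my @res = (
    qr/Title: (.*?), Author: (\w+) (\w+)$/,
    qr/Title: (.*?), Author: (\w+) (\w+) Publisher: (\w+)$/,
);

# Sample input, invented for this sketch.
my @lines = (
    'Title: one, Author: two three',
    'Title: four, Author: five six Publisher: seven',
    'no match here',
);

my @answers;
for my $line (@lines) {
    # map builds one array ref per pattern; a pattern that fails yields
    # an empty array ref, which grep { @$_ } then throws away.
    push @answers, grep { @$_ } map { [ $line =~ $_ ] } @res;
}

print join(' | ', @$_), "\n" for @answers;
```

Note that this shortcut only works for patterns that contain at least one capture group: a successful match without groups returns `(1)` in list context.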
Re^2: How to get ($1, $2, ...)? by ferreira (Chaplain) on Feb 16, 2007 at 16:15 UTC
That won't do. I am interested in the order of the regexes and in resuming from where the previous one left off. If one uses `/$re/`, the search is reset each time. With `/$re/gc`, on the other hand, I can write code that looks for things such as `/Title: (.*?)$/`, `Author: (.*?)`, and `Publisher: (.*?)`, but does not accept them if they come out of order (like `"Publisher... Title... Author..."`).

I have been thinking that I should have phrased this question differently, asking directly for a way to get `($1, $2, ...)` in a generic manner and then showing the code for sub `_groups`. The background that inspired the problem could then have been added as a complement, without obscuring what I was looking for.
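One generic way to collect `($1, $2, ...)` after a match that was not made in list context is to read the offset arrays `@-` and `@+`. The sketch below is only an illustration of that idea: `captured_groups` is a hypothetical helper (not the `_groups` sub from the original question), and the sample line and pattern are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: returns ($1, $2, ...) for the most recent successful
# match against $str. Call it before any other match resets @- and @+.
# Groups that did not take part in the match come back as undef.
sub captured_groups {
    my ($str) = @_;
    return map {
        defined $-[$_] ? substr($str, $-[$_], $+[$_] - $-[$_]) : undef
    } 1 .. $#+;    # $#+ is the number of groups in the last successful match
}

my $line = 'Title: one, Author: two three';
my @groups;
if ($line =~ /Title: (.*?), Author: (\w+) (\w+)(?: Publisher: (\w+))?$/) {
    @groups = captured_groups($line);
}

print join(', ', map { defined $_ ? $_ : '(undef)' } @groups), "\n";
```

Because the helper takes the matched string as an argument, it works the same way after a scalar-context `/gc` match, where the list-context return value is not available.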
Re^3: How to get ($1, $2, ...)? by varian (Chaplain) on Feb 16, 2007 at 16:35 UTC
If you really need to hang on to the //gmc regex construct, then you could opt to include the regexes as alternatives. Afterwards, split the grouped result per regex based on field position in the group.

Update:
- note that the order of the alternatives influences which one will match first each time (and that's what you wanted, right?)
- since the total regex is just one expression, your program will examine the text only once -> performance gain

See below for an example to get the idea.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;    # needed for the dumps shown under "Output"

    my $text = <<TEXT;
    Title: The Moor's Last Sigh, Author: Salman Rushdie
    Title: The God of Small Things, Author: Arundhati Roy
    Title: A very special title, Author: varianf varians
    TEXT
    my @answers;
    # /m has to be on the qr// itself: a compiled pattern keeps its own flags
    my $re = qr/Title: (.*?), Author: (\w+) (\w+)$/m; # 3 groups here
    my $re2= qr/Title: (.*?special.*?), Author: (\w+) (\w+)$/m;
    my (@MatchAll) = ($text =~ /$re2|$re/mgc);
    my (@Match1,@Match2);
    # each match contributes 6 slots: 3 for $re2, then 3 for $re
    for (my $i=0;$i<@MatchAll;$i=$i+6) {
      defined $MatchAll[$i] && push @Match2, @MatchAll[$i..$i+2];
      defined $MatchAll[$i+3] && push @Match1, @MatchAll[$i+3..$i+5];
    }
    print Dumper \@Match2;
    print Dumper \@Match1;

Output:

    $ perl reg.pl
    $VAR1 = [
              'A very special title',
              'varianf',
              'varians'
            ];
    $VAR1 = [
              'The Moor\'s Last Sigh',
              'Salman',
              'Rushdie',
              'The God of Small Things',
              'Arundhati',
              'Roy'
            ];

P.S.: I hardcoded the boundaries for the captured fields to shortcut the coding here. Naturally this part could/should be coded more flexibly if you deal with a lot of regexes.
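The hardcoded field boundaries could be derived instead of typed in. What follows is a rough sketch under a few assumptions: `group_count` is a hypothetical helper that counts capture groups by matching the empty string against the pattern plus an empty alternative and then reading `$#+`; every pattern's first group is assumed to take part whenever that pattern matches; and `/m` is put on each `qr//` so that `$` matches at line ends.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: number of capture groups in a compiled pattern.
# The empty alternative always matches, and $#+ then reports how many
# groups the whole pattern contains.
sub group_count {
    my ($re) = @_;
    '' =~ /(?:$re)|/ or die "empty alternative should always match";
    return $#+;
}

my $text = join "\n",
    "Title: The Moor's Last Sigh, Author: Salman Rushdie",
    "Title: The God of Small Things, Author: Arundhati Roy",
    "Title: A very special title, Author: varianf varians";

# Order matters: earlier alternatives are tried first at each position.
my @res = (
    qr/Title: (.*?special.*?), Author: (\w+) (\w+)$/m,
    qr/Title: (.*?), Author: (\w+) (\w+)$/m,
);

my @counts   = map { group_count($_) } @res;
my $stride   = sum @counts;          # slots returned per match
my $combined = join '|', @res;       # stringified qr// keeps its own flags

my @all       = $text =~ /$combined/g;
my @per_regex = map { [] } @res;

while (my @chunk = splice @all, 0, $stride) {
    my $offset = 0;
    for my $i (0 .. $#res) {
        my @groups = @chunk[ $offset .. $offset + $counts[$i] - 1 ];
        # Assumption: group 1 of a pattern is defined iff that pattern matched.
        push @{ $per_regex[$i] }, \@groups if defined $groups[0];
        $offset += $counts[$i];
    }
}

for my $i (0 .. $#res) {
    print "pattern $i: ", join(' / ', map { "@$_" } @{ $per_regex[$i] }), "\n";
}
```

Unlike the flat `@Match1`/`@Match2` arrays, this keeps one array ref per match, which makes it easier to tell records apart when a pattern matches several times.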
Re^3: How to get ($1, $2, ...)? by eric256 (Parson) on Feb 16, 2007 at 16:35 UTC
Since he is comparing line by line instead of the whole document at once, it doesn't matter that the next regex starts at the beginning even if the last one matched.

I know that often my problem isn't getting Perl to do what I want; it is thinking I want Perl to do one thing when really there is a better solution. That's why it is good to provide your actual problem: someone might see a solution you are missing, or at the very least the insight into the problem will let people agree that you are doing it the best way. Either way you get good information!

___________
Eric Hodges
Re^3: How to get ($1, $2, ...)? by Anno (Deacon) on Feb 16, 2007 at 16:40 UTC
I'd try a combination of `m//g` in scalar context and using the `\G` marker. If necessary, you can control where it matches by setting pos(). Sorry for not presenting a coded solution, I don't understand your problem well enough to give one.

Anno
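As a rough illustration of that combination (not a solution to the original problem, whose exact requirements are unclear), the sketch below walks a string with scalar-context `m//gc` and `\G`, so each pattern must match exactly where the previous one stopped. The sub name `parse_in_order`, the three field patterns, and the sample strings are all assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sub: returns the captured fields if they appear in the
# expected order, or an empty list otherwise.
sub parse_in_order {
    my ($text) = @_;

    # \G anchors each pattern at the current pos($text).
    my @steps = (
        qr/\GTitle: (\w+)\s*/,
        qr/\GAuthor: (\w+)\s*/,
        qr/\GPublisher: (\w+)\s*/,
    );

    my @found;
    pos($text) = 0;    # start scanning at the beginning of the string
    for my $re (@steps) {
        # Scalar-context /g advances pos($text) on success; /c keeps pos()
        # where it was on failure instead of resetting it.
        $text =~ /$re/gc or return;
        push @found, $1;
    }
    return @found;
}

my @in_order  = parse_in_order('Title: one Author: two Publisher: three');
my @shuffled  = parse_in_order('Publisher: three Title: one Author: two');

print "in order: @in_order\n";
print "shuffled: ", (@shuffled ? "@shuffled" : '(rejected)'), "\n";
```

Setting `pos($text)` to some other offset before the loop would make the scan start there instead.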