
Re: Re: A demanding parser

by gmax (Abbot)
on Jan 26, 2002 at 18:56 UTC (#141785)

in reply to Re: A demanding parser
in thread A demanding parser

Thanks for the tip. I am not sure I understand how to use it, though.
My purpose, as you have pointed out, is to replace Regexp::Common with some normal Perl RegEx. By normal I mean an expression that does not depend on a module.
As for the motivation, you guessed right that it's related to education. Personally, I wouldn't bother. I need to distribute this module as part of a larger set of educational material aimed at building up a huge database. I would like to avoid pointing to a CPAN module, since many people in the audience are not experienced Perl users. They should just copy this module to their computers and execute the import/export script.
Of course I can provide them with a copy of the module, or instruct them to connect to the CPAN, download the module and install it, or use "perl -MCPAN -e shell", but it would steal valuable time from my lectures.

That aside, here is a test script for your RegEx, which does not seem to give me what I want.
Was it my misunderstanding, or were you trying to show me how to catch the inner parenthesized text only?
#!/usr/bin/perl -w
use strict;
use Regexp::Common;

my $re = qr{
    \(
    (?:
        (?> [^()]+ )
      |
        (??{ $re })
    )*
    \)
}x;

my $input = "aa bb cc (dd ee (ff gg (hh) jj) kk)";

print "With module\n";
while ($input =~ m/(\w+|$RE{balanced}{-parens=>'()'})\s*/g) {
    print "$1\n";
}

print "With recursive RegExp\n";
while ($input =~ m/(\w+|$re)\s*/g) {
    print "$1\n";
}
__END__
# output:
With module
aa
bb
cc
(dd ee (ff gg (hh) jj) kk)
With recursive RegExp
aa
bb
cc
dd
ee
ff
gg
(hh)
jj
kk
Found the problem. Recursive RegExes don't work properly with use strict. Replacing

my $re = qr{ \( (?: (?> [^()]+ ) | (??{ $re }))* \) }x;

with

no strict 'vars';
$rec_re = qr{ \( (?: (?> [^()]+ ) | (??{ $rec_re }))* \) }x;
my $re = $rec_re;
use strict;

makes both regexes produce the same output.
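For what it's worth, here is a minimal sketch of the same fix that keeps strict switched on throughout. It assumes the underlying issue is one of scoping: in "my $re = qr{ ... (??{ $re }) ... }x" the new lexical is not yet in scope inside its own initializer, so the inner $re is not the variable being assigned. Declaring a package variable with "our" before assigning to it avoids that. The helper name tokens() is hypothetical and only there for illustration, and I have tried this reasoning only against a reasonably modern Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Declare the package variable first, in its own statement, so the name
# is already known when the pattern that refers to it is compiled.
our $rec_re;
$rec_re = qr{
    \(
    (?:
        (?> [^()]+ )     # run of non-parenthesis characters, no backtracking
      |
        (??{ $rec_re })  # a nested balanced group, matched recursively
    )*
    \)
}x;

# Hypothetical helper: split the input into bare words and balanced groups.
sub tokens {
    my ($input) = @_;
    my @found;
    while ($input =~ m/(\w+|$rec_re)\s*/g) {
        push @found, $1;
    }
    return @found;
}

print "$_\n" for tokens("aa bb cc (dd ee (ff gg (hh) jj) kk)");
```

With the test input from the script above, this should give the same four tokens as the Regexp::Common version, with the whole parenthesized group coming back as a single item.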
 _  _ _  _  
(_|| | |(_|><
