
Re: Grouped characters inside character class.

by Enlil (Parson)
on Jun 02, 2006 at 01:39 UTC (#553198)


in reply to Grouped characters inside character class.

This works:

use strict;
use warnings;

my $source = 'Posted by mad max beyond eggdome on September 04, 2003';
if ( $source =~ /^Posted by (.*?) on /i ) {
    print qq("$1") . "\n";
}

which matches:

C:\>perl -MYAPE::Regex::Explain -e "print YAPE::Regex::Explain->new(qr/^Posted by (.*?) on /)->explain()"
The regular expression:

(?-imsx:^Posted by (.*?) on )

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  Posted by                'Posted by '
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
   on                      ' on '
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

-enlil


Re^2: Grouped characters inside character class.
by m.att (Pilgrim) on Jun 02, 2006 at 01:49 UTC
    The only issue with this regex (and the poster's original idea as well) is that it will not properly capture the username if the username itself contains ' on '. For example:

    my $source = 'Posted by getting on your nerves on September 04, 2003';

    It's probably a good idea to anchor on more than just the ' on ' part, for example on the date that follows it:

    my $source = 'Posted by getting on your nerves on September 04, 2003';
    if ($source =~ /Posted by (.+?) on \w+ \d{2}, \d{4}$/) {
        ...
    }

    Regards

    m.att
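
    To make the difference concrete, here is a minimal sketch that runs both patterns from this thread against the sample string. The `extract_user` helper and the variable names are illustrative assumptions, not part of either post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the first capture
# group, or undef if the pattern does not match.
sub extract_user {
    my ( $source, $pattern ) = @_;
    my ($user) = $source =~ $pattern;
    return $user;
}

my $source = 'Posted by getting on your nerves on September 04, 2003';

# Lazy capture that stops at the first ' on '.
my $first_on = qr/^Posted by (.*?) on /i;

# Lazy capture that must be followed by ' on ', a date, and end of string.
my $date_anchored = qr/Posted by (.+?) on \w+ \d{2}, \d{4}$/;

my $short = extract_user( $source, $first_on );
my $full  = extract_user( $source, $date_anchored );

print qq("$short"\n);
print qq("$full"\n);
```

    The first pattern stops at the earliest ' on ' and yields "getting"; the date-anchored pattern keeps extending the lazy capture until a date follows, and yields "getting on your nerves".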

      The only issue with this regex (and the poster's original idea as well) is it will not properly capture the username if it contains ' on '

      Noted. So we capture up until the last ' on '.

      use strict;
      use warnings;

      my $source = 'Posted by getting on my nerves on September 04, 2003';
      if ( $source =~ /^Posted by (.*?) on (?!.* on )/i ) {
          print qq("$1") . "\n";
      }

      blokhead is right... and I will go lick my wounds now.

        That's the same as just being greedy:
        /^Posted by (.*) on /i

        blokhead
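
        A quick check of this on the sample string from the post above; it is only a sketch, and it shows agreement for this one input rather than proving the two patterns equivalent in general. The variable names are illustrative.

```perl
use strict;
use warnings;

my $source = 'Posted by getting on my nerves on September 04, 2003';

# Lazy capture plus a negative lookahead: accept an ' on ' only if
# no further ' on ' appears later on the same line.
my ($with_lookahead) = $source =~ /^Posted by (.*?) on (?!.* on )/i;

# Plain greedy capture: .* grabs as much as it can, then backtracks
# to the last ' on ' in the string.
my ($greedy) = $source =~ /^Posted by (.*) on /i;

print qq("$with_lookahead"\n);
print qq("$greedy"\n);
print $with_lookahead eq $greedy ? "same capture\n" : "different capture\n";
```

        On this input both patterns capture "getting on my nerves".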

Re^2: Grouped characters inside character class.
by the_0ne (Pilgrim) on Jun 02, 2006 at 01:45 UTC
    Thanks enlil for the response. That does work. My only problem is that I've been warned on PerlMonks several times against using .*. I guess in this case it would be fine, though, because I do want everything grabbed up until the (space)on(space). Maybe I was just trying to be too fancy. :)
