PerlMonks  

Re: Grouped characters inside character class.

by Enlil (Parson)
on Jun 02, 2006 at 01:39 UTC (#553198)


in reply to Grouped characters inside character class.

This works:

use strict;
use warnings;

my $source = 'Posted by mad max beyond eggdome on September 04, 2003';
if ( $source =~ /^Posted by (.*?) on /i ) {
    print qq("$1") . "\n";
}

which matches:

C:\>perl -MYAPE::Regex::Explain -e "print YAPE::Regex::Explain->new(qr/^Posted by (.*?) on /)->explain()"
The regular expression:

(?-imsx:^Posted by (.*?) on )

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  Posted by                'Posted by '
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
   on                      ' on '
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

-enlil


Re^2: Grouped characters inside character class.
by the_0ne (Pilgrim) on Jun 02, 2006 at 01:45 UTC
    Thanks, Enlil, for the response. That does work. My only problem is that I've been warned on PerlMonks several times against using .*. I guess in this case it would be fine, though, because I do want everything grabbed up to the (space)on(space). Maybe I was just trying to be too fancy. :)
Re^2: Grouped characters inside character class.
by m.att (Pilgrim) on Jun 02, 2006 at 01:49 UTC
    The only issue with this regex (and the poster's original idea as well) is it will not properly capture the username if it contains ' on '. For example:

    my $source = 'Posted by getting on your nerves on September 04, 2003';

    It's probably a good idea to anchor on more than just the ' on ' part like:

    my $source = 'Posted by getting on your nerves on September 04, 2003';
    if ($source =~ /Posted by (.+?) on \w+ \d{2}, \d{4}$/) {
        ...
    }

    Regards

    m.att
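
To make m.att's point concrete, here is a minimal sketch that runs both patterns from the thread against the same input. The helper names capture_simple and capture_anchored are hypothetical and exist only for this illustration; the regexes themselves are the ones posted above.

```perl
use strict;
use warnings;

# Hypothetical helper: the simple non-greedy pattern from the parent node.
# It stops at the first ' on ' it can find.
sub capture_simple {
    my ($source) = @_;
    return $source =~ /^Posted by (.*?) on /i ? $1 : undef;
}

# Hypothetical helper: m.att's pattern, which also requires the date
# to follow ' on ' and to end the string.
sub capture_anchored {
    my ($source) = @_;
    return $source =~ /Posted by (.+?) on \w+ \d{2}, \d{4}$/ ? $1 : undef;
}

my $source = 'Posted by getting on your nerves on September 04, 2003';

print qq(simple:   "), capture_simple($source),   qq("\n);  # "getting"
print qq(anchored: "), capture_anchored($source), qq("\n);  # "getting on your nerves"
```

The anchored pattern assumes the date always looks like "September 04, 2003" (month name, two-digit day, four-digit year); a differently formatted date would make it fail to match at all.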

      The only issue with this regex (and the poster's original idea as well) is it will not properly capture the username if it contains ' on '

      Noted. So we capture up until the last ' on '.

      use strict;
      use warnings;

      my $source = 'Posted by getting on my nerves on September 04, 2003';
      if ( $source =~ /^Posted by (.*?) on (?!.* on )/i ) {
          print qq("$1") . "\n";
      }

      blokhead is right, and I will go lick my wounds now.

        That's the same as just being greedy:
        /^Posted by (.*) on /i

        blokhead
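
A quick way to check blokhead's observation is to run the lookahead pattern and the plain greedy pattern side by side. This is only a sketch: the helper names capture_lookahead and capture_greedy are hypothetical, and the comparison covers just the two sample strings used in this thread, not every possible input.

```perl
use strict;
use warnings;

# Sample strings taken from the posts above.
my @samples = (
    'Posted by mad max beyond eggdome on September 04, 2003',
    'Posted by getting on my nerves on September 04, 2003',
);

# Hypothetical helper: non-greedy capture plus a negative lookahead
# that rejects any ' on ' followed by a later ' on '.
sub capture_lookahead {
    my ($source) = @_;
    return $source =~ /^Posted by (.*?) on (?!.* on )/i ? $1 : undef;
}

# Hypothetical helper: greedy capture, which backtracks to the last ' on '.
sub capture_greedy {
    my ($source) = @_;
    return $source =~ /^Posted by (.*) on /i ? $1 : undef;
}

for my $source (@samples) {
    my $lazy   = capture_lookahead($source);
    my $greedy = capture_greedy($source);
    my $verdict = $lazy eq $greedy ? 'same' : 'different';
    print qq(lookahead: "$lazy"  greedy: "$greedy"  ($verdict)\n);
}
```

Both versions still split at the last ' on ', so neither helps if the text after the username can itself contain ' on '.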
