
Regex question

by ultranerds (Friar)
on Sep 22, 2011 at 15:03 UTC
ultranerds has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys,

First of all, lemme show you some example values I need to match:
[url][/url] [img][/img] [url]name of link[/url] [citation]some quote here[/quote]
Now, I have this regex:

while ($_[0] =~ /\[(\/?)(lien|url|citation|img).+?\]/sg) { print "GOT: $1 and $2 \n"; }
...which is meant to capture the / (if it exists after the [) into $1, and the tag name itself into $2.

However, it doesn't seem to work: it only picks up the tags without the / in them.

If I change the regex to remove .+? , so we have:

while ($_[0] =~ /\[\/?(lien|url|citation|img)\]/sg) { print "BLA: $1 \n"; }

...it works, but obviously doesn't pick up stuff like:

[url]some text[/url]

Anyone got any suggestions?

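UPDATE: I think I've worked it out. I needed .*? instead of .+? because .*? matches 0 or more characters, whereas .+? needs at least 1. With .+? the ] that closes the opening tag gets eaten as that one required character, so the match runs on to the ] of the closing tag and swallows it.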
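For anyone who wants to see the difference side by side, here is a minimal sketch. It assumes the sample string from the top of the post is the input; the helper name find_tags and the "slash:tag" output format are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: run a regex over the text and return one
# "slash:tag" string per match ($1 is the optional /, $2 the tag name).
sub find_tags {
    my ($re, $text) = @_;
    my @found;
    while ($text =~ /$re/g) {
        push @found, "$1:$2";
    }
    return @found;
}

my $text = '[url][/url] [img][/img] [url]name of link[/url] '
         . '[citation]some quote here[/quote]';

# Original pattern: .+? must consume at least one character after the name.
my @plus = find_tags(qr/\[(\/?)(lien|url|citation|img).+?\]/s, $text);

# Updated pattern: .*? may consume nothing, so "[url]" matches on its own.
my @star = find_tags(qr/\[(\/?)(lien|url|citation|img).*?\]/s, $text);

print "with .+? : @plus\n";
print "with .*? : @star\n";
```

With .+? each match runs from an opening tag to the end of the next closing tag, so only four opening tags are reported. With .*? the three closing [/url] and [/img] tags show up as well; [/quote] is never reported by either pattern because quote is not in the alternation.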



Replies are listed 'Best First'.
Re: Regex question
by moritz (Cardinal) on Sep 22, 2011 at 15:18 UTC
Re: Regex question
by Anonymous Monk on Sep 22, 2011 at 15:12 UTC

    Anyone got any suggestions?

    :) Use a module already? ;)

    This is what you have

    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain
        ->new( qr/\[(\/?)(lien|url|citation|img).+?\]/s )
        ->explain;
    __END__
    The regular expression:

    (?s-imx:\[(/?)(lien|url|citation|img).+?\])

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?s-imx:                 group, but do not capture (with . matching
                             \n) (case-sensitive) (with ^ and $ matching
                             normally) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \[                       '['
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        /?                       '/' (optional (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      (                        group and capture to \2:
    ----------------------------------------------------------------------
        lien                     'lien'
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        url                      'url'
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        citation                 'citation'
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        img                      'img'
    ----------------------------------------------------------------------
      )                        end of \2
    ----------------------------------------------------------------------
      .+?                      any character (1 or more times (matching
                               the least amount possible))
    ----------------------------------------------------------------------
      \]                       ']'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    What do you think is wrong with your regex?

      Yeah, this part ;)
      .+? any character (1 or more times (matching the least amount possible))
      As I noted in the update to my question above, I changed it to .*? and it works perfectly :)
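      If the part after the tag name is meant to allow attributes, one possible tightening (not something anyone in this thread proposed) is to swap .*? for a negated character class, so the match can never run past a bracket. A rough sketch, assuming tags may look like [url=...]; the sample string and the hash keys are invented for illustration:

```perl
use strict;
use warnings;

# Assumed input: an opening tag with an attribute, then its closing tag.
my $text = '[url=http://example.com]name of link[/url]';

my @tags;
# \b stops "url" from matching a longer name such as "urlfoo";
# [^\[\]]* takes anything up to the closing ] but never crosses a bracket.
while ($text =~ /\[(\/?)(lien|url|citation|img)\b([^\[\]]*)\]/g) {
    push @tags, { closing => ($1 ? 1 : 0), name => $2, rest => $3 };
}

for my $tag (@tags) {
    printf "%s tag '%s', rest '%s'\n",
        ($tag->{closing} ? 'closing' : 'opening'), $tag->{name}, $tag->{rest};
}
```

      The /s modifier is dropped here because the pattern no longer uses the dot at all.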
