
Regex question

by ultranerds (Friar)
on Sep 22, 2011 at 15:03 UTC ( #927387=perlquestion )
ultranerds has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys,

First of all, lemme show you some example values I need to match:
[url][/url] [img][/img] [url]name of link[/url] [citation]some quote here[/quote]
Now, I have this regex:

while ($_[0] =~ /\[(\/?)(lien|url|citation|img).+?\]/sg) { print "GOT: $1 and $2 \n"; }
...which is meant to capture the / (if it exists, after the [) into $1, and then capture the tag name itself into $2.

However, it doesn't seem to work (it only picks up the ones without the / in the tag).

If I remove the .+? from the regex, so we have:

while ($_[0] =~ /\[\/?(lien|url|citation|img)\]/sg) { print "BLA: $1 \n"; }

...it works, but obviously doesn't pick up stuff like:

[url]some text[/url]

Anyone got any suggestions?

UPDATE: I think I've worked it out - I needed .*? instead of .+? , cos .*? matches zero or more characters, whereas .+? needs at least one.
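
To make the difference concrete, here is a minimal sketch that runs both patterns against a few of the sample values from the post. The helper name find_tags and the "slash:tag" output format are made up for illustration and are not part of the original code. It shows that with .+? the one mandatory character swallows the ] of the opening tag, so the match runs on to the ] of the closing tag and the closing tag never gets matched on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns one
# "slash:tag" string per match, built from $1 and $2.
sub find_tags {
    my ($re, $text) = @_;
    my @found;
    while ($text =~ /$re/g) {
        push @found, "$1:$2";
    }
    return @found;
}

# A subset of the sample values from the question.
my $text = '[url][/url] [img][/img] [url]name of link[/url]';

my $plus = qr/\[(\/?)(lien|url|citation|img).+?\]/s;   # original pattern
my $star = qr/\[(\/?)(lien|url|citation|img).*?\]/s;   # pattern from the UPDATE

print "with .+? : ", join(' ', find_tags($plus, $text)), "\n";
# with .+? : :url :img :url
print "with .*? : ", join(' ', find_tags($star, $text)), "\n";
# with .*? : :url /:url :img /:img :url /:url
```

With .+? each opening tag, its contents, and its closing tag are consumed as a single match, which is why only the tags without the / were reported.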



Re: Regex question
by moritz (Cardinal) on Sep 22, 2011 at 15:18 UTC
Re: Regex question
by Anonymous Monk on Sep 22, 2011 at 15:12 UTC

    Anyone got any suggestions?

    :) Use a module already? ;)

    This is what you have

    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain
      ->new( qr/\[(\/?)(lien|url|citation|img).+?\]/s )->explain;
    __END__
    The regular expression:

    (?s-imx:\[(/?)(lien|url|citation|img).+?\])

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?s-imx:                 group, but do not capture (with . matching
                             \n) (case-sensitive) (with ^ and $ matching
                             normally) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \[                       '['
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        /?                       '/' (optional (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      (                        group and capture to \2:
    ----------------------------------------------------------------------
        lien                     'lien'
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        url                      'url'
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        citation                 'citation'
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        img                      'img'
    ----------------------------------------------------------------------
      )                        end of \2
    ----------------------------------------------------------------------
      .+?                      any character (1 or more times (matching
                               the least amount possible))
    ----------------------------------------------------------------------
      \]                       ']'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    What do you think is wrong with your regex?

      Yeah, this part ;)
      .+? any character (1 or more times (matching the least amount possible))
      As I noted in my update above, I changed it to .*? and it works perfectly :)

Node Type: perlquestion [id://927387]
Approved by toolic