
Regex not quite working- [b] tags

by ultranerds (Friar)
on Sep 20, 2011 at 10:49 UTC (#926897)
ultranerds has asked for the wisdom of the Perl Monks concerning the following question:


I'm trying to get this code working, which will grab all the contents between [b] and [/b], so that we can check the length of those values (basically we want to check whether they have abused the "bold" feature on the site ;))

At the moment I have:


This works to an extent, but I think there is something wrong with my regex, as it seems to not be doing quite what I'm after. Could someone please point out where I'm going wrong with this?



Replies are listed 'Best First'.
Re: Regex not quite working- [b] tags
by Anonymous Monk on Sep 20, 2011 at 11:11 UTC

    This is what you have

    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new( qr/\[b\]([^\[\/b\]]+)/ )->explain;
    __END__
    The regular expression:

    (?-imsx:\[b\]([^\[/b\]]+))

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \[                       '['
    ----------------------------------------------------------------------
      b                        'b'
    ----------------------------------------------------------------------
      \]                       ']'
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        [^\[/b\]]+               any character except: '\[', '/', 'b',
                                 '\]' (1 or more times (matching the most
                                 amount possible))
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    This is what you want

    m{ \[b\] ( [^\[]+ ) \[/b\] }sixgs

    .+? will work just fine in place of [^\[]+
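A small sketch of the difference, in case it helps. The sample string is made up for illustration; the two patterns are the ones quoted above. The bracketed part of the original pattern is a character class, so it rejects the single characters '[', '/', 'b' and ']' rather than the sequence [/b], and the capture stops at the first 'b' in the text.

```perl
use strict;
use warnings;

# Hypothetical sample text, invented for this illustration.
my $text = 'intro [b]very bold text[/b] outro';

# Original pattern: the character class excludes '[', '/', 'b' and ']',
# so the capture ends just before the 'b' of "bold".
my ($class_capture) = $text =~ /\[b\]([^\[\/b\]]+)/;

# Suggested pattern: capture everything up to the next '[',
# then require a literal [/b].
my ($fixed_capture) = $text =~ m{ \[b\] ( [^\[]+ ) \[/b\] }six;

print "character class: '", ($class_capture // ''), "'\n";
print "fixed pattern:   '", ($fixed_capture // ''), "'\n";
```

With this input the first capture is 'very ' and the second is 'very bold text'.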

Re: Regex not quite working- [b] tags
by pvaldes (Chaplain) on Sep 20, 2011 at 11:23 UTC
    grab all the contents between [b] and [/b]

    If I have understood the idea correctly, this is

    / \[b\]     # match a [b]
      .*?       # match anything (non greedy)
      \[\/b\]   # match the next closing tag
    /xg

    maybe a problem with the greedy nature of your regex?
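To see what greediness changes, here is a minimal sketch with a made-up post containing two bold sections. The variable names and the sample string are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical post with two separate bold sections.
my $post = '[b]one[/b] plain [b]two[/b]';

# Greedy .* runs to the last [/b], so both sections merge into one capture.
my @greedy = $post =~ /\[b\](.*)\[\/b\]/g;

# Non-greedy .*? stops at the first [/b] it can reach,
# giving one capture per tag pair.
my @lazy = $post =~ /\[b\](.*?)\[\/b\]/g;

print scalar(@greedy), " greedy capture(s)\n";
print scalar(@lazy), " non-greedy capture(s)\n";
```

The greedy version yields a single capture, 'one[/b] plain [b]two', while the non-greedy version yields 'one' and 'two' separately.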

      Thanks guys, I tried something like this before, but I must have been missing the ? ... doh!

      while ($post_message =~ /\[b\](.+?)\[\/b\]/sg) {
          print "BOLD: $1 \n\n";
          $bold_length += length($1);
      }
      Working like a charm now - thanks!
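For the "abuse" check itself, one possible way to wrap the loop above is sketched below. The sub names bold_length and abuses_bold and the 0.5 ratio are hypothetical choices, not something from the thread; the regex is the working one from the reply.

```perl
use strict;
use warnings;

# Sum the length of all text found inside [b]...[/b] pairs.
sub bold_length {
    my ($post_message) = @_;
    my $bold_length = 0;
    while ($post_message =~ /\[b\](.+?)\[\/b\]/sg) {
        $bold_length += length($1);
    }
    return $bold_length;
}

# Hypothetical rule: flag a post when more than a given share of it is bold.
sub abuses_bold {
    my ($post_message, $max_ratio) = @_;
    $max_ratio //= 0.5;    # assumed threshold, not from the thread
    return 0 unless length $post_message;
    return bold_length($post_message) / length($post_message) > $max_ratio ? 1 : 0;
}

my $post = "[b]BUY NOW[/b] [b]CHEAP[/b] ok";
printf "bold chars: %d, abuse: %s\n",
    bold_length($post), abuses_bold($post) ? 'yes' : 'no';
```

Note that the ratio here counts the tag characters in the total length, and the match is case-sensitive, so [B] would not be counted unless /i is added.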

