PerlMonks  

Can't figure out how to invert this regex

by Anonymous Monk
on Nov 25, 2005 at 14:35 UTC (#511657)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I have a regex dilemma. The following regex will replace HTML entities with the word REPLACED:

s/&.{2,10}?;/REPLACED/g;
However, what I want to do is invert that, so that it replaces & symbols that are NOT part of an HTML entity with the word REPLACED:

s/(?<!.{2,10}?;)&/REPLACED/g;
returns an error

/(?<!.{2,10}?;)&/: variable length lookbehind not implemented

Where did I derail?

Replies are listed 'Best First'.
Re: Can't figure out how to invert this regex
by merlyn (Sage) on Nov 25, 2005 at 14:45 UTC
    If I recall, an entity can be only alphanums, so this should do it:
    s/&(?![A-Za-z0-9]+;)/REPLACED/g;
    Literally: an ampersand not followed by (1 or more alphanums followed by a semicolon).

    If that's not the correct definition of an entity, season to taste.

    Not sure why you were looking to the left to determine ampersandy-ness. The interesting part is to the right.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.


    update: Ahh yes, the pound form. And the #x form. OK, try this:
    s/&(?!(?:[A-Za-z0-9]+|#\d+|#x[0-9A-Fa-f]+);)/REPLACED/g;
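A minimal sketch of how the updated substitution behaves on a few inputs. The helper name `replace_bare_ampersands` and the sample strings are assumptions made for illustration; only the regex itself comes from the post above.

```perl
use strict;
use warnings;

# Hypothetical helper; the regex is the one quoted in the update above.
sub replace_bare_ampersands {
    my ($text) = @_;
    # Replace an ampersand unless it is followed by a named, decimal,
    # or hex entity body and a closing semicolon.
    $text =~ s/&(?!(?:[A-Za-z0-9]+|#\d+|#x[0-9A-Fa-f]+);)/REPLACED/g;
    return $text;
}

my @samples = (
    'Tom & Jerry',    # bare ampersand followed by a space
    'AT&T',           # letters follow, but no semicolon
    '&amp; &lt;',     # named entities
    '&#123;',         # decimal form
    '&#x1F;',         # hex form
);

print replace_bare_ampersands($_), "\n" for @samples;
```

Under these assumptions, only the first two samples should change; the three entity forms should pass through untouched.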
      Actually, an entity can also have the form &#123;, so the pattern would need to allow a pound sign as well as alphanumerics.
      "Not sure why you were looking to the left to determine ampersandy-ness."

      Just a mistake out of my own inexperience with regexes. I didn't really understand where it was looking. Now I get it.
Re: Can't figure out how to invert this regex
by anniyan (Monk) on Nov 25, 2005 at 14:41 UTC

    You are using a variable-length pattern inside a negative look-behind. According to the Perl regex documentation (perlre), variable-length patterns are not allowed in a negative look-behind. They are allowed in positive and negative look-ahead.

    Updated: Thanks, sauoq, and apologies for the mistake.

    Regards,
    Anniyan
    (CREATED in HELL by DEVIL to s|EVILS|GOODS|g in WORLD)
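The look-behind versus look-ahead distinction can be checked by compiling patterns at run time. This is only a sketch: `compiles_ok` is a hypothetical helper, and the three patterns are made up for illustration. The look-behind example uses an unbounded quantifier (`a+`) on purpose, since Perl 5.30 and later experimentally accept some bounded variable-length look-behinds, so a bounded one may not fail on a newer perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if Perl can compile the pattern, 0 otherwise.
sub compiles_ok {
    my ($pattern) = @_;
    # qr// on an interpolated string compiles at run time, so a bad
    # pattern dies inside the eval instead of aborting the script.
    my $re = eval { qr/$pattern/ };
    return defined $re ? 1 : 0;
}

for my $pattern (
    '(?<!ab)c',    # fixed-length look-behind
    '(?<!a+)c',    # unbounded variable-length look-behind
    'c(?!a+;)',    # variable-length look-ahead
) {
    printf "%-10s %s\n", $pattern, compiles_ok($pattern) ? 'compiles' : 'rejected';
}
```

The expectation is that only the middle pattern is rejected, which is the same class of error the original question ran into.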

      Rather we can use variable length in positive look behind

      No, that's incorrect. Variable length patterns are supported in neither negative nor positive look-behind.

      -sauoq
      "My two cents aren't worth a dime.";
      
      Ah... so:

      s/&(?!.{2,10}?;)/REPLACED/g;
      gives me what I need. Thanks.
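One caveat worth checking before settling on the `.{2,10}?` look-ahead: because `.` matches any character, an ordinary ampersand that happens to be followed by a semicolon within the next 11 characters is treated as an entity. The sketch below compares it with the stricter pattern from merlyn's update; both helper names and the sample string are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers; both regexes are quoted from the thread.
sub any_chars_version {
    my ($text) = @_;
    $text =~ s/&(?!.{2,10}?;)/REPLACED/g;
    return $text;
}

sub entity_shaped_version {
    my ($text) = @_;
    $text =~ s/&(?!(?:[A-Za-z0-9]+|#\d+|#x[0-9A-Fa-f]+);)/REPLACED/g;
    return $text;
}

# The first ampersand is bare, but " chips;" follows it closely.
my $text = 'fish & chips; &amp; more';

print any_chars_version($text),     "\n";
print entity_shaped_version($text), "\n";
```

With this input the first version should leave the bare ampersand alone, while the second should replace it and keep `&amp;` intact.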
