
regex question

by Samn (Monk)
on Aug 01, 2002 at 02:48 UTC (#186662=perlquestion)
Samn has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to strip HTML image tags with a regex. The image tags will never have alt tags, size attributes, or double quotes. The code I'm using is $body =~ s/\<img src=(.*)\>/\[image:\<a href=$1\>$1\<\/a\>\]/gi;
The goal is to change <img src=>
which would display as a graphic to [image: <a href=></a>]
which displays as text with a hyperlink. However, my regex does not work when there are two images in a string. I'm not exactly sure why, but I suspect it is matching from the first opening image tag all the way through the end of the last image tag. Any suggestions?
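
The suspicion above can be checked with a small sketch. The input string and the variable names are made up for illustration, and the pattern is the one from the question with the unneeded backslashes dropped:

```perl
use strict;
use warnings;

# Hypothetical input: two unquoted image tags in one string.
my $body = 'a <img src=one.gif> b <img src=two.gif> c';

# Same pattern as in the question, without the unneeded backslashes.
my ($captured) = $body =~ /<img src=(.*)>/i;

# Greedy .* first grabs the rest of the string, then backs off only
# as far as the LAST ">", so the capture spans both tags.
print "$captured\n";   # prints: one.gif> b <img src=two.gif
```

With one image per string the greedy match happens to work, which is presumably why the problem only shows up with two.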

Replies are listed 'Best First'.
Re: regex question
by Zaxo (Archbishop) on Aug 01, 2002 at 03:32 UTC

    use HTML::Parser;

    It handles maniacal markup you'll never think of in your homerolled regexen

    Update: ++mkmcconn suggested I add HTML::TokeParser to the recommendation, and I agree (I knew I was forgetting a good one)

    After Compline,
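
For readers who want to try the parser route, here is a minimal sketch using HTML::TokeParser (a CPAN module, not part of core Perl). The input string and variable names are assumptions for illustration, and this is only one way to do the rewrite, not necessarily what the poster had in mind. It replaces img start tags that carry a src attribute and passes every other token through unchanged:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical input for illustration.
my $body = 'a <img src=one.gif> b <img src=two.gif> c';

my $p = HTML::TokeParser->new(\$body) or die "cannot create parser";
my $out = '';

while (my $t = $p->get_token) {
    my $type = $t->[0];
    if ($type eq 'S' && $t->[1] eq 'img' && defined $t->[2]{src}) {
        # Start tag: [ 'S', $tag, \%attr, \@attrseq, $text ]
        my $src = $t->[2]{src};
        $out .= "[image:<a href=$src>$src</a>]";
    }
    elsif ($type eq 'S' || $type eq 'E' || $type eq 'PI') {
        # For these token types the original text is the last element.
        $out .= $t->[-1];
    }
    else {
        # Text, comment and declaration tokens keep their text at index 1.
        $out .= $t->[1];
    }
}

print "$out\n";
```

The parser lowercases tag and attribute names, so IMG and SRC in any case are handled without extra work.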

Re: regex question
by krusty (Hermit) on Aug 01, 2002 at 03:15 UTC
    $body =~ s/<img src=(.*?)>/[image:<a href=$1>$1<\/a>]/gi;
    Sounds like this might be what you're looking for.
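
A short sketch of the non-greedy fix on a made-up two-image string (the input and variable names are assumptions). It also shows a negated character class, [^>]*, as an alternative that gives the same result here and, unlike the dot, also matches a tag that is split across lines:

```perl
use strict;
use warnings;

# Hypothetical input: two unquoted image tags in one string.
my $body = 'a <img src=one.gif> b <img src=two.gif> c';

# Non-greedy .*? stops at the first ">" after each "src=".
(my $lazy = $body) =~ s/<img src=(.*?)>/[image:<a href=$1>$1<\/a>]/gi;

# Alternative: a negated class can never run past a ">".
(my $negated = $body) =~ s/<img src=([^>]*)>/[image:<a href=$1>$1<\/a>]/gi;

print "$lazy\n";
# prints: a [image:<a href=one.gif>one.gif</a>] b [image:<a href=two.gif>two.gif</a>] c
print "same result\n" if $lazy eq $negated;
```

Note the escaped slash in the closing tag: with / as the delimiter, an unescaped </a> would end the replacement early.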

Re: regex question
by Abigail-II (Bishop) on Aug 01, 2002 at 09:44 UTC
    You already identified one of the problems (and solutions have been suggested for that), but let me point out that your regex won't work either if there's whitespace between "src" and "=".
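
One way to tolerate that whitespace, sketched under the question's stated assumptions (unquoted values, src as the first attribute); the input string and variable names are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical input with whitespace around "=" and mixed case.
my $body = 'x <img src = pic.gif> y <IMG  SRC=b.gif > z';

# \s+ after "img", optional whitespace around "=", an unquoted value
# that stops at whitespace or ">", then anything else up to ">".
(my $out = $body) =~
    s/<img\s+src\s*=\s*([^\s>]+)[^>]*>/[image:<a href=$1>$1<\/a>]/gi;

print "$out\n";
# prints: x [image:<a href=pic.gif>pic.gif</a>] y [image:<a href=b.gif>b.gif</a>] z
```

This still assumes src comes first and is unquoted; markup outside those assumptions is where a real parser earns its keep.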

    BTW, HTML doesn't have alt tags. HTML has alt attributes - which have been mandatory for IMG tags for quite some time.


Re: regex question
by Samn (Monk) on Aug 01, 2002 at 02:50 UTC
    Should have used code - The regex should read $body =~ s/\<img src=(.*)\>/\[image:\<a href=$1\>$1\<\/a\>\]/gi;

Node Type: perlquestion [id://186662]
Approved by BrowserUk