regex question

by Samn (Monk)
on Aug 01, 2002 at 02:48 UTC (#186662)
Samn has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to strip HTML image tags with a regex. The image tags will necessarily not have alt tags, size, or double quotes. The code I'm using is $body =~ s/\<img src=(.*)\>/\[image:\<a href=$1\>$1\<\/a\>\]/gi;
The goal is to change <img src=>
which would display as a graphic to [image: <a href=></a>]
which displays as text with a hyperlink. My regex does not work when there are two images in one string, however. I'm not exactly sure why, but I suspect it is matching from the start of the first image tag through to the end of the last one. Any suggestions?

Replies are listed 'Best First'.
Re: regex question
by Zaxo (Archbishop) on Aug 01, 2002 at 03:32 UTC

    use HTML::Parser;

    It handles maniacal markup you'll never think of in your home-rolled regexen.

    Update: ++mkmcconn suggested I add HTML::TokeParser to the recommendation, and I agree (I knew I was forgetting a good one)

    After Compline,
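For anyone who wants to see what the parser route might look like, here is a minimal sketch using HTML::TokeParser. It assumes the HTML-Parser distribution is installed from CPAN. The helper name images_to_links and the quoted href in the output are choices made for this illustration, not something from the thread.

```perl
use strict;
use warnings;
use HTML::TokeParser;                    # CPAN: part of the HTML-Parser distribution
use HTML::Entities qw(encode_entities);  # ships in the same distribution

# Hypothetical helper: replace every <img> tag with a bracketed text link
# and pass all other markup through unchanged.
sub images_to_links {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "cannot set up parser";
    my $out = '';
    while (my $token = $parser->get_token) {
        my $type = $token->[0];
        if ($type eq 'S' and $token->[1] eq 'img') {
            # Start tag token: [ 'S', $tag, \%attr, \@attrseq, $text ]
            my $src = $token->[2]{src};
            $src = '' unless defined $src;
            my $safe = encode_entities($src);
            $out .= qq{[image:<a href="$safe">$safe</a>]};
        }
        else {
            # Text tokens keep their source text in slot 1; every other
            # token type keeps it in the last slot.
            $out .= $type eq 'T' ? $token->[1] : $token->[-1];
        }
    }
    return $out;
}

print images_to_links('<p>A <img src=a.gif> and <IMG SRC="b.png" alt="B"></p>'), "\n";
```

Because the parser lowercases tag and attribute names and copes with quoted or unquoted values, the code never has to guess at the exact shape of the tag.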

Re: regex question
by krusty (Hermit) on Aug 01, 2002 at 03:15 UTC
    $body =~ s/<img src=(.*?)>/[image:<a href=$1>$1<\/a>]/gi;
    Sounds like this might be what you're looking for.
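To see why the non-greedy quantifier above helps, here is a small sketch that runs the greedy and non-greedy patterns over a string containing two image tags. The sample string and the shortened replacement text are made up for illustration. The third variant, a negated character class, is a common alternative that was not suggested in the thread.

```perl
use strict;
use warnings;

my $html = 'x <img src=a.gif> y <img src=b.gif> z';

# Greedy: (.*) runs to the end of the string, then backs up to the LAST '>'.
(my $greedy = $html) =~ s/<img src=(.*)>/[$1]/gi;

# Non-greedy: (.*?) stops at the FIRST '>' that lets the match succeed.
(my $lazy = $html) =~ s/<img src=(.*?)>/[$1]/gi;

# Negated class: ([^>]*) cannot cross a '>' at all.
(my $negated = $html) =~ s/<img src=([^>]*)>/[$1]/gi;

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "negated: $negated\n";
```

The greedy version swallows everything between the first "src=" and the final ">", which matches the behaviour described in the question.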

Re: regex question
by Abigail-II (Bishop) on Aug 01, 2002 at 09:44 UTC
    You already identified one of the problems (and solutions have been suggested for that), but let me point out that your regex won't work either if there's whitespace between "src" and "=".

    BTW, HTML doesn't have alt tags. HTML has alt attributes - which have been mandatory for IMG tags for quite some time.
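The whitespace point could be handled along these lines. This is only a sketch: the helper name link_images is invented here, and the tolerance for optional quotes and for other attributes (such as alt) goes beyond what the original question asked for. It will still trip over a ">" inside an attribute value, which is where a real parser earns its keep.

```perl
use strict;
use warnings;

# Hypothetical helper; the pattern is an illustration, not a full HTML parser.
sub link_images {
    my ($body) = @_;
    $body =~ s{
        <img \b            # tag name
        [^>]*?             # any earlier attributes
        \s src \s* = \s*   # src, with optional whitespace around the equals sign
        (["']?)            # optional opening quote
        ([^"'\s>]+)        # the URL itself
        \1                 # matching closing quote, if one was opened
        [^>]*              # any later attributes
        >
    }{[image:<a href=$2>$2</a>]}gix;
    return $body;
}

print link_images('<img src = a.gif> and <IMG ALT="pic" SRC="b.png" width=10>'), "\n";
```

Using s{...}{...} instead of s/.../.../ means the "/" in the closing anchor tag needs no escaping.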


Re: regex question
by Samn (Monk) on Aug 01, 2002 at 02:50 UTC
    Should have used code - The regex should read $body =~ s/\<img src=(.*)\>/\[image:\<a href=$1\>$1\<\/a\>\]/gi;
