replacing clickable web and email addresses

by markolus (Initiate)
on Mar 22, 2002
markolus has asked for the wisdom of the Perl Monks concerning the following question:

Hi there -- I am currently involved in trying to do two things:

1) Take input text (passed to a script as a string parameter value) that may contain email and web addresses, and replace those addresses with clickable alternatives.

i.e. a web address is to be rewritten as <a href="/cgi-bin/"></a> and an email address is to be rewritten as <a href=""></a>

To do the above I am working with both Email::Find and URI::Find, although up to now I have been using the s!(www.[^\s]+)!<a href="/cgi-bin/$1">$1</a>!gi substitution for web addresses.
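For reference, the substitution quoted above can be wrapped in a small function and tried on its own. This is only a sketch: the name linkify_web is made up for illustration, and the dot after www is escaped here, since an unescaped dot matches any character. Email addresses are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each bare "www." address in the
# /cgi-bin/ link format used in the post.
sub linkify_web {
    my ($text) = @_;
    # \. matches a literal dot; \S+ is equivalent to [^\s]+.
    $text =~ s!(www\.\S+)!<a href="/cgi-bin/$1">$1</a>!gi;
    return $text;
}

print linkify_web('Visit www.example.com today'), "\n";
```

Note that \S+ runs to the next whitespace, so trailing punctuation such as a full stop or a closing bracket ends up inside the link, and nothing is HTML-escaped. Both are among the edge cases that URI::Find is intended to deal with.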

Now the bit that is causing a bit of a headache is doing it the other way round.

i.e. converting the clickable links shown above back to their original text. Do I need a weird and wonderful regex and substitution, or is there a module that will make my life easy? I need to be able to reproduce my HTML link format and not just have a simple target=_blank href.

Any ideas?

Re: replacing clickable web and email addresses
on Mar 22, 2002
    Try HTML::Parser. You should be able to pull the anchors out of your document and replace them with whatever you'd like.
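A minimal sketch of that suggestion, assuming the version 3 handler API of HTML::Parser and that each link should simply be replaced by its own link text. The function name unlink_anchors is made up for illustration. Every anchor start and end tag is dropped, and all other text and markup is copied through untouched.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: replace every <a ...>label</a> with just its
# label, leaving everything else in the document as it was.
sub unlink_anchors {
    my ($html) = @_;
    my $out = '';

    # Copy a tag through verbatim unless it is an anchor tag.
    my $keep_unless_anchor = sub {
        my ($tagname, $text) = @_;
        $out .= $text unless $tagname eq 'a';
    };

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [ $keep_unless_anchor, 'tagname, text' ],
        end_h       => [ $keep_unless_anchor, 'tagname, text' ],
        # Text, comments and anything else: copy the source text as is.
        default_h   => [ sub { my ($text) = @_; $out .= $text if defined $text }, 'text' ],
    );
    $parser->parse($html);
    $parser->eof;    # flush any text the parser is still holding
    return $out;
}

print unlink_anchors('See <a href="/cgi-bin/www.example.com">www.example.com</a> now.'), "\n";
```

This removes all anchors, not only the ones your own script generated. To be selective, the start handler could also request 'attr' in its argspec and inspect the href, at the cost of tracking which end tags belong to the anchors that were dropped.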

Re: replacing clickable web and email addresses
on Mar 22, 2002
    Try HTML::Parser. Using regexes to parse HTML is a nightmare, as many will attest, and HTML::Parser is usually the de facto recommendation around here when you've got HTML to munge.


