PerlMonks  

Change URIs in Text to HTML-Links

by strat (Canon)
on Nov 07, 2002 at 11:41 UTC ( #211046=perltutorial )

One way to change URIs in Text to HTML-Links

Use the module URI::Find or URI::Find::Schemeless, e.g.:

<Update>
Added encode_entities in the following code because of merlyn's answer (Thank you very much!)
</Update>

#! /usr/bin/perl
use strict;
use warnings;

use URI::Find::Schemeless;
use HTML::Entities qw(encode_entities);  # changed

my $text = q~
hello
this is no.url
this is an url: www.fabiani.net
ftp.anything.de/test/thisfile
mailto:martin@fabiani.net
or the like
yeah martin@fabiani.net
http://www.fabiani.net/
~;

# create a new URI::Find::Schemeless object and pass as callback
# the function that is called for each URI found
my $finder = URI::Find::Schemeless->new(
    sub {
        my ($uri, $originalUri) = @_;

        # error: encode_entities is missing
        # return qq~<a href="$uri" target="_newpage">$originalUri</a>~;
        return q{<a href="} . encode_entities("$uri") . q{">}
             . encode_entities($originalUri) . q{</a>};
    }
);

# here starts the search (and in our case the replacement):
my $howManyFound = $finder->find(\$text);

# let's have a look at the result
print "$howManyFound URIs found\n";
print "$text\n";

This will replace the following URIs:
  • www.fabiani.net
  • ftp.anything.de/test/thisfile
  • mailto:martin@fabiani.net
  • http://www.fabiani.net/
If you just want to replace the following URIs, use URI::Find, which is more strict:
  • mailto:martin@fabiani.net
  • http://www.fabiani.net/
You can do this by killing ::Schemeless:
use URI::Find;  # instead of URI::Find::Schemeless
...
my $finder = URI::Find->new(  # instead of URI::Find::Schemeless
    sub {
        my ($uri, $originalUri) = @_;

        # error: encode_entities is missing
        # return qq~<a href="$uri" target="_newpage">$originalUri</a>~;
        return q{<a href="} . encode_entities("$uri") . q{">}
             . encode_entities($originalUri) . q{</a>};
    }
);
...

If you use the commented-out variant with target="_newpage" and don't want the links to open in a new browser window, just remove target="_newpage".

It is a shame that these modules are not part of the standard Perl distribution, but I hope they soon will be.

If your provider hasn't installed them and doesn't want to, just copy the directories of URI to your web path, e.g. to cgi-bin/lib, and load them from your CGI scripts located in cgi-bin with the modules FindBin and lib:

BEGIN {
    use FindBin qw($Bin);
    use lib "$Bin/lib";
}
use URI::Find::Schemeless;

Big thanks to mdupont for pointing me to the new interface of URI::Find (I had been working with find_uris for a long time, with a piece of code that was much too complicated).

Best regards,

strat

Replies are listed 'Best First'.
•Re: Change URIs in Text to HTML-Links
by merlyn (Sage) on Nov 07, 2002 at 16:37 UTC
    You left out the mandatory "escape HTML entity" calls. Shame on you.
    use HTML::Entities qw(encode_entities);
    use URI::Find;
    ...
    my $finder = URI::Find->new(  # instead of URI::Find::Schemeless
        sub {
            my ($uri, $originalUri) = @_;
            return join "",
                q{<a href="}, encode_entities("$uri"), q{">},
                encode_entities($originalUri), q{</a>};
        }
    );
    ...
    Yes, I've already complained to the author. He "fixed" the manpage incorrectly, by escaping the entire source text first. That breaks in-page URLs that look like http://example.com/foo/bar?a=b&c=d. {sigh} I was too worn out explaining to him why that was wrong to submit another fix.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.
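
The ordering problem described in this reply can be sketched with core Perl only. This is an illustration under loud assumptions, not the module's behaviour: escape_html is a hypothetical, simplified stand-in for encode_entities, and $uri_re is a deliberately naive pattern standing in for URI::Find. It shows one way "escape the whole text first" can go wrong when the callback also escapes, as the callbacks above do, and how escaping each piece exactly once avoids it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for HTML::Entities::encode_entities:
# escapes only the four characters that matter in this sketch.
sub escape_html {
    my ($s) = @_;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $s =~ s/([&<>"])/$ent{$1}/g;
    return $s;
}

# Deliberately naive URI pattern (an assumption for illustration only);
# URI::Find does this job far more carefully.
my $uri_re = qr{https?://[^\s<>"]+};

# Build the anchor, escaping the URI for both the attribute and the link text.
sub make_link {
    my ($uri) = @_;
    my $escaped = escape_html($uri);
    return qq{<a href="$escaped">$escaped</a>};
}

# Problematic order: escape everything first, then look for URIs.
# The "&" inside the URI gets escaped a second time in make_link.
sub linkify_escape_first {
    my ($text) = @_;
    $text = escape_html($text);
    $text =~ s/($uri_re)/make_link($1)/ge;
    return $text;
}

# Safer order: split into URI and non-URI pieces, escape each exactly once.
sub linkify_piecewise {
    my ($text) = @_;
    my @pieces = split /($uri_re)/, $text;
    return join '',
        map { /\A$uri_re\z/ ? make_link($_) : escape_html($_) } @pieces;
}

my $text = 'see http://example.com/foo/bar?a=b&c=d <now>';
print linkify_escape_first($text), "\n";
print linkify_piecewise($text),    "\n";
```

With the escape-first variant the href contains a=b&amp;amp;c=d, which a browser decodes to the wrong URL; the piecewise variant yields a=b&amp;c=d and still escapes the surrounding <now>.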

Re: Change URIs in Text to HTML-Links
by Aristotle (Chancellor) on Nov 08, 2002 at 01:30 UTC
Re: Change URIs in Text to HTML-Links
by Anonymous Monk on Jan 01, 2004 at 14:44 UTC
    The problem with URI::Find is that it also locates URIs inside href attributes. URI::Find is all right if you're not allowing HTML to be parsed.
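
One possible workaround for input that already contains HTML, sketched with core Perl only: split the text on tags and linkify only the pieces between them. The sub name linkify_outside_tags and the naive $uri_re pattern are assumptions made for this illustration and are not part of URI::Find. Because the input is taken to be HTML already, the sketch does no entity escaping of its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deliberately naive URI pattern (an assumption for illustration only).
my $uri_re = qr{https?://[^\s<>"]+};

# Hypothetical helper: linkify only text that lies outside of tags.
sub linkify_outside_tags {
    my ($html) = @_;

    # The capturing group makes split return the tags as separate pieces.
    my @pieces = split /(<[^>]*>)/, $html;

    for my $piece (@pieces) {
        next if $piece =~ /\A</;    # leave tags, including href="...", alone
        $piece =~ s{($uri_re)}{<a href="$1">$1</a>}g;
    }
    return join '', @pieces;
}

my $html = '<a href="http://example.com/">home</a> and http://example.org/x';
print linkify_outside_tags($html), "\n";
```

This is only a sketch: a URI that appears as the link text of an existing anchor would still be wrapped a second time, and a real HTML parser is the more robust choice for anything beyond simple input.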
