Beefy Boxes and Bandwidth Generously Provided by pair Networks RobOMonk
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Adding href where needed.

by electronicMacks (Beadle)
on Oct 30, 2000 at 23:37 UTC ( [id://39217]=perlquestion: print w/replies, xml ) Need Help??

This is an archived low-energy page for bots and other anonmyous visitors. Please sign up if you are a human and want to interact.

electronicMacks has asked for the wisdom of the Perl Monks concerning the following question:

I’m maintaining a program that (among other things) reads a users input and, if necessary, inserts the appropriate <a href= > tag. Currently it uses the following regex:

s/((http|ftp|news):\/\/\S*)/<a href=”$1\””>$1<\/a>/g;

The main problem is that this regex will place tags around all urls. This creates problems if the user has already hand coded the tags in their input, such as <a href = http://www.perlmonks.org>Click here</a>

Can any Monks suggest a regular expression that will insert href tags around naked urls, but leave urls within tags alone?

Replies are listed 'Best First'.
Re: Adding href where needed.
by runrig (Abbot) on Oct 30, 2000 at 23:53 UTC
    You can try a negative lookbehind assertion (in perlman:perlre), but it has to be a fixed length pattern, and so won't work if there's spaces around the '=' (so you can first fix that):
    s/href\s*=\s*/href=/ig; s,(?<!(?i:href=))((?:http|ftp|news)://\S*),<a href="$1">$1</a>,g; You might try to think of something besides '\S*' also, in case the user puts in a period after the href, or puts it between parenthesis or ???.
    Update: Actually, if the user DID already make an href, they may or may not have put quotes around the '=' in the url, so maybe the easiest thing to do is rewrite all urls:
    # Now its starting to get ugly! $str =~ s#((?i:<a\s+href\s*=\s*"?)?(?:http|ftp|news)://[^"]+)#substr($ +1,0,1) eq '<' ? $1 : qq!<a href="$1" >$1</a>!#eg;
    I think that's right now (and you still may want to replace that '[^"]' with something else) :-)

    Actually, even that may not work if there's other attributes besides 'href' in the tag, but I'm done. The rest is up to you :-)
Re: Adding href where needed.
by FouRPlaY (Monk) on Oct 31, 2000 at 00:28 UTC
    How about just adding (?:<a href = \")? and (?:<\/a>)? at the begining and end (respectively) of your regexp?

    That's the best suggestion I can give with my Perl knowledge.



    FouRPlaY
    Learning Perl or Going To Die Trying
Re: Adding href where needed.
by moen (Hermit) on Oct 31, 2000 at 00:29 UTC
    You should maybe look into URI::Heuristic for completing your links. Something like this:
    #!/usr/bin/perl use strict; use URI::Heuristic qw(uf_uristr); my $url = uf_uristr("www.perlmonks.com"); print "<a href=$url>$url</a>\n";
    If the user input is www.perlmonks.com or http://www.perlmonks.com it will give you the proper URI. In this case with html code:
    <a href=http://www.perlmonks.com>http://www.perlmonks.com</a>
    using www.perlmonks.com as input.
    It also does ftp and others.

    ok..it's a bit off what you really asked for but i just noted this module at CPAN and liked it ;o)

    Update:
    #!/usr/bin/perl use URI::Heuristic qw(uf_uristr); use strict; my $url; my $input = "<a href = http://www.perlmonks.com>perlmonks</a>"; #my $input = "www.perlmonks.com"; if ($input =~ /^</) { $input =~ /<a\s+href\s*=\s*(.*)>.*<\/a>/i; $url = uf_uristr("$1"); } else { $url = uf_uristr("$input"); } print "<a href=$url>$url</a>";
    This works just fine, also with resolving the url to an uri. You may also drag out the text by enclosing the last .* and drag out the value of $2.
    The regex should work on it's own without using the module also, me thinks.
Re: Adding href where needed.
by Anonymous Monk on Mar 26, 2002 at 10:42 UTC
    my $newcontent = _translateLinks2($initcontent); # sub _translateLinks2 { my $content = shift; my @all = split(/ /,$content); my $skip = 0; my @trans; for my $a (@all) { # print "{ $a }\n"; if ($a =~ m#<a#i) { $skip = 1; } # print "skip : $skip\n"; if ($skip == 0 && $a =~ m#http#i) { if ($a =~ m#^(\n|\r)#) { $a =~ s/\n|\r//gis; $a = qq(\n<a href="$a">$a</a>); } elsif ($a =~ m#(\n|\r)$#) { $a =~ s/\n|\r//gis; $a = qq(<a href="$a">$a</a>\n); } else { $a = qq(<a href="$a">$a</a>); } } elsif ($skip == 0 && $a =~ m#\w+\.com|\w+\.net|\w+\.org|\w+\.gov#i +) { if ($a =~ m#^(\n|\r)#) { $a =~ s/\n|\r//gis; $a = qq(\n<a href="http://$a">http://$a</a>); } elsif ($a =~ m#(\n|\r)$#) { $a =~ s/\n|\r//gis; $a = qq(<a href="http://$a">http://$a</a>\n); } else { $a = qq(<a href="http://$a">http://$a</a>); } } # print "[[ $a ]]\n"; if ($a =~ m#<\/a>#i) { $skip = 0; } push (@trans,$a); } my $translated = join (' ',@trans); return $translated; }

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://39217]
Approved by root
help
Sections?
Information?
Find Nodes?
Leftovers?
    Notices?
    hippoepoptai's answer Re: how do I set a cookie and redirect was blessed by hippo!
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.