Re: regex to identify http:// in html

in reply to regex to identify http:// in html

For a quick hack, s{(http://\S+?)(\s+)}{<a href="$1">$1</a>$2} will do what you ask. The important thing to note is the \S+?, which makes the quantifier non-greedy: it matches the minimum amount required for the regex to succeed, rather than the maximum, which is what \S+ or .* would do. I've also used \S (any non-whitespace character) because it's best to avoid . where you can: see death to dot star.
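To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The sample text and variable names are made up for illustration, and the /g flag is an addition that isn't in the one-liner above.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the original question.
my $text = "see http://example.com/a and http://example.com/b for details";

# The quick hack from above; /g is added here so every URL is handled.
(my $linked = $text) =~ s{(http://\S+?)(\s+)}{<a href="$1">$1</a>$2}g;
print "$linked\n";

# For contrast: a greedy dot runs on to the last whitespace in the string,
# while the non-greedy form stops at the first one.
my ($greedy) = $text =~ m{(http://.+)\s};
my ($lazy)   = $text =~ m{(http://.+?)\s};
print "greedy: $greedy\n";   # http://example.com/a and http://example.com/b for
print "lazy:   $lazy\n";     # http://example.com/a
```

Note that this pattern only matches a URL that is followed by whitespace, so a URL at the very end of the string would be left alone.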


Replies are listed 'Best First'.
Re^2: regex to identify http:// in html
by sauoq (Abbot) on Nov 26, 2005 at 14:52 UTC

Your use of a non-greedy quantifier isn't best here. You are already specifying \S and, since you are being specific, the non-greediness isn't really buying you anything. (In fact, it's somewhat less efficient.) You can also skip the capturing of space at the end. You are just re-adding it anyway, so just leave it alone to begin with. Your regex would be better written as:

s!(http://\S+)!<a href="$1">$1</a>!g;

And, you might as well catch https as well:

s!(https?://\S+)!<a href="$1">$1</a>!g;

-sauoq
"My two cents aren't worth a dime.";
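As a rough illustration of the points in this reply, the sketch below runs both substitutions over the same text. The sample string and variable names are hypothetical. It is only a sketch: on real HTML, \S+ will also swallow trailing punctuation or markup that touches the URL.

```perl
use strict;
use warnings;

# Hypothetical sample input: one http URL mid-string, one https URL at the end.
my $text = "plain http://example.com/a secure https://example.org/b";

# Greedy \S+, no captured whitespace, optional "s" after http.
(my $new = $text) =~ s!(https?://\S+)!<a href="$1">$1</a>!g;

# The earlier quick hack (with /g added), for comparison: it needs trailing
# whitespace and does not know about https, so the second URL is untouched.
(my $old = $text) =~ s{(http://\S+?)(\s+)}{<a href="$1">$1</a>$2}g;

print "new: $new\n";
print "old: $old\n";
```

Because \S and \s can never match the same character, greedy \S+ stops at exactly the same place as \S+? followed by \s+, without the extra step-by-step extension.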

