PerlMonks  

Re: regex to identify http:// in html

by polypompholyx (Chaplain)
on Nov 26, 2005 at 14:03 UTC (#511884)


in reply to regex to identify http:// in html

For a quick hack, s{(http://\S+?)(\s+)}{<a href="$1">$1</a>$2} will do what you ask. The important thing to note is the \S+?, which makes the quantifier non-greedy, i.e. it'll match the minimum amount required for the regex to succeed, rather than the maximum amount, which is what \S+ or .* would do. I've also used \S (any non-whitespace character), as it's best to avoid . where you can: see death to dot star.
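To make the greedy versus non-greedy distinction concrete, here is a minimal sketch. The sample text and variable names are made up for illustration and are not from the original question. It compares a greedy and a non-greedy dot against the same pair for \S, each followed by a single \s.

```perl
use strict;
use warnings;

# Hypothetical sample text: two URLs separated by ordinary words.
my $text = "http://a.example/ and http://b.example/ end";

# Greedy dot: .+ grabs as much as it can, then backs off to the LAST whitespace.
my ($greedy_dot) = $text =~ m{(http://.+)\s};

# Non-greedy dot: .+? extends only until the FIRST whitespace lets the match succeed.
my ($lazy_dot) = $text =~ m{(http://.+?)\s};

# \S can never cross whitespace, so the greedy and non-greedy forms agree here.
my ($greedy_nonspace) = $text =~ m{(http://\S+)\s};
my ($lazy_nonspace)   = $text =~ m{(http://\S+?)\s};

print "greedy dot:       $greedy_dot\n";
print "non-greedy dot:   $lazy_dot\n";
print "greedy non-space: $greedy_nonspace\n";
print "lazy non-space:   $lazy_nonspace\n";
```

On this input only the greedy dot runs past the first URL. Both \S variants capture the same text, since \S stops at whitespace regardless of greediness.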


Re^2: regex to identify http:// in html
by sauoq (Abbot) on Nov 26, 2005 at 14:52 UTC

    Your use of a non-greedy quantifier isn't the best choice here. You are already specifying \S, which cannot match past the end of the URL, so the non-greediness isn't really buying you anything. (In fact, it's somewhat less efficient.) You can also skip capturing the space at the end. You are just re-adding it anyway, so leave it alone to begin with. Your regex would be better written as:

    s!(http://\S+)!<a href="$1">$1</a>!g;
    And you might as well catch https too:
    s!(https?://\S+)!<a href="$1">$1</a>!g;
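As a rough sketch of how that last substitution behaves, the snippet below wraps it in a helper. The name `linkify` and the sample text are hypothetical, chosen only for this example. Thanks to /g every URL is linked, and because no trailing whitespace is required, a URL at the very end of the string is picked up too.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps every http/https URL in an anchor tag.
sub linkify {
    my ($text) = @_;
    # \S+ runs to the next whitespace (or end of string); /g repeats for each URL.
    $text =~ s!(https?://\S+)!<a href="$1">$1</a>!g;
    return $text;
}

# Made-up input with one http and one https URL, the latter at end of string.
my $out = linkify("Try http://example.com/a or https://example.org/b");
print "$out\n";
```

One caveat with this approach: \S+ also swallows any punctuation that directly follows a URL, such as a sentence-ending period, so it is best treated as a quick hack rather than a full URL parser.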

    -sauoq
    "My two cents aren't worth a dime.";
    
