PerlMonks  

Link reg expression

by Anonymous Monk
on Jun 19, 2003 at 18:03 UTC ( [id://267291]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm attempting to extract all of my http and https links. The pattern below gets a hit, but I was hoping to fine-tune it. Could someone suggest a better regular expression?
if($line =~ /(https?\:.*")/)
The regular expression above should give me this, but it doesn't always work, because sometimes it captures the rest of the line up to a later closing quote.
<a href="http://www.mylinke.com">Link</a>
I need output like this:
http://www.mylinke.com

Thanks.

Replies are listed 'Best First'.
Re: Link reg expression
by artist (Parson) on Jun 19, 2003 at 18:10 UTC
Re: Link reg expression
by CukiMnstr (Deacon) on Jun 19, 2003 at 18:09 UTC
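    It fetches the other end quotes because you are asking it to with .*, which is greedy and matches as far as the last quote on the line. If you change your code to:
    if ( $line =~ m/(https?:[^"]+)"/ )
    it should work: the negated class stops at the first closing quote, and keeping that quote outside the parentheses leaves it out of the captured URL. Also, you could try the non-greedy form:
    if ( $line =~ m/(https?:.+?)"/ )
    Check perlre for more on the non-greedy quantifiers, and read this post by Ovid on the dangers of .*.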
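
To make the difference concrete, here is a minimal sketch comparing the three patterns on one line. The sample line (with a second quoted attribute, class="ext") is made up for illustration, and the closing quote is kept outside the capture so that only the URL is returned.

```perl
use strict;
use warnings;

# Hypothetical sample line: a second quoted attribute follows the href.
my $line = '<a href="http://www.mylinke.com" class="ext">Link</a>';

# Greedy: .* runs to the LAST double quote on the line.
my ($greedy) = $line =~ /(https?:.*)"/;

# Negated class: [^"]+ cannot cross a quote, so it stops at the first one.
my ($negated) = $line =~ /(https?:[^"]+)"/;

# Non-greedy: .+? takes as little as possible before the next quote.
my ($lazy) = $line =~ /(https?:.+?)"/;

print "greedy:  $greedy\n";
print "negated: $negated\n";
print "lazy:    $lazy\n";
```

On this sample the greedy capture also swallows the class attribute, while the other two captures contain only the URL. Adding the /g modifier in list context should collect every such URL on the line rather than just the first.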

    hope this helps,

    update: you should be using a module specifically designed to deal with links in html documents for this, anyway.
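
The update above does not name a module. As one possibility, the sketch below assumes HTML::LinkExtor from the HTML-Parser distribution on CPAN is installed; the sample HTML and the second URL are made up for illustration.

```perl
use strict;
use warnings;
use HTML::LinkExtor;    # assumption: installed from CPAN (HTML-Parser distribution)

# Hypothetical input containing one http and one https link.
my $html = '<a href="http://www.mylinke.com">Link</a> '
         . '<a href="https://example.org/x">X</a>';

my @urls;

# The callback is invoked once per link-bearing tag with the tag name
# and its link attributes as key/value pairs.
my $parser = HTML::LinkExtor->new(sub {
    my ($tag, %attr) = @_;
    return unless $tag eq 'a' && defined $attr{href};
    push @urls, $attr{href} if $attr{href} =~ /^https?:/;
});

$parser->parse($html);
$parser->eof;

print "$_\n" for @urls;
```

A parser-based approach like this does not depend on how the attribute is quoted or where it sits on the line, which is where the regular expression versions tend to break.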
