PerlMonks  

Re: Extracting full links from HTML

by Scott7477 (Chaplain)
on Feb 02, 2007 at 18:17 UTC (#597985)


in reply to Extracting full links from HTML

Here is code that takes the URL of an HTML page from the command line and prints a link to each image found in that page. I just took wfsp's code and swapped out his hardcoded links for a command-line argument. Update: I also changed the code so that it prints the full URL of each image. I figure that would be handy for downloading any or all of the images if so desired.
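use strict;
use warnings;
use LWP::Simple;
use HTML::TokeParser::Simple;
use URI;

# usage: imglinker http://www.example.com
my $url = shift or die "usage: imglinker URL\n";
my $content = get($url);
die "Could not fetch $url\n" unless defined $content;

my $p = HTML::TokeParser::Simple->new(\$content);
my $in_anchor = 0;

while (my $t = $p->get_token) {
    if ($t->is_start_tag('a')) {
        $in_anchor = 1;
        next;
    }
    if ($t->is_end_tag('a')) {
        $in_anchor = 0;    # leaving the link
        next;
    }
    if ($t->is_start_tag('img') and $in_anchor) {
        my $src = $t->get_attr('src');
        next unless defined $src;
        # resolve a relative src against the page URL
        print URI->new_abs($src, $url), "\n";
    }
}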
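One thing to watch when printing full image URLs: gluing the page URL and the src together with "/" only gives a usable link when the page URL is a site root and the src is a plain relative path. Here is a minimal sketch of resolving a src against the page URL instead, assuming the URI module is installed (it normally comes along with LWP). The helper name absolute_src and the example.com URLs are made up for illustration and are not from wfsp's code.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not in the original post): resolve an img src
# against the URL of the page it was found on.
sub absolute_src {
    my ($src, $base) = @_;
    return URI->new_abs($src, $base)->as_string;
}

# Placeholder URLs, for illustration only.
my $base = 'http://www.example.com/gallery/index.html';

# A path-relative src, a root-relative src, and an already absolute one.
for my $src ('pic.png', '/img/logo.gif', 'http://cdn.example.com/a.jpg') {
    print absolute_src($src, $base), "\n";
}
```

A src that is already absolute comes back unchanged, so the same call can be used on every image without checking first.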
