PerlMonks  

Re: LWP strikes again!

by repson (Chaplain)
on Jan 06, 2001 at 10:35 UTC (#50221)


in reply to LWP strikes again!

The HTML spec at the W3C has this to say:

In HTML, links and references to external images, applets, form-processing programs, style sheets, etc. are always specified by a URI. Relative URIs are resolved according to a base URI, which may come from a variety of sources. The BASE element allows authors to specify a document's base URI explicitly.

When present, the BASE element must appear in the HEAD section of an HTML document, before any element that refers to an external source. The path information specified by the BASE element only affects URIs in the document where the element appears.

So the simplest method is:

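    my $content = $res->content;
    # Put BASE right after the opening HEAD tag (matched case-insensitively),
    # so it comes before any element that refers to an external source.
    $content =~ s#(<head\b[^>]*>)#$1<base href="http://perlmonks.org">#i;
    print $content;
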
Otherwise you would need to run the source through one of the HTML:: parsers and rewrite every URL as an absolute one.
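
For the parser route, here is a rough sketch of one way it might look, using HTML::Parser (version 3 API) together with URI->new_abs. The helper name absolutize_links, the short list of link attributes, and the sample base URL are all my own assumptions for illustration; a real page may carry URLs in attributes not listed here.

```perl
use strict;
use warnings;
use HTML::Parser ();
use HTML::Entities qw(encode_entities);
use URI ();

# Attributes treated as URLs here; an illustrative, incomplete list.
my %is_link_attr = map { $_ => 1 } qw(href src action background);

# absolutize_links() is a hypothetical helper, not part of LWP or HTML::Parser.
# $base is assumed to be the URL the page was fetched from.
sub absolutize_links {
    my ($html, $base) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tagname, $attr, $attrseq, $text) = @_;
                my @links = grep { $is_link_attr{$_} } @$attrseq;

                # No URL-bearing attribute: pass the tag through untouched.
                if (!@links) { $out .= $text; return; }

                # Resolve each URL against the base, then rebuild the tag.
                $attr->{$_} = URI->new_abs($attr->{$_}, $base)->as_string
                    for @links;
                $out .= "<$tagname";
                # Skip the '/' pseudo-attribute that can show up for
                # <br />-style tags when the parser is not in XML mode.
                for my $name (grep { $_ ne '/' } @$attrseq) {
                    $out .= sprintf ' %s="%s"', $name,
                        encode_entities($attr->{$name}, '<>&"');
                }
                $out .= '>';
            },
            'tagname, attr, attrseq, text',
        ],
        # Everything else (text, end tags, comments, ...) is copied as-is.
        default_h => [
            sub { my ($text) = @_; $out .= $text if defined $text },
            'text',
        ],
    );
    $parser->parse($html);
    $parser->eof;
    return $out;
}

print absolutize_links(
    '<a href="/index.pl?node=Tutorials">Tutorials</a>',
    'http://perlmonks.org/',
), "\n";
```

In the original setting you would presumably call it as absolutize_links($res->content, $res->base). Note that rebuilt tags get normalized quoting and lowercase names, so the output is not byte-for-byte identical to the input apart from the URLs.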
