PerlMonks  

LWP strikes again!

by motomuse (Sexton)
on Jan 06, 2001 at 10:21 UTC ( [id://50218]=perlquestion )

motomuse has asked for the wisdom of the Perl Monks concerning the following question:

Right, after considerable research, I've finally gotten to the point where my Perl script connects to a remote server and POSTs data to it successfully. But there's a problem. The page may /look/ like the right page on the remote server, but surprise! The URL in the browser's address field is that of my script. That means that when I click on a link whose URL is relative rather than absolute, the browser resolves it against my server and I get a 404 error.

For example, here's http://www.elfhill.com/foo.cgi:

    use HTTP::Request::Common qw( POST );
    use LWP::UserAgent;

    print "Content-type: text/html\n\n";

    my $ua  = LWP::UserAgent->new;
    my $req = POST 'http://perlmonks.org/index.pl', [ node_id => '479' ];

    # Keep the HTTP::Response object itself; calling ->content here
    # would leave only a string, and is_success would then fail.
    my $res = $ua->request($req);

    if ($res->is_success) {
        print $res->content;
    } else {
        print $res->error_as_HTML;
    }

Now, if you try to click on any link there, elfhill.com will spit back a 404.

How do I tweak the code so that I'm /actually/ at perlmonks, rather than elfhill?

Thanks,

    - Muse

Replies are listed 'Best First'.
Re: LWP strikes again!
by repson (Chaplain) on Jan 06, 2001 at 10:35 UTC
    If you read the HTML spec at W3C, you'll find this:

    In HTML, links and references to external images, applets, form-processing programs, style sheets, etc. are always specified by a URI. Relative URIs are resolved according to a base URI, which may come from a variety of sources. The BASE element allows authors to specify a document's base URI explicitly.

    When present, the BASE element must appear in the HEAD section of an HTML document, before any element that refers to an external source. The path information specified by the BASE element only affects URIs in the document where the element appears.

    So the simplest method is:

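        my $content = $res->content;
        # Put BASE right after the opening <head> tag so that it comes
        # before any element that refers to an external source.
        $content =~ s#(<head\b[^>]*>)#$1<base href="http://perlmonks.org">#i;
        print $content;
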
    Otherwise you would need to filter the source with one of the HTML:: parsers and make all urls absolute.
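
For the second approach repson mentions (filtering the source with an HTML:: parser and making every URL absolute), a minimal sketch might look like the following. It is not code from this thread: the function name absolutize_links and the short attribute list are assumptions made for illustration, and it relies on the HTML::Parser and URI modules, which are normally installed alongside LWP.

```perl
use strict;
use warnings;
use HTML::Parser ();   # from the HTML-Parser distribution
use URI ();            # from the URI distribution

# Attributes that commonly carry URLs (a deliberately short, non-exhaustive list).
my %URL_ATTR = map { $_ => 1 } qw(href src action background);

# Hypothetical helper: return $html with relative URLs in the attributes
# listed above resolved against $base.
sub absolutize_links {
    my ($html, $base) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tag, $attr, $attrseq, $text) = @_;

                # Tags without a URL attribute pass through verbatim.
                if (!grep { $URL_ATTR{$_} } @$attrseq) {
                    $out .= $text;
                    return;
                }

                # Otherwise rebuild the tag, attribute by attribute.
                $out .= "<$tag";
                for my $name (@$attrseq) {
                    next if $name eq '/';   # stray slash from "<img ... />"
                    my $value = $attr->{$name};
                    $value = URI->new_abs($value, $base)->as_string
                        if $URL_ATTR{$name};
                    $out .= qq( $name="$value");
                }
                $out .= '>';
            },
            'tagname, attr, attrseq, text',
        ],
        # Text, end tags, comments and so on are copied unchanged.
        default_h => [ sub { $out .= shift }, 'text' ],
    );
    $parser->attr_encoded(1);   # leave entities such as &amp; encoded in values
    $parser->parse($html);
    $parser->eof;
    return $out;
}
```

In the original script this would be called as something like print absolutize_links($res->content, 'http://perlmonks.org/index.pl'). Rebuilt tags always come out with double-quoted attribute values, so a value that itself contains a double quote would need extra escaping; for a single known site, the BASE tag remains the simpler fix.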
Re: LWP strikes again!
by chipmunk (Parson) on Jan 06, 2001 at 10:28 UTC
    Okay, if you actually want to be at perlmonks, I would wonder why you're not just doing a redirect. However, if you need to pass the page through your script, you could add a BASE tag to the page, or munge all the links so they point back to your script, which would request the new page from perlmonks and spit it out. That's how online translation services like the Dialectizer work.
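
The link-munging idea chipmunk describes can be sketched roughly as below. This is an illustration under assumptions, not anyone's actual implementation: the helper name proxied_link and the "url" query parameter are invented here, and the URI and URI::Escape modules (both part of the URI distribution that LWP depends on) are assumed to be available.

```perl
use strict;
use warnings;
use URI ();
use URI::Escape qw(uri_escape);

# Hypothetical helper: turn a link found on the remote page into a link that
# points back at the proxying script, carrying the real target in a "url"
# query parameter (the parameter name is an assumption).
sub proxied_link {
    my ($script_url, $page_url, $href) = @_;

    # Resolve the link against the page it came from, so relative
    # links become absolute before they are wrapped.
    my $target = URI->new_abs($href, $page_url)->as_string;

    # Escape the target so it survives as a single query-string value.
    return $script_url . '?url=' . uri_escape($target);
}
```

The script would then rewrite every link on the fetched page with proxied_link('http://www.elfhill.com/foo.cgi', $page_url, $href), and on each request read the url parameter, fetch that page with LWP, and repeat. A script like this will fetch whatever URL it is handed, so in practice it should check the target host against an allowed list before requesting it.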
Re: LWP strikes again!
by merlyn (Sage) on Jan 06, 2001 at 12:16 UTC
Re: LWP strikes again!
by motomuse (Sexton) on Jan 09, 2001 at 01:51 UTC
    Thanks, monks, that was most helpful.

    Yes, chipmunk, I do need to pass the page thru my script (this is really just an example; the actual site to which I'm intending to send stuff is PayPal, which is set up in such a way as not to be able to cope with the concept of an item for sale having more than one attribute, such as (specifically) t-shirts, which come in a variety of sizes and colors... long story).

    Thanks, repson, for providing a concrete example of what you and cm meant by a BASE tag. For having worked in HTML as long as I have, you'd think I'd remember that kind of thing, but that's the kind of detail I forget more often than I'd like to.

    Thanks, merlyn; for the moment, since my script is only ever going to be hitting paypal.com, the BASE tag is all I really need, but that's a really interesting article that I know will come in useful to me in the long run.

    Again, thanks all,

         - Muse
