Re: Adding a 'Host' request header leads to unexpected LWP::UserAgent behaviour (proof example.com)

by Anonymous Monk
on Jun 16, 2016 at 23:57 UTC ( [id://1165919]=note )


in reply to Adding a 'Host' request header leads to unexpected LWP::UserAgent behaviour

Stumped! Can anyone shed any light?

Yeah, what BrowserUk said: websites can do whatever they want. For example, here is a local HTTP echo server:

    $ lwp-request -E http://127.0.0.1:80
    GET http://127.0.0.1:80
    User-Agent: lwp-request/6.15 libwww-perl/6.15

    200 Assumed OK
    Client-Date: Thu, 16 Jun 2016 23:35:40 GMT
    Client-Peer: 127.0.0.1:80
    Client-Response-Num: 1

    Echo: GET / HTTP/1.1
    Echo: TE: deflate,gzip;q=0.3
    Echo: Connection: TE, close
    Echo: Host: 127.0.0.1:80
    Echo: User-Agent: lwp-request/6.15 libwww-perl/6.15
    Echo:

    $ lwp-request -E http://127.0.0.1:80 -HHost:Host
    GET http://127.0.0.1:80
    Host: Host
    User-Agent: lwp-request/6.15 libwww-perl/6.15

    200 Assumed OK
    Client-Date: Thu, 16 Jun 2016 23:35:42 GMT
    Client-Peer: 127.0.0.1:80
    Client-Response-Num: 1

    Echo: GET / HTTP/1.1
    Echo: TE: deflate,gzip;q=0.3
    Echo: Connection: TE, close
    Echo: Host: Host
    Echo: User-Agent: lwp-request/6.15 libwww-perl/6.15
    Echo:

So you see, the "Host" header was added successfully and echoed back.
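If you don't have an echo server handy, here is a minimal sketch of the same experiment in Python, using only the standard library. It is not the server used for the transcript above; `EchoHandler`, `fetch_echo` and `demo` are names made up for this illustration, and the server binds to a free port rather than port 80. It just echoes the request line and headers back, so you can compare the default Host header with an overridden one.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical echo server: reply with the request line and headers as received."""

    def do_GET(self):
        lines = [self.requestline]
        lines += [f"{name}: {value}" for name, value in self.headers.items()]
        body = "\n".join(lines).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the demo quiet


def fetch_echo(port, extra_headers=None):
    """GET / from the local echo server and return the echoed text."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        # http.client adds its own Host header unless one is supplied here.
        conn.request("GET", "/", headers=extra_headers or {})
        return conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()


def demo():
    """Run one default request and one with Host overridden; return both echoes."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        default = fetch_echo(port)
        custom = fetch_echo(port, {"Host": "Host"})
    finally:
        server.shutdown()
        server.server_close()
    return port, default, custom


if __name__ == "__main__":
    port, default, custom = demo()
    print(default)
    print()
    print(custom)
```

The first echo should contain a Host line with the address and port the client connected to, the second should contain `Host: Host`, which is the same thing the lwp-request runs show.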

With the real website example.com it works exactly the same way:

    $ lwp-request -E http://93.184.216.34:80
    GET http://93.184.216.34:80
    User-Agent: lwp-request/6.15 libwww-perl/6.15

    404 Not Found
    Connection: close
    Date: Thu, 16 Jun 2016 23:26:51 GMT
    Server: ECS (rhv/81A7)
    Content-Length: 345
    Content-Type: text/html
    Client-Date: Thu, 16 Jun 2016 23:37:48 GMT
    Client-Peer: 93.184.216.34:80
    Client-Response-Num: 1
    Title: 404 - Not Found

    <?xml version="1.0" encoding="iso-8859-1"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>404 - Not Found</title>
    </head>
    <body>
    <h1>404 - Not Found</h1>
    </body>
    </html>

    $ lwp-request -E http://93.184.216.34:80 -Hhost:example.com
    GET http://93.184.216.34:80
    Host: example.com
    User-Agent: lwp-request/6.15 libwww-perl/6.15

    200 OK
    Cache-Control: max-age=604800
    Connection: close
    Date: Thu, 16 Jun 2016 23:27:01 GMT
    Accept-Ranges: bytes
    ETag: "359670651"
    Server: ECS (rhv/81A7)
    Vary: Accept-Encoding
    Content-Length: 1270
    Content-Type: text/html
    Expires: Thu, 23 Jun 2016 23:27:01 GMT
    Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
    Client-Date: Thu, 16 Jun 2016 23:37:58 GMT
    Client-Peer: 93.184.216.34:80
    Client-Response-Num: 1
    Title: Example Domain
    X-Cache: HIT
    X-Ec-Custom-Error: 1
    X-Meta-Charset: utf-8
    X-Meta-Viewport: width=device-width, initial-scale=1

    <!doctype html>
    <html>
    <head>
    <title>Example Domain</title>
    ...

If you do lwp-request -E http://google.com, you'll get a redirect to http://www.google.com.

If you do lwp-request -E http://google.com -HHost:Host, Google won't redirect; it will just return 404.

Websites can do what they want.
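To make the point concrete, here is a rough Python sketch of one way a server could choose its response from the Host header alone. It is only an assumption about how name-based dispatch might look, not how example.com or Google actually implement it; the `SITES` table, `VirtualHostHandler`, `status_for` and `demo_vhost` are all invented for the illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical virtual-host table: Host header value -> page body.
SITES = {"example.com": b"Example Domain\n"}


class VirtualHostHandler(BaseHTTPRequestHandler):
    """Serve a page only when the Host header names a known site, else 404."""

    def do_GET(self):
        host = (self.headers.get("Host") or "").lower()
        body = SITES.get(host)
        status = 200 if body is not None else 404
        if body is None:
            body = b"404 - Not Found\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def status_for(port, headers=None):
    """Return the HTTP status for GET / against the local server."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers=headers or {})
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def demo_vhost():
    """Same address, two different Host headers, two different answers."""
    server = HTTPServer(("127.0.0.1", 0), VirtualHostHandler)  # any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        without_host = status_for(port)  # client sends Host: 127.0.0.1:<port>
        with_host = status_for(port, {"Host": "example.com"})
    finally:
        server.shutdown()
        server.server_close()
    return without_host, with_host


if __name__ == "__main__":
    print(demo_vhost())  # prints (404, 200)
```

Same IP address and same path both times; only the Host header differs, and the server is free to answer each one however it likes.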
