Getting all Headers
by Yappo (Novice)
on Feb 05, 2008 at 03:48 UTC
Yappo has asked for the
wisdom of the Perl Monks concerning the following question:
Hi all. I am currently working on an improved spider that is able to get not only HTML content but also media that are not linked from the HTML-source directly.
This information is stored in the headers and can be logged with LiveHTTPheaders and/or Tamper Data (Firefox plugins). What I would like to do is log the headers as they come in and go out, just like the LiveHTTPheaders plugin does.
I have tried LWP::UserAgent and WWW::Mechanize with no results, even after sending exactly the same headers that were logged in Firefox.
I could get results with tshark but that is not what I want.
Question: is it possible to trace ALL incoming and outgoing headers with a CPAN module, so that I can extract the location of the media that are transported in the headers?
To demonstrate what I mean here is an example header from a LiveHTTPheaders log that plays a video after the URL has been loaded:
I have never succeeded in getting the Host and GET-variable from the server.
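
To make the kind of logging asked about above concrete, here is a minimal sketch using the add_handler hooks of LWP::UserAgent. The helper name make_logging_ua and the script name in the usage comment are made up for illustration. It only records what LWP itself sees: headers added at the protocol level (Host may be one of them) might not show up in the request log, and anything the browser fetches through JavaScript or a plugin will not appear at all.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: returns a user agent that appends the headers of every
# request and response (including each step of a redirect chain) to @$log.
sub make_logging_ua {
    my ($log) = @_;
    my $ua = LWP::UserAgent->new;

    # Called just before a request is handed to the protocol module.
    $ua->add_handler(request_send => sub {
        my ($request) = @_;
        push @$log, $request->method . " " . $request->uri . "\n"
                  . $request->headers->as_string;
        return;    # returning nothing lets the request go out as usual
    });

    # Called for each response, before any redirect is followed.
    $ua->add_handler(response_done => sub {
        my ($response) = @_;
        push @$log, $response->status_line . "\n"
                  . $response->headers->as_string;
        return;
    });

    return $ua;
}

# Usage (script name is an assumption): perl log_headers.pl http://example.com/
if (@ARGV) {
    my @log;
    make_logging_ua(\@log)->get($ARGV[0]);
    print "$_\n" for @log;
}
```

Because response_done fires for every hop, a Location header in an intermediate redirect ends up in the log as well, which is where a media URL would be expected if the server sends it that way.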