Get the order of HTTP request headers

by arc_of_descent (Hermit)
on May 05, 2010 at 06:52 UTC (#838453)
arc_of_descent has asked for the wisdom of the Perl Monks concerning the following question:

Greetings to All,

I asked this question in the chatterbox a couple of weeks back, but it looks like this task is tougher than I thought. Hence I'm posting it here in the hope of reaching a wider audience and, of course, finding a solution.

I need to get the exact order in which the HTTP request headers were sent by the browser from within my CGI script. I need this to perform browser fingerprinting.

None of the following seem to work:

    use CGI;
    use Data::Dumper;
    use HTTP::Headers;

    # %header will be in random order
    my $q = CGI->new;
    my %header = map { lc($_) => $q->http($_) } $q->http;

    # %ENV is in random order
    print Dumper \%ENV;

    my $h = HTTP::Headers->new(%ENV);

    # this returns the headers in the recommended
    # "Good Practice" order.
    print Dumper $h->header_field_names;

    # this doesn't work either
    print $h->as_string;

You can check the order in which the request headers were sent using Firebug.

PHP's getfullheaders() works :(

Any help would be great. Perhaps some Apache module can help? I will be releasing code to this fingerprinting module to CPAN.

Thanks!

Re: Get the order of HTTP request headers
by Corion (Pope) on May 05, 2010 at 07:16 UTC

    I think you will need to parse the request yourself. HTTP::Headers actively (re)orders the headers on output, and I found overriding that to be nigh-impossible without cut'n'pasting lots of the relevant code out of HTTP::Headers.

    From looking at the source, HTTP::Headers::Util might be suitable to implement your own header parsing.

      The problem is not in HTTP::Headers as I see it. As soon as I pass the %ENV hash or even CGI->http to HTTP::Headers->new(), I lose the order of the headers.

      Thus I think I need to approach this problem by looking closer to the web server (Apache in this case) rather than rely on CGI.

Re: Get the order of HTTP request headers
by ig (Vicar) on May 05, 2010 at 09:32 UTC

    If you're using mod_perl registry scripts, something like the following might give you what you want:

        #!/usr/bin/perl
        #
        use strict;
        use warnings;
        use Apache2::RequestUtil;
        use Apache2::RequestRec;

        $| = 1;

        print "Content-type: text/plain\n\n";

        my $r = Apache2::RequestUtil->request;
        print $r->as_string();

    On my system, this produces:

        GET /test.pl HTTP/1.1
        Host: directory.localhost
        User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100401 Ubuntu/9.10 (karmic) Firefox/3.5.9
        Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
        Accept-Language: en-us,en;q=0.5
        Accept-Encoding: gzip,deflate
        Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
        Keep-Alive: 300
        Connection: keep-alive
        Cache-Control: max-age=0

        HTTP/1.1 (null)

      Thank you! I've got it working under mod_perl.

      Now if only there is some way to get the Apache2::* modules working under a vanilla CGI script.

        Impossible, CGI passes http headers via %ENV

        There are pure Perl HTTP servers available. If you don't need the performance of mod_perl, maybe one of these would suffice. This would give you direct access to the request.
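
With direct access to the raw request, recovering the header order is mostly a matter of reading lines until the first blank one. The following is only a minimal sketch of that parsing step, run on a string instead of a socket; `ordered_header_names` is a hypothetical helper name, the sample request is made up, and folded continuation lines are simply skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the header names in the order they
# appear in a raw HTTP request. Continuation lines are skipped.
sub ordered_header_names {
    my ($raw) = @_;

    # The header section ends at the first blank line.
    my ($head) = split /\r?\n\r?\n/, $raw, 2;
    $head = '' unless defined $head;

    my @lines = split /\r?\n/, $head;
    shift @lines;    # drop the request line ("GET / HTTP/1.1")

    my @names;
    for my $line (@lines) {
        # Continuation lines start with whitespace and carry no name.
        next if $line =~ /^[ \t]/;
        push @names, $1 if $line =~ /^([^:\s]+)\s*:/;
    }
    return @names;
}

# Sample request (made-up data), joined with CRLF as on the wire.
my $raw = join "\r\n",
    'GET /test.pl HTTP/1.1',
    'Host: directory.localhost',
    'User-Agent: Mozilla/5.0',
    'Accept: text/html',
    'Connection: keep-alive',
    '', '';

print join(', ', ordered_header_names($raw)), "\n";
```

A real server would read `$raw` from the client socket; the point here is only that the order survives as long as the headers are kept in an array and never pass through a hash.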

        Is there any reason not to use mod_perl?

Re: Get the order of HTTP request headers
by furry_marmot (Pilgrim) on May 05, 2010 at 11:24 UTC

    > # %ENV is in random order

    You do realize that hashes do not keep their order, right? They are designed for random access. If you want data back in the original order, use an array.

        my @header = map { [ lc($_), $q->http($_) ] } $q->http;
        print "$_->[0] => $_->[1]\n" for @header;
      Ahem, %ENV is %ENV
          'http' => <<'END_OF_FUNC',
          sub http {
              my ($self,$parameter) = self_or_CGI(@_);
              if ( defined($parameter) ) {
                  $parameter =~ tr/-a-z/_A-Z/;
                  if ( $parameter =~ /^HTTP(?:_|$)/ ) {
                      return $ENV{$parameter};
                  }
                  return $ENV{"HTTP_$parameter"};
              }
              return grep { /^HTTP(?:_|$)/ } keys %ENV;
          }
          END_OF_FUNC

      Yes. My point was that although $q->http returns a list of HTTP header names, they are not in the same order as the request headers were sent.

Re: Get the order of HTTP request headers
by Haarg (Chaplain) on May 05, 2010 at 15:43 UTC

    This isn't possible with CGI. The CGI RFC doesn't define any mechanism for determining the order of the headers. Additionally, environment variables in general don't have a defined order, and Perl's %ENV hash is randomized. The only way to do this would be to have a different mechanism of communicating with the server software such as ig's mod_perl example.

    It wouldn't be possible to do this in PHP for scripts run as CGI either. Assuming you meant PHP's getallheaders function, it is even documented as only working when running as an Apache module.
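
To illustrate why the order cannot survive the CGI interface, here is a rough sketch of what the server does before the script starts: each header name is turned into an environment variable name, and only a name/value mapping is handed over. `cgi_env_name` is a hypothetical helper, the header data is made up, and special cases such as Content-Type (which has no `HTTP_` prefix) are ignored.

```perl
use strict;
use warnings;

# Hypothetical helper: the meta-variable name a CGI server derives from
# a request header name (upper-case, "-" becomes "_", "HTTP_" prefix).
sub cgi_env_name {
    my ($header) = @_;
    (my $name = uc $header) =~ tr/-/_/;
    return "HTTP_$name";
}

# Headers as sent on the wire, in order (sample data).
my @sent = (
    [ 'Host'            => 'directory.localhost' ],
    [ 'User-Agent'      => 'Mozilla/5.0' ],
    [ 'Accept-Language' => 'en-us,en;q=0.5' ],
);

# What the CGI script gets to see: a name => value mapping only.
my %env = map { ( cgi_env_name( $_->[0] ) => $_->[1] ) } @sent;

print "sent: ", join( ', ', map { $_->[0] } @sent ), "\n";
print "env:  ", join( ', ', keys %env ), "\n";    # order not guaranteed
```

Nothing in `%env` records which header came first, so no amount of post-processing inside the script can bring the original order back.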

      > Perl's %ENV hash is randomized.

      First, the order of values returned by hashes isn't random, at least not in any formal sense of the word. There's definitely no attempt to make them random as implied by saying the order is "randomized".

      Second, %ENV is magical. It's not really a hash (although it might use one), so the properties of hash don't necessarily apply.

      This is just a nit as your point stands without the quoted bit.

        > First, the order of values returned by hashes isn't random, at least not in any formal sense of the word. There's definitely no attempt to make them random as implied by saying the order is "randomized".

        I always assumed that the order is indeed "randomized". Well, I guess I misunderstood perlsec and perlrun. Or maybe not.

        You are correct. I had meant it in the sense that hashes are unordered and not returned in the same order they are entered, but saying it is random isn't correct.
Re: Get the order of HTTP request headers
by hossman (Prior) on May 06, 2010 at 05:07 UTC
    > I need to get the exact order in which the HTTP request headers were sent by the browser from within my CGI script. I need this to perform browser fingerprinting.

    A non-Perl-related flaw in your plan is that any HTTP proxy between you and the browser (and there may be many, depending on the networks of the client, the server, and anything in between) can legally reorder the headers, as long as headers with the same name preserve their relative order.

    So while two sets of headers in identical order may indicate the same browser, the same browser sending identical requests could result in your server receiving the same headers in a completely different order. (I have observed this using several different HTTP proxy servers.)
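
A small sketch of the consequence, assuming a fingerprint built by hashing the header names: `order_fingerprint` and `set_fingerprint` are hypothetical helpers, and the two header lists are made-up stand-ins for the same browser seen directly and through a reordering proxy.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: fingerprint that depends on header order.
sub order_fingerprint {
    my @names = map { lc $_ } @_;
    return md5_hex( join ',', @names );
}

# Hypothetical helper: fingerprint that ignores header order.
sub set_fingerprint {
    my @names = map { lc $_ } @_;
    return md5_hex( join ',', sort @names );
}

# Same browser, once seen directly and once through a proxy that
# swapped two headers (made-up data).
my @direct  = qw(Host User-Agent Accept Accept-Language Connection);
my @proxied = qw(Host Accept User-Agent Accept-Language Connection);

printf "order-based match: %s\n",
    order_fingerprint(@direct) eq order_fingerprint(@proxied) ? 'yes' : 'no';
printf "set-based match:   %s\n",
    set_fingerprint(@direct) eq set_fingerprint(@proxied) ? 'yes' : 'no';
```

The order-based fingerprints differ even though the browser is the same, while the order-insensitive ones match. An order-based fingerprint is therefore probably best treated as one weak signal among several, not as an identifier on its own.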
