http://www.perlmonks.org?node_id=10929


in reply to Resolve addresses in web access logs

Maybe I'm wrong about this, but it looks like you don't verify the name returned by gethostbyaddr. You probably don't need to if it's just for web statistics, but if you, like me, are in the habit of looking back over old code to remember how to do something, it might be a good idea to put that in, or at least put in a comment about it, in case you need a more certain resolution for the IP in the future.

There's a discussion of this in Perl Cookbook, section 17.7 ('Identifying the Other End of a Socket'). It basically says that because a name lookup goes to the name owner's DNS server and an address lookup goes to the address owner's DNS server, there's the possibility that the machine could give false information. Looking up the returned name again with gethostbyname and checking that the original IP is among the answers guards against that. It also mentions that it's still not 100% secure.
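Here is a minimal sketch of that check, not the Cookbook's own code. The name verified_hostname and the two optional code-ref arguments ($reverse, $forward) are hypothetical, added only so the logic can be tried without touching real DNS. By default it uses Perl's built-in gethostbyaddr and gethostbyname, and it only handles IPv4.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Hypothetical helper: returns the reverse-DNS name for $ip only if a forward
# lookup of that name leads back to the same address; otherwise returns undef.
# $reverse and $forward are optional code refs (an assumption of this sketch)
# that stand in for the real resolver calls.
sub verified_hostname {
    my ($ip, $reverse, $forward) = @_;

    # Default reverse lookup: packed address -> name (scalar context gives the name).
    $reverse ||= sub { scalar gethostbyaddr($_[0], AF_INET) };

    # Default forward lookup: name -> list of packed addresses.
    # In list context gethostbyname returns (name, aliases, addrtype, length, @addrs).
    $forward ||= sub { my @h = gethostbyname($_[0]); @h[4 .. $#h] };

    my $packed = inet_aton($ip);
    return undef unless defined $packed;

    my $name = $reverse->($packed);
    return undef unless defined $name && length $name;

    # Accept the name only if one of its addresses is the one we started with.
    for my $addr ($forward->($name)) {
        return $name if $addr eq $packed;
    }
    return undef;
}
```

Calling verified_hostname('1.2.3.4') with no extra arguments does the real lookups. A log resolver could fall back to printing the bare IP whenever it returns undef.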

I wish I'd checked the code catacombs yesterday before I wrote my own version of this for exactly the same purpose. Bleh. Bad Kudra.

RE: RE: Resolve addresses in web access logs
by zodiac (Beadle) on May 10, 2000 at 16:27 UTC
    Maybe you are wrong, or maybe I am, but:
    gethostbyaddr returns the names matching the IP. Reverse name entries are as secure as DNS gets. If the IP has a reverse name, the IP for that name will match the IP.
    The discussion in the Cookbook is about checking whether the IP address you got when looking up by name matches the original name, which it will not unless you have a reverse entry for the same name.
      I don't think there's any looking up by name. In the example the IP was grabbed with getpeername and the name isn't known. If you were to get the IP with gethostbyname and then use gethostbyaddr on the result, you would be verifying it as they suggest, just in reverse.

      Quoting extensively from the Cookbook:
      "...If you want the name of the remote end, call gethostbyaddr to look up the name of the machine in the DNS tables, right?

      "Not really. That's only half the solution. Because a name lookup goes to the name's owner's DNS server and a lookup of an IP address goes to the address's owner's DNS server, you have to contend with the possibility that the machine that connected to you is giving incorrect names. For instance, the machine evil.crackers.org could belong to malevolent cyberpirates who tell their DNS server that its IP address (1.2.3.4) should be identified as trusted.dod.gov. If your program trusts trusted.dod.gov, a connection from evil.crackers.org will cause getpeername to return the right IP address (1.2.3.4), but gethostbyaddr will return the duplicitous name." (my italics)

      "To avoid this problem, we take the (possibly deceitful) name returned by gethostbyaddr and look it up again with gethostbyname..."

      I'm just repeating, but it looks to me as if this is talking about gethostbyaddr having the potential to give incorrect information.

RE: RE: Resolve addresses in web access logs
by ZZamboni (Curate) on May 22, 2000 at 18:04 UTC
    You are correct. Reverse DNS can easily give wrong information (if the bad guy controls his DNS server, he also controls the reverse table). I know about this, but I don't care too much about it for web access statistics.

    --ZZamboni

      I wouldn't care either with web stats. I was just suggesting that as an example piece of code you might want to add a comment about that problem/feature so that if someone adapted it for an application which did require checking, s/he would know about it.