Texan has asked for the wisdom of the Perl Monks concerning the following question:

I have written a script to do bulk hostname lookups. Give my script a list in a text file and it will do the lookups and give you text-based output with the information it found. I want to know whether threading this is:
A) Possible. I am looking at several thousand lookups and would like to streamline and speed up this code as much as possible.
B) The best way to speed up this script. Right now it takes 3-6 hours to do 6000 lookups. I would like to trim this time down as much as possible.


2004-10-22 Edited by Arunbear: Changed title from 'Threading', as per Monastery guidelines

Re: Using Threading to speed up DNS Resolution
by matija (Priest) on Oct 22, 2004 at 15:16 UTC
    Yes, it's possible to run the lookups in parallel, even without using threads in your own script. Look at the docs for Net::DNS.

    The best way to speed up the script depends a lot on the input. Make sure that the IPs are unique, or cache the results (in other words, only look up each IP number ONCE).
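    The "look up each IP only once" advice can be sketched roughly as below. This is a minimal illustration, not the original poster's script: the names unique_addresses and make_cached_lookup are made up for the example, and the lookup routine is passed in as a code reference so any resolver call can be plugged in.

```perl
use strict;
use warnings;

# Hypothetical helper: return the input lines trimmed, with blanks and
# duplicates removed, keeping the order in which addresses first appear.
sub unique_addresses {
    my (%seen, @unique);
    for my $line (@_) {
        (my $addr = $line) =~ s/^\s+|\s+$//g;    # strip whitespace and newline
        next if $addr eq '' or $seen{$addr}++;   # skip blanks and repeats
        push @unique, $addr;
    }
    return @unique;
}

# Hypothetical helper: wrap any lookup routine so that each address is
# resolved at most once. Failed lookups (undef) are remembered too, so a
# dead address is not retried.
sub make_cached_lookup {
    my ($lookup) = @_;
    my %cache;
    return sub {
        my ($addr) = @_;
        $cache{$addr} = $lookup->($addr) unless exists $cache{$addr};
        return $cache{$addr};
    };
}
```

    Deduplicating the input list up front is enough when the whole list is known in advance; the caching wrapper is the alternative when addresses arrive one at a time.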

    Also note that a fast "looker-upper" is going to place a significant load on whatever DNS server you're using, and that the admin of that server might come looking for you with a stick.
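    The pointer to the Net::DNS docs presumably means the resolver's background-query methods (bgsend, bgbusy, bgread), which let a single-threaded script keep several queries outstanding at once. The sketch below is one way that might look, with a cap on in-flight queries to go easy on the server. The function name, the default cap of 10 and the 0.05 second pause are assumptions for illustration, and bgbusy needs a reasonably recent Net::DNS.

```perl
use strict;
use warnings;

# Build a real resolver. Net::DNS is loaded lazily so that the function
# below can also be exercised with a stand-in object.
sub default_resolver {
    require Net::DNS;
    return Net::DNS::Resolver->new;
}

# Hypothetical helper: resolve a list of unique names or IP addresses with
# at most $max_in_flight queries outstanding. Returns a hash reference
# mapping each name to its reply packet, or undef if the query failed.
sub resolve_in_background {
    my ($resolver, $names, $max_in_flight) = @_;
    $max_in_flight ||= 10;               # assumed default, tune to taste
    my @queue = @$names;
    my (%pending, %packet_for);

    while (@queue or %pending) {
        # Top up the window of outstanding queries.
        while (@queue and keys(%pending) < $max_in_flight) {
            my $name   = shift @queue;
            my $handle = $resolver->bgsend($name);
            if   ($handle) { $pending{$name}    = $handle }
            else           { $packet_for{$name} = undef }   # could not send
        }

        # Collect every query that is no longer busy.
        my $collected = 0;
        for my $name (keys %pending) {
            next if $resolver->bgbusy($pending{$name});
            $packet_for{$name} = $resolver->bgread(delete $pending{$name});
            $collected++;
        }

        # Nothing ready yet: pause briefly instead of spinning.
        select(undef, undef, undef, 0.05) if %pending and !$collected;
    }
    return \%packet_for;
}

# Usage (needs Net::DNS installed and network access):
#   my $packets = resolve_in_background(default_resolver(), \@addresses, 10);
#   for my $addr (sort keys %$packets) {
#       my $packet = $packets->{$addr} or next;
#       print "$addr\t", $_->string, "\n" for $packet->answer;
#   }
```

    Lowering the cap trades speed for a lighter load on the name server, which may matter if the server is not yours.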

Re: Using Threading to speed up DNS Resolution
by Elian (Parson) on Oct 22, 2004 at 15:22 UTC
    This is more work than you probably realize. Most name resolver libraries are single-threaded no matter what you do -- that is, regardless of whether you spawn off a zillion threads, there'll only be one outstanding name lookup request for your process. If this is the case for you (and it may well be; check before you do anything), then you'll need to find or write your own name resolution library. It's not horribly tough, but it does mean learning the DNS wire protocol.

    Running a local name server and talking to it rather than some remote server will probably speed things up quite a bit as well.

      Actually, most gethostby*_r() implementations can be used with threading with no significant issues. Perhaps you are thinking of the non-reentrant gethostby*()?

      No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil, Stargate SG-1

        No, you misunderstand. It's not that the interface isn't threadsafe. It's that the implementation is single-threaded. That is, no matter how many threads are making name lookup requests, there'll only be one actually in-flight at any one time. Having 50 threads making simultaneous name lookups will get you 1 thread doing a name lookup and 49 threads waiting their turn to issue the name lookup.
Re: Using Threading to speed up DNS Resolution
by BrowserUk (Pope) on Oct 22, 2004 at 19:50 UTC

    Notwithstanding Elian's point, you might find Re: Parallel DNS queries (use threads; ) and the associated thread interesting.

    I only did brief tests with the code back then, and they were not scientific, but on Win32 it appeared to me that the DNS lookups were overlapped.

    Doing 10 lookups in parallel took approximately a fifth of the time required to do them sequentially. I have no idea whether this would be true for other OSes.

    It might be possible to improve that by using Net::DNS objects rather than spawning an external process, but I haven't tried that.
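    Whether threaded lookups really overlap on a given platform is easy enough to measure rather than argue about. The following is a rough sketch, not the code from the linked node: it assumes a perl built with thread support, uses the built-in gethostbyname instead of an external process, and the names lookup_host, parallel_lookup and timed are invented for the example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Socket qw(inet_ntoa);
use Time::HiRes qw(time);

# Resolve one hostname to a dotted-quad IPv4 address, or undef on failure.
sub lookup_host {
    my ($name) = @_;
    my $packed = gethostbyname($name);   # scalar context: packed address
    return defined $packed ? inet_ntoa($packed) : undef;
}

# Hypothetical helper: run $lookup over all names using $workers threads.
# Returns a hash reference mapping each name to its result.
sub parallel_lookup {
    my ($names, $workers, $lookup) = @_;
    $lookup ||= \&lookup_host;

    # One undef per worker marks the end of the work list.
    my $todo = Thread::Queue->new(@$names, (undef) x $workers);

    my @threads = map {
        threads->create(sub {
            my %found;
            while (defined(my $name = $todo->dequeue)) {
                $found{$name} = $lookup->($name);
            }
            return \%found;
        });
    } 1 .. $workers;

    # Merge the per-thread results.
    my %result;
    for my $thr (@threads) {
        my ($part) = $thr->join;
        @result{ keys %$part } = values %$part;
    }
    return \%result;
}

# Run a code reference and return the elapsed wall-clock seconds.
sub timed {
    my ($code) = @_;
    my $start = time;
    $code->();
    return time - $start;
}

# Usage (needs network access; @hosts is your own list):
#   my $serial   = timed(sub { lookup_host($_) for @hosts });
#   my $threaded = timed(sub { parallel_lookup(\@hosts, 10) });
#   printf "sequential %.2fs, 10 threads %.2fs\n", $serial, $threaded;
```

    If the threaded time is close to the sequential time, the resolver is probably serializing requests as Elian describes. A much smaller threaded time suggests the lookups do overlap on that system. A repeat run may be skewed by answers the name server has already cached, so it is fairer to use fresh names for each measurement.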

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon