|go ahead... be a heretic|
I should have explained more clearly what I'm doing. I'm using Net::Telnet, IO::Select and IO::Pipe. As it turns out, changing the timeout isn't fixing my problem after all.
I've been testing the code on two elements that are local (ping time 8ms). It works perfectly on these two elements. When I ran a more rigorous test on 20 elements that are located 2000 miles away (ping time 80ms), I ran into problems. The data I received looked exactly like what I get if I time out waiting for a login prompt. However, this wasn't true for all 20 elements.
The first 10 elements work fine, but data from the next 10 degrades more and more until I receive nothing at all from the 20th element. So, even though the results "look" like a timeout problem, this may not be the problem.
The code uses fork() to connect to each network element on a selected list. I was interested to see what happened on the webserver when I ran my forking code, so I watched "top" while executing it. If you aren't familiar with it, "top" (here on Solaris) shows a continuously updated list of running processes together with their CPU time and memory usage.
When I run my CGI script on two network elements, I see two processes pop up, and then disappear when the data has been gathered. If I do the same thing on 5 elements, I see 5 processes. If I try 20 elements, I see about *10* processes that hang around until the script thinks it is finished getting data from all 20 elements.
This leads me to believe that the server has placed some kind of internal limit on child processes in order to prevent overload. This is good, as I was planning on setting a limit anyway.
I think what I'll do is calculate the number of forks needed up front and run them in batches of 5. It'll take longer to get the work done, but as long as I keep the HTML page open while data is still coming in, the user can start viewing output from the first batch right away.
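A minimal sketch of the batching idea, under some assumptions: poll_element() is a hypothetical stand-in for the real Net::Telnet session, run_in_batches() is just an illustrative name, and each child hands its result back through an IO::Pipe that the parent watches with IO::Select. It also checks the return value of fork(), which is undef when the system refuses to create another process, so a process limit would show up as a warning instead of silently missing data.

```perl
use strict;
use warnings;
use IO::Pipe;
use IO::Select;
use POSIX ();

# Hypothetical placeholder: the real version would open a Net::Telnet
# session to the element and return whatever it collected.
sub poll_element {
    my ($element) = @_;
    return "data from $element";
}

# Fork at most $batch_size children at a time. Each child writes its
# result to a pipe; the parent collects everything from one batch
# before it starts the next.
sub run_in_batches {
    my ($batch_size, @elements) = @_;
    my %results;

    while (my @batch = splice(@elements, 0, $batch_size)) {
        my (%element_for, @pids);
        my $select = IO::Select->new;

        for my $element (@batch) {
            my $pipe = IO::Pipe->new;
            my $pid  = fork();

            if (!defined $pid) {
                # fork() returns undef on failure, e.g. a process limit
                warn "fork failed for $element: $!\n";
                next;
            }

            if ($pid == 0) {
                # Child: write the result, then leave without running
                # the parent's END blocks or flushing its buffers.
                $pipe->writer;
                print {$pipe} poll_element($element);
                close $pipe;
                POSIX::_exit(0);
            }

            # Parent: remember which element this pipe belongs to.
            $pipe->reader;
            $select->add($pipe);
            $element_for{ fileno $pipe } = $element;
            push @pids, $pid;
        }

        # Read from whichever children are ready until all pipes hit EOF.
        while ($select->count) {
            for my $fh ($select->can_read) {
                my $bytes = sysread($fh, my $chunk, 4096);
                if ($bytes) {
                    $results{ $element_for{ fileno $fh } } .= $chunk;
                }
                else {
                    # EOF (0) or read error (undef): done with this child
                    $select->remove($fh);
                    close $fh;
                }
            }
        }

        # Reap this batch so no zombie processes are left behind.
        waitpid($_, 0) for @pids;
    }

    return \%results;
}
```

Calling `run_in_batches(5, @elements)` returns a hash reference keyed by element name. In a CGI script, printing each batch's results (with `$| = 1` set) right after the read loop, rather than collecting them all first, is what would let the user see the first set while later batches are still running.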