Reading from more than one socket at once
by ahunter (Monk)
on Jul 05, 2000 at 00:07 UTC
It bothered me when I was learning this stuff that the perlipc documentation did not go into enough detail about how to create really fun servers, like the ones used for chat rooms and so on. Corion asked a question recently touching on this, so I thought it might be useful to go through the correct way to create these things. (Since then, two other people have asked roughly the same question...) Note that I don't have ActiveState Perl, so I'm not sure whether any of the concepts here differ on platforms other than UNIX and workalikes.
Blocking and buffering
Under UNIX there are various special types of file (sockets being a good example) whose data is not necessarily all immediately available. When you read to the end of the currently available data on one of these files, UNIX will wait for further data to arrive. This is known as 'blocking', and means that if you are reading from only one file, you don't have to keep checking to see if there's more data available.
This causes some problems - you don't know if a read from one of these files will cause a block or not, so if you want to read from another file while one is waiting for data, there is no obvious way of going about it. In addition, most of the Perl file handling routines are buffered, and Perl will read a few bytes ahead to improve performance. If Perl gets blocked while reading ahead, it will wait until it can get all the data it wants before continuing - however, while it is waiting, it is not returning the data it may have already read. This means that sometimes your program can appear to have blocked before it has received all the data that you know has already been sent!
The other use of select
Just to make sure everyone gets thoroughly confused, the Perl select function has two uses. Its original use is to select the default filehandle for output. This isn't particularly exciting. The other use is the UNIX select(2) call. This call blocks your process until one of four different events occurs: data becomes available on a filehandle for reading, a filehandle becomes available for writing, an exception occurs on a filehandle, or a timeout occurs. The 'data to be read' and 'timeout' functions of select make it perfect for writing servers which have to deal with more than one simultaneous connection, or other applications where blocking is a problem.
This form of select can be accessed either through the IO::Select package or through the select call itself. I'll focus on the select call as opposed to the module here - the techniques for both are very similar, though. select takes four arguments: filehandles to wait for data to read, filehandles to wait for availability to write, filehandles to wait for exceptions and a timeout. The first three arguments have a slightly weird format, owing to the heritage of the command, and are altered on return to indicate which file handles caused select to stop blocking. To mark a file handle as one you are interested in, you need to set the bit corresponding to that file handle's number, as returned by fileno, using vec, like so:
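A minimal sketch of that step. The socketpair call is only there to give the example a real socket to work with, and the names $sock, $peer and $read are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Socket;

# A connected pair of sockets stands in for a real network connection
# (illustrative only; in a server $sock would come from accept).
socketpair(my $sock, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

my $read = '';                       # start with an empty bit vector
vec($read, fileno($sock), 1) = 1;    # set the bit for $sock's descriptor number
```

To watch several filehandles at once, repeat the vec assignment for each of them on the same $read string.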
And now to wait for data to become available for reading on that filehandle, we use select to do the job:
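A minimal sketch of the call, assuming the same illustrative $sock, $peer and $read names. Some data is written to the other end first, so the select returns immediately instead of waiting forever.

```perl
use strict;
use warnings;
use Socket;

# Illustrative socket pair; $peer plays the part of the remote end.
socketpair(my $sock, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
syswrite($peer, "hello\n");          # make some data available on $sock

my $read = '';
vec($read, fileno($sock), 1) = 1;

# Block until $sock is readable. The three undefs mean: no interest in
# writability, no interest in exceptions, and no timeout (wait forever).
my $nfound = select($read, undef, undef, undef);
```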
The undefs here indicate we aren't interested in writing, exceptions or timeouts at the moment. When select returns, $read is changed to contain the list of filehandles with data waiting, and $nfound contains the number of filehandles in the list. The format is still the bitmap, so you need to use vec once again to test if a file is ready for reading:
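A sketch of that test with two sockets, only one of which has data waiting. The handle names ($quiet, $busy and their peers) are hypothetical, and the one-second timeout is just a safety net for the example.

```perl
use strict;
use warnings;
use Socket;

# Two illustrative socket pairs; only the second one will have data waiting.
socketpair(my $quiet, my $quiet_peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
socketpair(my $busy, my $busy_peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
syswrite($busy_peer, "data\n");

# Ask select about both sockets.
my $read = '';
vec($read, fileno($quiet), 1) = 1;
vec($read, fileno($busy),  1) = 1;

my $nfound = select($read, undef, undef, 1);

# $read now has bits set only for the handles that are ready,
# so test each handle's bit with vec.
my @ready = grep { vec($read, fileno($_), 1) } ($quiet, $busy);
```

Because select overwrites its arguments, the bit vector has to be rebuilt (or copied from a master copy) before every call.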
Buffering again
Of course, the same old buffering problems I talked about before still apply, and Perl may be over-enthusiastically reading ahead and blocking before you get back to the select, causing hair loss all round. The answer is to never, ever use the standard Perl file IO functions with sockets. That includes print, eof, the <> notation and just about any file function you can think of. Instead, use sysread and syswrite, which bypass Perl's buffering and record separation routines and go straight down to the bare metal, reading and writing the raw bytes of the stream. You have to deal with newlines and so on yourself, but that's what regular expressions are for. Note that sysread will return undef for an error and 0 for end of file (so you can avoid calling eof) - use $! to get the error message or number (see perlvar).
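A sketch of doing the newline handling by hand, under the assumption that lines end in "\n". The names $buffer, @lines and the 4096-byte chunk size are illustrative. The peer is closed straight away so the loop terminates; in a real server you would call sysread once each time select reports the socket readable.

```perl
use strict;
use warnings;
use Socket;

# Illustrative socket pair: the peer sends two full lines and a fragment.
socketpair(my $sock, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
syswrite($peer, "first line\nsecond line\npart");
close($peer);

my $buffer = '';    # bytes received but not yet split into lines
my @lines;          # complete lines, newline removed

while (1) {
    my $got = sysread($sock, my $chunk, 4096);
    die "sysread: $!" unless defined $got;    # undef means an error
    last if $got == 0;                        # 0 means end of file
    $buffer .= $chunk;

    # Strip every complete line off the front of the buffer.
    while ($buffer =~ s/^([^\n]*)\n//) {
        push @lines, $1;
    }
}
```

Anything left in $buffer at end of file is an unterminated final line, and it is up to the application whether to process or discard it.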
A multiplexing package
This is a short package that demonstrates how to use select for reading from several file handles, and also for timing out the select function. Note that the select timeout can be specified down to microseconds (as a decimal number of seconds), although its exact precision depends on your operating system. To take advantage of this, we use the time function from Time::HiRes, available from CPAN - note that it is equivalent to the standard time function, except that it returns a decimal value, providing higher precision.
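A small sketch of a fractional timeout on its own, measured with Time::HiRes. Nothing is ever written to the illustrative socket, so select gives up after roughly a quarter of a second; the 0.25 value is an arbitrary choice.

```perl
use strict;
use warnings;
use Socket;
use Time::HiRes qw(time);    # replaces time with a fractional-second version

# Illustrative socket pair; nothing is ever written to it.
socketpair(my $sock, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

my $read = '';
vec($read, fileno($sock), 1) = 1;

my $start   = time();
my $nfound  = select($read, undef, undef, 0.25);    # wait at most 0.25 seconds
my $elapsed = time() - $start;

# On a timeout select returns 0 and clears the bits in $read.
```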
Anyway, here's the package:
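What follows is one possible shape for such a package, written as a sketch rather than a copy of the original listing. The method names (add_handle, remove_handle, add_timer, run_once) and the callback-based interface are assumptions, not something the text above specifies.

```perl
package Multiplex;    # hypothetical layout; method names are assumptions
use strict;
use warnings;
use Time::HiRes qw(time);

sub new {
    my ($class) = @_;
    return bless { handles => {}, timers => [] }, $class;
}

# Register a filehandle; $callback->($fh) runs whenever it is readable.
sub add_handle {
    my ($self, $fh, $callback) = @_;
    $self->{handles}{ fileno($fh) } = [ $fh, $callback ];
}

sub remove_handle {
    my ($self, $fh) = @_;
    delete $self->{handles}{ fileno($fh) };
}

# Schedule $callback->() to run $delay (possibly fractional) seconds from now.
sub add_timer {
    my ($self, $delay, $callback) = @_;
    push @{ $self->{timers} }, [ time() + $delay, $callback ];
    @{ $self->{timers} } = sort { $a->[0] <=> $b->[0] } @{ $self->{timers} };
}

# One pass: wait for readable handles or the next timer, then dispatch.
# $max_wait caps the wait in seconds (undef means no cap).
sub run_once {
    my ($self, $max_wait) = @_;

    my $timeout = $max_wait;
    if (@{ $self->{timers} }) {
        my $until = $self->{timers}[0][0] - time();
        $until = 0 if $until < 0;
        $timeout = $until if !defined $timeout || $until < $timeout;
    }

    # Build the bit vector from every registered descriptor number.
    my $read = '';
    vec($read, $_, 1) = 1 for keys %{ $self->{handles} };

    my $nfound = select($read, undef, undef, $timeout);
    if ($nfound > 0) {
        for my $fd (keys %{ $self->{handles} }) {
            next unless vec($read, $fd, 1);
            my $entry = $self->{handles}{$fd} or next;   # removed by a callback
            $entry->[1]->($entry->[0]);
        }
    }

    # Run every timer whose deadline has passed.
    my $now = time();
    while (@{ $self->{timers} } && $self->{timers}[0][0] <= $now) {
        my $timer = shift @{ $self->{timers} };
        $timer->[1]->();
    }
    return $nfound;
}

package main;
use Socket;

# Usage: one readable socket and one 0.1 second timer.
socketpair(my $sock, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
syswrite($peer, "ping");

my ($received, $timer_fired) = ('', 0);
my $mux = Multiplex->new;
$mux->add_handle($sock, sub { sysread($_[0], $received, 4096) });
$mux->add_timer(0.1, sub { $timer_fired = 1 });
$mux->run_once(1) until $timer_fired;
```

Keeping the handles in a hash keyed by descriptor number makes the vec test after select a direct lookup, and sorting the timers lets the next deadline double as the select timeout.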
A TCP acceptor class
To demonstrate the Multiplex class, here is a TCP acceptor. Derive your own objects from it, and override the accepted() method to accept client sockets. Creating a similar object to deal with the client sockets themselves is left as an exercise for the reader (don't forget the importance of only using sysread :-)
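A sketch of the idea, not the original listing. The class names Acceptor and Greeter, the readable(), handle() and port() methods, and the constructor arguments are all assumptions; only the accepted() hook comes from the description above. To stay self-contained, the example drives select directly instead of going through a multiplexing package.

```perl
package Acceptor;    # hypothetical base class
use strict;
use warnings;
use IO::Socket::INET;

sub new {
    my ($class, %args) = @_;
    my $listener = IO::Socket::INET->new(
        LocalAddr => $args{addr} // '127.0.0.1',
        LocalPort => $args{port} // 0,    # 0 lets the OS pick a free port
        Proto     => 'tcp',
        Listen    => 5,
        ReuseAddr => 1,
    ) or die "cannot listen: $@";
    return bless { listener => $listener }, $class;
}

sub handle { $_[0]{listener} }              # the filehandle to pass to select
sub port   { $_[0]{listener}->sockport }

# Call this when select reports the listening socket as readable.
sub readable {
    my ($self) = @_;
    my $client = $self->{listener}->accept or return;
    $self->accepted($client);
}

# Default behaviour: drop the connection. Subclasses override this.
sub accepted {
    my ($self, $client) = @_;
    close($client);
}

package Greeter;     # example subclass overriding accepted()
our @ISA = ('Acceptor');

sub accepted {
    my ($self, $client) = @_;
    syswrite($client, "hello\n");    # unbuffered write, as recommended above
    close($client);
}

package main;

my $acceptor = Greeter->new;

# Connect to ourselves over loopback so there is something to accept.
my $conn = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $acceptor->port,
    Proto    => 'tcp',
) or die "cannot connect: $@";

# A pending connection makes the listening socket readable.
my $read = '';
vec($read, fileno($acceptor->handle), 1) = 1;
my $nfound = select($read, undef, undef, 5);
die "no connection arrived" unless $nfound > 0;
$acceptor->readable if vec($read, fileno($acceptor->handle), 1);

sysread($conn, my $greeting, 4096);
```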
For completeness, here is the Perl file I used to test these two modules: