PerlMonks  

Chat server impossible with Perl?

by bronto (Priest)
on Feb 04, 2005 at 11:33 UTC
bronto has asked for the wisdom of the Perl Monks concerning the following question:

Dear brothers and sisters

Building on my few experiences with threads so far, I wanted to write some notes and publish them, but I also wanted to move a little forward. I discussed this with a friend of mine, who is a Java and J2ME programmer, and he'd like to help me by building a sort of client for J2ME.

Unfortunately (or luckily, as always :-) in J2ME you can't create a server socket, so the principles at the base of my simple chat client of having an application that was both an http server and client couldn't apply. Therefore, we needed a chat server and a protocol; the J2ME chat client could then connect to the chat server and keep the connection open, and all chats would flow in and out from/to that connection.

We designed a small, simple protocol and he created a server using Java. The server does the following:

  • waits for incoming connection from clients
  • when a client connects and a user registers himself, it associates the user name with the client socket that was just opened
  • when a user wants to send a message to another user, it sends it to the server asking to deliver it; the server looks up the right socket by means of the user name and throws the message into it for the other client to read
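A minimal sketch of the lookup-and-deliver step described above, assuming a plain hash keyed by user name. The names `%socket_for`, `register_user` and `deliver`, and the `MSG <from> <text>` wire format, are hypothetical and not part of the protocol we designed; a `socketpair` stands in for a real client connection so the example runs on its own.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;

my %socket_for;    # user name => socket handle (hypothetical registry)

# Remember which socket belongs to a user once the user has registered.
sub register_user {
    my ($name, $sock) = @_;
    $socket_for{$name} = $sock;
    return;
}

# Look up the recipient's socket and write the message into it.
# Returns 1 on delivery, 0 if the recipient is not registered.
sub deliver {
    my ($from, $to, $text) = @_;
    my $sock = $socket_for{$to} or return 0;
    print {$sock} "MSG $from $text\n";    # assumed wire format
    return 1;
}

# Demo: a socketpair plays the role of the TCP connection to user "bob".
socketpair(my $server_end, my $client_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
$server_end->autoflush(1);

register_user(bob => $server_end);
deliver('alice', 'bob', 'hello');
my $line = <$client_end>;    # what bob's client would read
print $line;
```

This only works if one process holds every socket, which is exactly the constraint the rest of this question is about.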

I was trying to design the same server in Perl, but I am stuck.

I tried to design it using forks, but I encountered a couple of problems:

  • the server sits and waits for incoming connections; when a new connection is made, it forks a new child to handle the new connection.
    Problem: we cannot associate the socket to the user until the user itself registers; but if it registers with the child process, how could the child inform the parent about the user name?
  • when user A wants to send to user B, it does it through the client socket it previously opened to register, and the server should deliver its message to B.
    Problem: the server should stop accepting and cycle over all its client sockets to see if they have any message to be delivered; if we time out the accept call to do this and the list of clients is long, we could lose incoming calls. We could well use a thread here, but relying on accept timeouts would force me to check for new messages with a granularity of seconds, while I would like to check for new messages to deliver almost continuously (say: cycle over client sockets, then sleep a couple of tenths of a second, then restart).
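On the granularity worry: the timeout of 4-argument `select`, and of `IO::Select`'s `can_read`, may be a fractional number of seconds, so waiting on the listening socket does not have to be done in whole seconds. A small sketch, assuming a listener on an ephemeral loopback port (the 0.2 second figure is just an illustration):

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;
use Time::HiRes qw(time);

# Listening socket on a port chosen by the OS (LocalPort => 0).
my $listener = IO::Socket::INET->new(
    Listen    => 5,
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    ReuseAddr => 1,
) or die "listen: $!";

my $watch = IO::Select->new($listener);

# Wait at most two tenths of a second for an incoming connection.
my $start   = time;
my @ready   = $watch->can_read(0.2);
my $elapsed = time - $start;

printf "ready handles: %d, waited %.2f seconds\n", scalar @ready, $elapsed;
```

Nobody connects in this demo, so the call simply returns an empty list after the timeout; in a server the same call would also watch the client sockets.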

Then I thought about a fully threaded approach, but again I hit the limitation that, in my opinion, makes perl threads more similar to a toy than a tool: the impossibility to share objects, especially if they are of the filehandle kind... This makes it impossible to have a hash of (ID => SOCKET) shared between the thread that manages the incoming connections and another thread that manages the delivery of user messages, which should run in parallel with it.

My reserve of fantasy, logic and imagination is over for this week. Does anyone of you have a better idea and can try to help me?

Thanks in advance

Ciao!
--bronto


In theory, there is no difference between theory and practice. In practice, there is.

Re: Chat server impossible with Perl?
by Frantz (Monk) on Feb 04, 2005 at 11:41 UTC
    Perhaps the source code of the CGI:IRC script will help you ...

    CGI:IRC
Re: Chat server impossible with Perl?
by merlyn (Sage) on Feb 04, 2005 at 12:10 UTC
    Well, with the help of Apache, I created the poor man's web chat. And there's also a standalone web version for the doctor, which you could mutate into a chat server pretty easily.

    But if you want concurrency, look at POE. Pretty easy to "appear" to do things simultaneously there, sharing the data structures needed. In fact, there's already an IRC server framework there, so you could use standard IRC clients with your "server".

    If you insist on coding from scratch, you can have your child processes communicate via some lightweight database, like DBM::Deep or DBD::SQLite. It'd be pretty simple to write a classic "fork on accept" server that then uses the database to keep in touch with other children to share the chat info. Probably have to use a timed-out read loop to go check the database from time to time.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      ++merlyn for poor man's web chat

      One bug: there's a ->epath where it should be ->path

Re: Chat server impossible with Perl?
by msemtd (Scribe) on Feb 04, 2005 at 13:59 UTC
    I had a similar requirement a year or so ago that I fulfilled with POE: we had a huge distributed EJB system that had constant, difficult to diagnose TCP/IP problems and just would not scale well. I replaced the vile binary JMS and EJB communications with a simple ASCII protocol (that was human-readable), turned the server into a client (in TCP/IP terms) and in the middle, put a tiny ASCII chat server, hardly modified from POE cookbook chat server (http://poe.perl.org/?POE_Cookbook/Chat_Server). The communications were just like a chat room system with individual client systems joining the conversations or "rooms" they required as data feeds. As far as I know it's still running today!

    --
    map & grep are our friends
Re: Chat server impossible with Perl?
by jodrell (Acolyte) on Feb 04, 2005 at 14:34 UTC
    What you need is a multiplexing server that runs in a single process. You should find that Net::Server::Multiplex does what you want. I've written such a server before but haven't got the code handy to post here.
      The Net::Server set of modules is very nice. We use Net::Server::Multiplex to handle a thousand or so active chat clients. The interface is quite nice. Another nice package is Event::Lib. The use of kqueue/epoll is nice to have when large numbers of clients are connected.
Re: Chat server impossible with Perl?
by kscaldef (Pilgrim) on Feb 04, 2005 at 17:34 UTC

    I would second the above comment. A single process, multiplexing server is going to be the quickest and simplest road to your solution. You can use Net::Server::Multiplex if you want to reuse someone else's solution to get up and running quickly.

    However, if you want to learn how to do this yourself, you should get the Cookbook and read Chapter 17, and particularly recipe 17.13.

Re: Chat server impossible with Perl?
by zakzebrowski (Curate) on Feb 04, 2005 at 18:57 UTC
    You're describing the /msg part of IRC. See pircd for a Perl implementation of an IRC server, and (as previous people have mentioned) Net::IRC and the POE:: modules...


    ----
    Zak - the office
Re: Chat server impossible with Perl?
by sth (Priest) on Feb 04, 2005 at 19:13 UTC

    Besides the Cookbook, mentioned above, I would get a copy of Network Programming with Perl. It provides many examples of writing a chat client. Even if you don't use one of them, you will get a good understanding of different methods for writing one, e.g. using threads, IO::Poll, pre-forking sockets, etc.

Re: Chat server impossible with Perl?
by BrowserUk (Pope) on Feb 04, 2005 at 20:36 UTC
    makes perl threads more similar to a toy than a tool: the impossibility to share objects, especially if they are of the filehandle kind...

    With caveats--mostly surmountable--it is possible to share globs between threads. But it is a bit messy and awkward, which is why I have (mostly) avoided discussing the method publicly.

    I think that this is an unnecessary restriction of threads::shared that could and should be lifted. At the basic level filehandles are a per process resource and are accessible from every thread. The current restriction is artificially imposed and can be bypassed. What is required is a documented and tested mechanism for doing shared access--rather than the current situation whereby it can only be done using obscure coding tricks.

    There are two things that prevent me from attempting to get the restriction lifted. The first is my lack of internals skills. The second is that whatever I develop will only be tried and tested on Win32. Without addressing these two areas, the best I could hope to do is offer suggestions for change which is a lot less likely to get anywhere than a "tested on XX and YY" patch.

    All of that said, it is totally possible, and even quite easy to write a simple, complete and efficient chat server using threads. The first thing you have to do is throw away the "fork and exec" approach to the problem, and view it from the perspective of "How do I use threads to listen for new connections whilst maintaining responsiveness to existing ones".

    Once you shed the shackles of preconditioned thinking, it becomes relatively easy :)


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
      How do I use threads to listen for new connections whilst maintaining responsiveness to existing ones

      You may not need threads to do that. Although perlipc and perlfunc don't mention it, you can run select on a generic socket, at least on Unix. In more detail:

      You open perlipc, find the section labelled 'Internet TCP Clients and Servers', and use the recipe to create a generic server socket: socket(), setsockopt(), bind(), listen(). You now build a bitmask and go into a select loop. (If you've not used select before, perlfunc explains how. select is a way of sleeping until there's some activity on one or more of any number of connections.) When select says there's something ready for your generic socket to read, it means there's an incoming connection attempt: so you accept the connection, flip a bit in your bitmask to represent the new connection, and go back to the top of your select loop. You're now in an interesting position: select will return when you're ready either to accept a new connection or to read data from an existing connection. So, if you can service an incoming packet quickly enough, a single fibre is all you need.
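A minimal sketch of one pass of such a select loop, assuming `IO::Socket::INET` for brevity instead of the raw socket()/bind()/listen() calls from perlipc. The helper name `poll_once` and the `%client` table are hypothetical. It uses `sysread` rather than `readline`, because buffered reads and `select` do not mix well. The demo connects to itself over loopback so it terminates on its own.

```perl
use strict;
use warnings;
use IO::Socket::INET;

my %client;    # fileno => socket, one entry per accepted connection

# One pass of the loop: wait up to $timeout seconds, accept a pending
# connection if there is one, and return [fileno, bytes] pairs for every
# existing client that had data ready.
sub poll_once {
    my ($listener, $timeout) = @_;

    # Build the bitmask: one bit for the listener, one per known client.
    my $rin = '';
    vec($rin, fileno($listener), 1) = 1;
    vec($rin, $_, 1) = 1 for keys %client;

    my $rout   = $rin;                                   # select modifies its argument
    my $nfound = select($rout, undef, undef, $timeout);
    return () unless defined $nfound && $nfound > 0;

    my @events;
    my @readable = grep { vec($rout, $_, 1) } keys %client;

    # Listener readable means a connection attempt is waiting.
    if (vec($rout, fileno($listener), 1)) {
        my $sock = $listener->accept;
        $client{ fileno $sock } = $sock if $sock;
    }

    for my $fd (@readable) {
        my $n = sysread($client{$fd}, my $data, 4096);
        if ($n) {
            push @events, [$fd, $data];
        }
        else {                                           # EOF or error: forget the client
            my $gone = delete $client{$fd};
            close $gone;
        }
    }
    return @events;
}

# Demo on an OS-chosen loopback port.
my $listener = IO::Socket::INET->new(
    Listen => 5, LocalAddr => '127.0.0.1', LocalPort => 0,
    Proto  => 'tcp', ReuseAddr => 1,
) or die "listen: $!";
my $peer = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1', PeerPort => $listener->sockport, Proto => 'tcp',
) or die "connect: $!";

poll_once($listener, 1);                   # first pass accepts the connection
$peer->syswrite("hello\n");
my @events = poll_once($listener, 1);      # second pass picks up the data
print "got: $events[0][1]";
```

A real server would call `poll_once($listener, undef)` inside `while (1)` and route each chunk of data to the right recipient.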

      That said, I'd be fascinated to know how to share globs between Perl threads. I threw away a day's work earlier this week because Perl told me it couldn't be done and (as this was work time) I decided not to spend time trying to hack it. At the very least, this restriction should be documented in perlthrtut and/or threads::shared.

      Markus

        I'd be fascinated to know how to share globs between Perl threads.

        Supersearch for a thread by me with a title including "threads" and "globs" for my initial experiments in doing this. Basically, you need to pass a "handle" to a glob through a shared variable in such a way that threads::shared doesn't stick its nose in and reject you. Be warned: The technique I used there has problems.

        I've a couple of other ways of doing it that I am experimenting with, but I would rather keep them quiet till I've proven to myself that they can be used reliably.

        Perl threads have an undeservedly bad rep as it is, without me causing more problems sharing speculative ideas without checking them out first. (Which is why I'm not linking to the post in question--If you want to try it, you're gonna have to do a little work:)


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.
Re: Chat server impossible with Perl?
by rlb3 (Deacon) on Feb 04, 2005 at 22:46 UTC
    Not that it's better than anything else the others have been
    talking about, but here is something I posted a while back.
    Small chat server...

    rlb3
Re: Chat server impossible with Perl?
by beauregard (Monk) on Feb 05, 2005 at 00:40 UTC
    The following is a description of a protocol I designed for a job candidate examination:
       Receives lines from a client:
          iam <id>
          need-done <args>
       In response to a need-done, sends to a client that has given
       an "iam" message:
          dothis <args>
       Server ignores any other lines
    
    It's basically a trivial job dispatch protocol where clients register to do work with a central server and are also able to request jobs. Designwise, it's pretty much all that the described chat server would do.

    The exam question asked them to describe the protocol given the following code and then identify at least three security holes (with proposed attacks) in the protocol as well as explain how you'd fix the security problems. Which is why it was written so badly. Sorry...

    use Socket;
    socket S, PF_INET, SOCK_STREAM, getprotobyname('tcp') or die "$!";
    setsockopt S, &SOL_SOCKET, &SO_REUSEADDR, 1 or die "$!";
    bind S, sockaddr_in( 23456, INADDR_ANY ) or die "$!";
    listen S, 5 or die "$!";
    $SIG{__DIE__} = sub { close S; };
    $rin = "";
    vec( $rin, fileno(S), 1 ) = 1;
    while( 1 ) {
        next unless select( $rout = $rin, undef, undef, undef ) > 0;
        if( vec( $rout, fileno(S), 1 ) ) {
            local *H;
            $p = accept( H, S );
            select( (select(H), $| = 1)[0]);
            push @known, *H;
            vec( $rin, fileno(H), 1 ) = 1;
        }
        foreach my $fd (@known) {
            if( vec( $rout, fileno($fd), 1 ) ) {
                $s = readline $fd;
                push @able, $fd if $s =~ /^iam\s+(.+)\s*$/io;
                if( @able and $s =~ /^need-done\s+(.+)\s*$/io ) {
                    print { $able[(unpack "%32C*",$1) % @able] } "dothis ", $1, "\n";
                }
            }
        }
    }
    For some reason, only one candidate even attempted the question.

    That aside, it's certainly possible to write a chat server in perl. It's not threaded, but I don't see why you'd really want a threaded server when so much state has to be shared between all the processes and the overhead of serving requests is so low.

    c.

      It looks like there are many opportunities for denial of service attacks:
      • Because accepted connections are never closed, one attack would be to connect and disconnect repeatedly, until your server has used up all of its file descriptors.
      • Another attack would be to connect and send a packet not terminated by an end-of-line marker and then hold the connection open, causing readline in the server to block, thus preventing the server and its horde of worker bees from doing anything else.
      • Another would be to connect with a client that never reads from its end of the connection. Eventually, the server would block while trying to send the misbehaving client a message. The client could speed the process by sending repeated requests with incrementing need-done payloads to cause the checksum-based dispatcher to sweep across all of the "able" file descriptors, guaranteeing that the server would send a request to the misbehaving client. (The client could grab more "able" slots for itself by issuing lots of "iam" messages, but this isn't really necessary.)
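To illustrate the sweep in the last point: `unpack "%32C*"` yields a 32-bit sum of the byte values of the payload, and the server takes that sum modulo the number of "able" handles. A small sketch, where `slot_for` is a hypothetical helper and the worker names stand in for sockets:

```perl
use strict;
use warnings;

my @able = qw(worker0 worker1 worker2);    # stand-ins for registered sockets

# Same arithmetic as the exam code: byte-sum checksum modulo slot count.
sub slot_for {
    my ($payload, $slots) = @_;
    return unpack("%32C*", $payload) % $slots;
}

# Payloads that differ only in an incrementing last character land on
# consecutive slots, so a handful of requests reaches every worker.
my @slots = map { slot_for("job$_", scalar @able) } 0 .. 2;
print "job$_ -> $able[ $slots[$_] ]\n" for 0 .. 2;
```

The byte sum of "job0" is 363, so with three slots the three payloads map to slots 0, 1 and 2 in turn.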
      Were these the kinds of things you had in mind?
        It looks like there are many opportunities for denial of service attacks

        Heaping piles of 'em, yeah, and your attacks are correct but you lose marks on 1 and 2 for not describing any fixes.

        Here's some of the "guideline" answers I wrote up at the time (I work for the Canadian government... they want a fairly comprehensive list of expected answers submitted with the questions so there's no "dirty pool" during marking).

        • blocking I/O. don't block waiting for complete requests from clients or you have a DoS. Fix with either handler threads with timeouts, just timeouts, or select/read into multiple buffers.
        • no verification of clients. Need something like a host list. Shared secrets, crypto, etc won't work because of the backwards compatibility constraint.
        • no checking of scripts. Need to ensure that things like "unlink /etc/passwd" or embedded Perl commands aren't sent. Server sandboxing (chroot), some kind of safe/taint mode, etc would be needed. In this case, sanitizing the scripts is critical but might not actually be feasible. Some might naively think this is a client problem, but we specifically asked for fixes that would be backwards compatible with older clients so it has to be fixed in the server. In this case, we never actually identified what these scripts do or how they work other than they have to be on one line. 1 of 4 points is for mentioning this as a problem in providing a complete answer.
        • no checking the length of the input. This provides the opportunity for a resource-based DoS. The answer needs to explain that there has to be a limit to request length. 1 of the 4 points require mentioning that there's no way to determine what that limit actually is without a concrete client implementation.
        • lack of error checking in the implementation means that when a client goes away, we don't know and continue to send "dothis" messages to it. Enough of those clients will cause a resource-based DoS (file handles) as well as polluting the list of hosts enough that we couldn't process requests. Server needs to see when things go away and remove hosts from the list.
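A rough sketch of the "read into multiple buffers" fix combined with a length limit, assuming the server keeps one buffer per client and feeds it whatever `sysread` returned. The helper `take_lines` and the 512-byte `MAX_LINE` are assumptions for illustration; as noted above, the real limit cannot be chosen without a concrete client.

```perl
use strict;
use warnings;

use constant MAX_LINE => 512;    # assumed limit, not derived from any real client

# Append freshly read bytes to a client's buffer and return a reference to
# the list of complete lines (without the newline). Returns undef in scalar
# context when a line, finished or not, exceeds MAX_LINE, so the caller can
# drop that client.
sub take_lines {
    my ($buffer_ref, $chunk) = @_;
    $$buffer_ref .= $chunk;

    my @lines;
    while ($$buffer_ref =~ s/\A([^\n]*)\n//) {
        return if length($1) > MAX_LINE;
        push @lines, $1;
    }
    return if length($$buffer_ref) > MAX_LINE;    # unterminated and too long
    return \@lines;
}

# Demo: a request arriving in pieces, then an oversized unterminated one.
my $buf    = '';
my $first  = take_lines(\$buf, "iam wor");        # no complete line yet
my $second = take_lines(\$buf, "ker1\nneed-");    # completes "iam worker1"
my $third  = take_lines(\$buf, 'x' x 600);        # exceeds the limit

print "complete lines so far: @$second\n";
print "third chunk rejected\n" unless defined $third;
```

Because the server never waits for a newline, a client that sends half a line and stalls only fills its own buffer instead of blocking everyone else.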

        Hmmm... forgot to mention the backwards compatibility constraint. Any proposed "fix" was supposed to be backwards compatible with the original protocol. This does mean that some problems could never really be fixed, but... Well, the question was supposed to reflect some work we do regularly... take a really old and badly (un)documented piece of code, understand (and document) how it works, and fix/improve it without breaking the entire infrastructure.

        c.

        The client could grab more "able" slots for itself by issuing lots of "iam" messages, but this isn't really necessary.

        Neat, I never noticed that attack. It's subtly different from a file descriptor DoS but probably nastier because there's no external limits on how much resources can be used.

        c.
