perrin has asked for the wisdom of the Perl Monks concerning the following question:
I'm going to talking at YAPC in a couple of weeks about techniques for website scalability. One of the things I'll be discussing is the use of job queuing to handle large numbers of requests for a limited resource. My main example will be what Ticketmaster does on their site. However, the Ticketmaster code for this is tightly coupled to their backend systems and is thus not open source. I want to give people some open source examples to look at, so I'm wondering what others have been using.
Ideally I am looking for something with these characteristics:
- works across a cluster of machines
- relatively easy to administer
- provides control over how many jobs are running at once
- allows asynchronous calls, i.e. a client sends in a request and then can check on the status and eventually the results later on
Here are the ones that I've turned up. Any suggestions of better ones, or feedback on the effectiveness of these tools would be useful.
- Spread::Queue
- This is the one that currently seems best to me. It uses the Spread toolkit to handle the communications between clients, a queue manager process, and worker processes.
- Parallel::PVM
- PVM seems like a kind of old-school system built for running scientific applications on clusters. It's looking a little long in the tooth, but I have not tried it yet.
- Parallel::MPI and Parallel::MPI::Simple
- MPI is basically a newer replacement for PVM, but still more oriented towards distributed processing for science apps.
- IBM's grid stuff
- I need to look into this one more. It seems to use some more modern protocols (web services) but other than that it's unclear what advantage this might have over PVM and MPI.
Re: what do you use for job queuing?
by eduardo (Curate) on Jun 04, 2004 at 16:30 UTC
|
I've used a sendmail / IMAP setup for this before. It's an excellent way of providing a persistent distributed routable queuing architecture, with a well known and documented API for handling complex messaging. A lot of people give you quizzical looks when you explain to them how to set this sort of thing up, but then again, it's how "job queuing" is done if the endpoints are people, so why wouldn't it scale splendidly to programs? Assign different processes different email addresses, write a procmail file writer to help you do load balancing if need be accross a cluster Built right in you get point to point and message list distribution, even request receipts and complex attachment hierarchies if you so chose :) It's my favorite solution to this particular problem. | [reply] |
|
I agree that e-mail offers some attractive features in terms of queueing messages and handling temporary delivery failures robustly, and I have heard people talk about doing it this way before. I would say it mostly just solves the message queue issue though.
It might have issues with frequent polling, just like a database-backed queue would, although I suppose an e-mail server is built to do that well. It would also be difficult to get the current status of a job, but maybe "received" is good enough for most cases.
I think you would end up needing to build some extra infrastructure to allow a client to check for the response to a specific job, or is there a simple way to do that with IMAP queries and subject lines or something? Also, how would you handle distributing messages across a set of worker processes? Would you use e-mail for that too, sending messages on from a dispatcher to specific workers with unique e-mail addresses?
| [reply] |
Re: what do you use for job queuing?
by Itatsumaki (Friar) on Jun 04, 2004 at 16:03 UTC
|
If I understand the topic correctly, you might want to consider the "trivial" solution of setting up a dispatch table in a database and running monitoring scripts on each machine in the cluster. Each monitor-script would check for new work to process in the database, and update the dispatch table with "in-progress" and "completed" flags as need be. Of course this only works when the individual pieces can easily be broken apart, but I think this is often true for web-applications.
-Tats
| [reply] |
|
| [reply] |
Re: what do you use for job queuing?
by monkey_boy (Priest) on Jun 04, 2004 at 16:27 UTC
|
| [reply] |
Re: what do you use for job queuing?
by exussum0 (Vicar) on Jun 04, 2004 at 16:45 UTC
|
MPI is newer, but I'm not sure how widley used it is compared to other technologies today. Further more, it's more a definition and API to transmitting data just as DOM and XML are...
Or so I understand it.
Spread, as I've seen it, does synchronization of messages on top of communicating them, which is great to talk about.
Maybe you can discuss the various patterns used as well involved in such a system. There are many distributed programming texts about. Maybe implement something ad hoc demonstrating the use of priority queues, mutex locking and the likes?
--
Bart: God, Schmod. I want my monkey-man.
| [reply] |
|
I have limited time for this talk (about 40 minutes) and this isn't my only subject, so I can't go into a really lengthy discussion about it. However, if I do end up writing one, I will probably try to lean on a networked database for locking and general concurrency issues. I was trying to think of ways to avoid the incessant polling that could happen with a central-server approach like that (every worker process checking for new jobs, etc.). Let me know if you have any ideas about that. It may not really be worth worrying about, since these would be pretty simple queries, but I remember how much the Oracle DBAs at one place where I used to work complained about ATG Dynamo's RDBMS-based implementation of JMS. It would hit the database constantly.
| [reply] |
Re: what do you use for job queuing?
by merlyn (Sage) on Jun 04, 2004 at 17:42 UTC
|
| [reply] |
Re: what do you use for job queuing?
by Anonymous Monk on Jun 04, 2004 at 21:35 UTC
|
This is something of a 'concept' reply but I hope it helps you clarify
your choice parameters. I also have a 'remote' :) interest in this topic.
The implementation of parallel virtual machines and more pedestrian remote job control
share a lot in common, but they are different inhabitants of the same house. PVM type
message passing is the key to efficient distributed processing and beowulf clustering
where your code is not localised. The engine at the heart of which is really no more than
a scheduler with the added sense to know when it is cost effective to spawn a sub-process
to a new node, or compute it locally.
Fault tolerance, generally quick but flexible handling of delivery time, are desirable
for PVM construction and ability to tunnel ssh or other vpn less so.
The granularity of control in PVMs is fine. Clocks are synchronised and packets sent
are many and small. The whole cluster has become a giant fault tolerant single processor.
Processes that remotely fork pass code as well as data amongst the nodes.
Many scientific algorithms have been optimised for such execution and incorporate the
necessary forking cues.
Job control is less complicated but has its own issues. For the most part you are only
interested in reliable message passing functions, forking procedures and harvesting their
results are not part of the plan. Code is local and specific to the nodes, which often
perform a single well defined task. The message passing is to remotely execute procedures
which already live with their data, so authentication and reliable accounting of the remote
machine state are desirable. This is really distributed control, and because of the timescales
involved, and the high end-to-end reliability you can build atop even email and passing messages
via a pop box is perfectly practical and usable., but monitoring scripts need to operate on a finer
timescale.
Grid computing, which is what IBMs take is about, falls somewhere in the middle, where you are
not building a parallel supercomputer, and you need more than remote job execution.
The Martin Brown (IBM) article is ok imho, it mentions using POE, which I had not
considered, but also SOAP and OGSI frameworks. You could do all this with sockets, but it could
become ugly (as I can attest), and since I am not a user of any of those libraries I can't speak
from experience. My suggestion would be to further study SOAP and OGSI protocols and see
how they help you with data queuing problem. Grid computing takes the desirable feature of PVMs
which is transparrent replication (shrinking and growing of the process pool), and this solves
your requirement of easy admin. You set up clusters once. Machines may come, machines may go, but
the like the axe that has had 8 new handles and 8 new heads, its still the same axe (cluster).
As to your problem. I guessed it had something to do with music before I looked at the ticket site :)
Concert ticketing falls right in there with electronic voting and the stock exchange. You are designing
a system to cope with a one off transient peak in demand. What you need is a spike handler.
Just sticking the requests in a queue is nasty, your users have no reliable way of knowing
availability, and if they don't know their queue position they have no way to know if their purchase
will be honoured. The economics atm are favorable for a solution like you (perhaps unwittingly)
are heading for. Many companies would naively
dupe the site across many boxes rented for a very small amount of time around the product launch
and let Apache mods load balance the spike out. However this is always going to be an order of
magnitude more expensive than an ISP who offers clustering and can take a kiloslash (1000 times
the power of a slashdotting:) and stand up. The actual peak is remarkably short in time.
I think an on demand replication system based on PVM principles
would look nice so I think you are on the right track. Check out the modperl list archives too
I've a feeling there was a thing, Stas or Randal wiuld know, about this in the past, there may well
be an Apache mod-perl solution to just this problem, but event driven from the demand (web request) end.
If there isn't I guess you're writing it :)
Best of Luck
Andy | [reply] |
Re: what do you use for job queuing?
by andyf (Pilgrim) on Jun 07, 2004 at 06:32 UTC
|
Perrin, a penny just dropped, are you aka Perrin Harkins? If so I hope you found it funny rather than insulting that I refer you to the mod perl list. :) | [reply] |
|
| [reply] |
|
|