in reply to Re^4: Help designing a threaded service in thread Help designing a threaded service
I sheepishly take it from your reply that I can't have a multi-threaded listener setup the way I see multiple forks of a fork-based network server taking connections on one port.
No. It is entirely feasible to do. It's just a completely ridiculous way to design a server.
With forks (*nix), when you have multiple processes all waiting to accept on a shared socket, when a client connects, *every* listening process receives the connect.
It is obviously a nonsense to have multiple server processes attempting to conduct concurrent communications with a single client, so now those multiple server processes need to arbitrate between themselves in order to decide which of them will pick up the phone.
Envisage an office with a shared extension and a dozen bored people all shouting "I'll get it!" at the top of their voices ... or all of them pretending not to hear it hoping someone else will be bugged out by the constant ringing and answer it before they weaken and do so.
In *nix server land the solution is for all of the listening processes to fight over acquiring a global mutex. One wins; and the others go back to whatever they were doing before they were so rudely interrupted.
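A minimal sketch of that arbitration, using flock on a temporary file as the "global mutex" (the lock file and worker count are made up for the demo; a real server would contend for the lock around each accept, not just once):

```perl
#!/usr/bin/perl
# Sketch: every worker "wakes up", but only the one that grabs the
# global mutex (here: a non-blocking flock on a temp file) answers.
use strict;
use warnings;
use Fcntl qw( :flock );
use File::Temp qw( tempfile );

my ( $tmp, $lockfile ) = tempfile();
close $tmp;

for my $worker ( 1 .. 3 ) {
    defined( my $pid = fork ) or die "fork: $!";
    unless ( $pid ) {
        open my $lock, '<', $lockfile or die "open: $!";
        if ( flock $lock, LOCK_EX | LOCK_NB ) {
            print "worker $$: I'll get it!\n";
            sleep 2;    # hold the lock while "handling the call"
            exit 0;     # exit status 0 marks the winner
        }
        print "worker $$: back to what I was doing\n";
        exit 1;         # non-zero marks a loser
    }
}

# Parent: reap the children and count how many actually "answered".
my $winners = 0;
for ( 1 .. 3 ) {
    wait;
    $winners++ if $? == 0;
}
unlink $lockfile;
print "winners: $winners\n";
```

Because the winner holds the lock while it works, the other workers' non-blocking lock attempts fail and they return to their loops; exactly one process handles the event.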
Of course, in your scenario, if the client calling back to collect his output happens to randomly connect to a different process from the one he connected to when he made his hostname/command request -- an odds-on favorite scenario -- then he's sh*t outta luck; because that process has no way of knowing that one of the other processes is gathering output for this client.
You wanna do things the stupid hard way; have fun...
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^6: Help designing a threaded service
by zwon (Abbot) on Jan 26, 2014 at 15:42 UTC
With forks (*nix), when you have multiple processes all waiting to accept on a shared socket, when a client connects, *every* listening process receives the connect.
That would be horrible, but fortunately it's not true. If multiple processes are waiting for a connection on the same socket, then when a client connects, *only one* listening process accepts the connection. Here's a simple example that demonstrates it:
use 5.010;
use strict;
use warnings;
use IO::Socket::INET;

my $sock = IO::Socket::INET->new( LocalPort => 7777, Listen => 10 )
    or die "Cannot listen on port 7777: $!";

for ( 1 .. 3 ) {
    my $pid = fork;
    unless ( $pid ) {
        # Each child blocks in accept() on the shared listening socket.
        my $cli = $sock->accept;
        say "Process $$ accepted connection from " . $cli->peerport;
        print while <$cli>;
        exit 0;
    }
}
wait for 1 .. 3;    # reap the three children
Try connecting to port 7777 and you will see that only one process accepts the connection. Hence there's no need for any global mutexes.
Hm. The description was based upon the implementation of the nginx server, which states that:
After the main NGINX process reads the configuration file and forks into the
configured number of worker processes, each worker process enters into a
loop where it waits for any events on its respective set of sockets.
Each worker process starts off with just the listening sockets, since there
are no connections available yet. Therefore, the event descriptor set for
each worker process starts off with just the listening sockets.
When a connection arrives on any of the listening sockets (POP3/IMAP/SMTP),
each worker process emerges from its event poll, since each NGINX worker
process inherits the listening socket. Then, each NGINX worker process
will attempt to acquire a global mutex. One of the worker processes will
acquire the lock, whereas the others will go back to their respective event
polling loops.
Meanwhile, the worker process that acquired the global mutex will examine
the triggered events, and will create necessary work queue requests for
each event that was triggered. An event corresponds to a single socket
descriptor from the set of descriptors that the worker was watching for
events from.
*nix isn't my world, so I'll leave it to you and others to decide whether your observations or the implementation of a widely used and well-tested server are correct here.
On at least some versions of some Unix systems, multiple processes waiting on the same socket will cause all of them to be awoken, but only the first one to ask will get the connection or data that triggered them to be awoken. Since nginx is setting up "necessary work queue requests" in order to handle the connection coming in, it is useful for only one process to do that. Though I'm not completely convinced that the nginx authors didn't implement this protection out of misunderstanding rather than real need.
I believe you don't need to worry about this implementation detail, at least in most cases.
My vague memory of one report of this "every process wakes up" "problem" was just noting the wasted resources and that only one of the waiting processes would return from select(2) (or equivalent). I certainly don't expect more than one process to actually return from accept() when many of them are blocked inside an accept() call.
Re^6: Help designing a threaded service
by Tommy (Chaplain) on Jan 26, 2014 at 20:08 UTC
You wanna do things the stupid hard way; have fun...
It's actually out of a desire to avoid doing things the stupid way that I posted my question to the Monastery in the first place. An abundance of problems have been pointed out by several people -- problems that are already solved by existing "wheels" on the CPAN that I'd almost certainly be better off to not reinvent.
So my "sane" options are singular: to extend a given stable wheel (Net::Server?) via some sort of data/state sharing mechanism, so that when any given listener is presented with a task ID, it is able to retrieve the output of that task beginning from the last time it was polled. Designing and implementing that will be my biggest challenge. I still want to avoid using a database for this, but may have to fall back on that option.
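The "output since the last poll" bookkeeping can be sketched with two hashes: a per-task output buffer and a per-task cursor recording how much has already been delivered. The helper names (append_output, poll_output) and the task ID are made up; under a threaded server the hashes would be threads::shared and guarded with lock():

```perl
use strict;
use warnings;

my %buffer;   # task ID => accumulated output
my %cursor;   # task ID => offset already delivered to the client

# A worker appends new output for a task as it arrives.
sub append_output {
    my ( $task_id, $chunk ) = @_;
    $buffer{$task_id} .= $chunk;
}

# A listener returns only what arrived since the previous poll.
sub poll_output {
    my ( $task_id ) = @_;
    my $from = $cursor{$task_id} // 0;
    my $new  = substr( $buffer{$task_id} // '', $from );
    $cursor{$task_id} = $from + length $new;
    return $new;
}

append_output( 'task42', "line 1\n" );
print poll_output( 'task42' );    # delivers "line 1\n"
append_output( 'task42', "line 2\n" );
print poll_output( 'task42' );    # delivers only "line 2\n"
```

Keeping one cursor per task (or per client, if several clients may poll the same task) is the whole state-sharing problem in miniature; everything else is transport.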
As for the stupidity, the overhead of constantly polling the service every N seconds is a necessary evil that I see no way to avoid. The aces up my sleeve are that it's not going to have to expand beyond ~20 concurrent users for the foreseeable future, and it's on a very, very fast LAN. I've already personally seen Net::Server scale well past that kind of load on lesser networks.
I appreciate your insight BrowserUk, and that of all others who have joined in on the conversation.
Tommy
A mistake can be valuable or costly, depending on how faithfully you pursue correction
So my "sane" options are singular: to extend a given stable wheel (Net::Server?) via some sort of data/state sharing mechanism so that when any given listener is presented with a task ID, that it is able to retrieve the output of that task beginning from the last time it was polled.
Problem is, you've bought into the fallacy that the behemoth that is Net::Server is going to solve some high proportion of your problem. It won't.
The very essence of your project is the problem of routing multiple discrete client connects back to the sources of their data. Handling the connects (the only bit that Net::Server will do for you) is the easy part. Plumbing the appropriate clients to their appropriate data sources is a problem that Net::Server has no solution for. And using a forking solution for that means you are going to be forced into a multiplexing nightmare.
Which is just silly when -- by your own statement -- you only need ~20 clients, which takes just 36MB for 21 threads:
perl -Mthreads=stack_size,4096 -MIO::Socket -E"async{ IO::Socket::INET->new( 'localhost:' . ( 12300 + threads->tid ) ); sleep }->detach for 1 .. 20; sleep"
A bit more for some shared memory and all your routing problems become trivial.
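A minimal sketch of that shared-memory routing: worker threads append output into a shared hash keyed by task ID, and whichever thread later receives that task ID from a client looks the output up directly. It assumes an ithreads-enabled perl; the task ID and result string are invented for the demo:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %output :shared;    # task ID => collected output, visible to all threads

# A worker gathers a command's output and files it under its task ID.
sub worker {
    my ( $task_id, $result ) = @_;
    lock %output;                    # serialize access to the shared hash
    $output{$task_id} .= $result;
}

threads->create( \&worker, 'task-1', "uptime: 3 days\n" )->join;

# Any listener thread handed 'task-1' can now route to the output.
{
    lock %output;
    print $output{'task-1'};
}
```

No inter-process plumbing, no multiplexing: the routing table is just a hash that every thread can see, taken under lock().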
C'est la vie :)
Your response is exactly why I want to go with threading: because I can solve all my hard problems via shared memory vars, easily reconnecting a request to its output stream. But I have to face some unhappy facts: I don't know enough about threading to know what's necessary to keep the thread memory consumption from ballooning out of control. I also don't know how to handle corner cases that I've yet to identify with the creation of a highly-available multi-threaded network service. And finally, I don't yet know how to gracefully kill off threads that get "stuck" without resorting to a SIGKILL (which isn't exactly a showstopper, but the other issues are).
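For the "stuck thread" worry, one common pattern (a sketch, not a complete answer) is cooperative cancellation: the worker checks a shared stop flag between units of work and exits cleanly when it is set. This only helps when the thread can reach a checkpoint; a thread truly wedged in a blocking syscall needs other measures. Assumes an ithreads-enabled perl:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $stop :shared = 0;    # set to 1 to ask the worker to finish

my $thr = threads->create( sub {
    my $iterations = 0;
    until ( $stop ) {            # cooperative cancellation point
        $iterations++;           # stand-in for one unit of real work
        threads->yield;
    }
    return $iterations;          # clean return; no SIGKILL involved
} );

{ lock $stop; $stop = 1; }       # request shutdown
my $work_done = $thr->join;      # reap the thread and collect its result
print "worker stopped cleanly after $work_done iterations\n";
```

The same flag can be consulted inside read loops with short timeouts, so a listener thread never blocks indefinitely past a shutdown request.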
The multi-threaded IRC chat bot code that was shared earlier in this discussion is too bare-boned to inspire confidence that extending it could handle the problems I've outlined above. If I venture down this path, I'd need a map and a guide. And frankly the latter and more essential of the two is hard to come by in *nix land where forking is king and threading has a bad reputation for, from what I can tell, all stupid reasons.
Tommy
A mistake can be valuable or costly, depending on how faithfully you pursue correction