Re^9: Thread terminating abnormally COND

in reply to Re^8: Thread terminating abnormally COND_SIGNAL(6)
in thread Thread terminating abnormally COND_SIGNAL(6)

Now lets say the 1st command finishes, and updates the database, incrementing its session counter

(This is just me thinking how I would avoid your undefined jobnode issue. If you're happy with your current solution, stick with it.:)

Your current mechanism uses a single queue for all resources; and has to constantly poll each of the pending jobs and then check the required resource to see if it is available. This puts a very expensive 'busy loop' at the heart of your scheduler.

If you have a lot of jobs queued for popular (or slow; or both) resources at the head of your queue; you are going to be constantly polling over those pending jobs and querying the DB for the resource status; in order to get to the newer pending requests for less popular/higher throughput resources. In other words, the slower, more popular resources become a bottleneck in front of all your other faster, more transient ones.

That smacks of a scaling issue being designed into the very heart of your processing.

I would tackle the queuing for resources in a quite different way:

Jobs get queued to resource specific queues.
Implementation: A hash of queues keyed by resource name or id..
When a job is enqueued, it is inspected and is added to the appropriate queue.
You have another 'jobDone' queue. When a job is completed, its jobid/resourceid pair is queued to this queue.
And the heart of your scheduler is a thread reading (dequeuing, not polling), that jobDone queue.
As it pulls each jobid/resourceid off that queue, it:
1. Updates the job status in %nodes.
2. Updates the DB resource status (if still necessary).
3. Takes the next pending job off the queue associated with the now freed resource and puts it on the work queue.
4. Goes back to dequeue the next done job; blocking if there's nothing there.

My understanding was that when you enqueue something into the Thread::Queue, you get shared_clone of that object and thus cannot make direct modifications to the object. The hash is simply the means for returning information to the jobNode in the originating thread.

That is true, and I can see how this is affecting your design decisions. But not for the good.

The fact that when you send an object (node) via a queue means you get a copy means that you now require a second queue to send the (modified) node back to somewhere else so that it can read the modified copies information and update the original. This gets complicated and expensive. And is completely unnecessary!

You have your shared %nodes Every thread can access that structure directly. So don't queue nodes; queue node ids.

When the receiver dequeues, instead of getting an unshared clone of the node, it just gets an ID string, that it uses to directly access the shared %nodes, to read information and update directly.

Now you have no need for the return queue (nor anything to read it!). All your queue messages become lighter and faster; and there is one central, definitive, always up-to-date copy of the complex, structured information.

This next bit is speculative and would require careful thought and analysis; but if your architecture lends itself, you may even avoid the need to do locking on %nodes.

If you can arrange that each jobid/resourceid pair token, (a string), is (can only) be created once, then as it gets passed around, only one thread at a time can ever hold it, so there is no need to lock.

I can almost feel your frustration as you read this and are thinking: "But I'd have to re-write the whole damn thing"! If it doesn't work for you, just don't do it! You don't even have to tell me :)

But consider the ideas, because done right, you'd have a light, fast, self-synchronising, scalable process with no bottlenecks and no cpu-sapping busy-loops.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

When a job is enqueued, it is inspected and is added to the appropriate queue.

In Section Seekers of Perl Wisdom