PerlMonks  

Re^2: Using TheSchwartz - 2 threads pick the same jobs. Any help ?

by BrowserUk (Patriarch)
on Dec 06, 2011 at 21:08 UTC


in reply to Re: Using TheSchwartz - 2 threads pick the same jobs. Any help ?
in thread Using TheSchwartz - 2 threads pick the same jobs. Any help ?

"thread safety" & "arrays" don't come into it.

The "queue" in this module appears to be a database table. And the "threads" are probably forked processes.

I say "probably" because after 20 minutes of source diving, I'm still not sure. What I can say is that I saw no sign of threading.

I can also say that this is the single most horrendously complex, over-engineered, stupefyingly over-architected module I've yet encountered. That doesn't mean it doesn't work; it might even work very well.

Just that I wouldn't want to be the one responsible for deciding that it has been adequately tested. Or trying to track down bugs.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re^3: Using TheSchwartz - 2 threads pick the same jobs. Any help ?
by spx2 (Deacon) on Dec 07, 2011 at 13:08 UTC

    Well how 'bout locking/unlocking them tables to prevent this from happening ?

    About TheSchwartz being over-*, well.. I actually saw a lot of people using it, and lots of jobs on jobs.perl.org featuring TheSchwartz as a required skill .. so I'd imagine we're missing something from the picture, and that it probably is a good module.

    Equally, one could easily write a distributed job queue using Redis, MongoDB, ZeroMQ, RabbitMQ and many others.

      Locking and unlocking tables is a pretty bad way to prevent race conditions compared to just using transactions. Even a pretend "database" like MySQL can support transactions these days. And even in MySQL without transactions, you can still prevent race conditions by making the assignment of a job a single UPDATE statement:

      UPDATE joblist SET pid = ? WHERE jobid = ? AND pid IS NULL

      (Just as an example and not based on having even glanced at TheSchwartz.) Including "pid IS NULL" in the WHERE clause of the UPDATE is what makes this type of assignment step "atomic".

      - tye        
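A minimal sketch of the single-statement claim described above. It uses Python's built-in sqlite3 module only so that it runs without a database server; the thread itself is about Perl and MySQL. The `joblist` schema, the `claim` helper and the pid values are assumptions for illustration and are not taken from TheSchwartz.

```python
import sqlite3

# Hypothetical schema: one row per job, pid is NULL until a worker claims it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joblist (jobid INTEGER PRIMARY KEY, pid INTEGER)")
conn.execute("INSERT INTO joblist (jobid, pid) VALUES (1, NULL)")
conn.commit()

def claim(conn, jobid, pid):
    """Try to claim jobid for pid; return True only if this call won the job."""
    cur = conn.execute(
        "UPDATE joblist SET pid = ? WHERE jobid = ? AND pid IS NULL",
        (pid, jobid),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE changed: 1 = claimed, 0 = already taken.
    return cur.rowcount == 1

first = claim(conn, 1, 1001)   # first worker to try
second = claim(conn, 1, 1002)  # second worker, same job
owner = conn.execute("SELECT pid FROM joblist WHERE jobid = 1").fetchone()[0]
```

Here the first call claims the job and the second changes nothing, so the row stays with the first worker. The check and the write happen in one statement, which is what leaves no window for a second worker to slip in.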

        My experience with SQL is completely amateurish (I've only used SQL for things like the CB stats database, never for a commercial-grade system), so your "pid IS NULL" == "atomic" bit would likely not have occurred to me. I want to see if I understand it; this is mostly an effort to internalise it by stating what is likely obvious to more experienced people.

        If we simply had UPDATE joblist SET pid = ? WHERE jobid = ?, the theory is that since the determination of the jobid could be happening on two (or more) threads at the same time, and it is NOT part of the same query, the db could return the same jobid to more than one thread, and then they both try to update the db with their pid. In this case, the last one wins, but all earlier updates thought they were successful, so those threads would not know that their jobid was stolen from them.
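The lost update described in this paragraph can be sketched as follows, again with Python's built-in sqlite3 purely for a self-contained illustration. The two UPDATEs run one after the other on a single connection to stand in for two workers that were both handed jobid 1; the schema and pid values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joblist (jobid INTEGER PRIMARY KEY, pid INTEGER)")
conn.execute("INSERT INTO joblist (jobid, pid) VALUES (1, NULL)")

# Both workers have already been told that job 1 is free.
# Neither UPDATE checks whether the job has been taken in the meantime.
unguarded = "UPDATE joblist SET pid = ? WHERE jobid = ?"
rows_a = conn.execute(unguarded, (1001, 1)).rowcount
rows_b = conn.execute(unguarded, (1002, 1)).rowcount

final_owner = conn.execute("SELECT pid FROM joblist WHERE jobid = 1").fetchone()[0]
```

Both statements report one changed row, so both workers believe they own the job, while the table only records the last writer.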

        With the pid IS NULL bit, the first thread to claim the jobid still gets it, but every other thread's UPDATE will match no rows and so change nothing. Thus, it is imperative in this system that one checks the return value from the UPDATE (we'd expect "1" if it succeeded in updating what should be a single row, assuming jobids are unique, which seems like a reasonable assumption here, and "0" if the row was not updated). If the return shows failure, we need to loop back and find a new satisfactory jobid, and try again.
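The check-and-retry loop described here might look like the sketch below. It is written in Python with the built-in sqlite3 module so it can be run as-is; `claim_next_job`, the `joblist` schema and the "lowest unclaimed jobid" selection rule are all assumptions for illustration, not how TheSchwartz does it.

```python
import sqlite3

def claim_next_job(conn, pid):
    """Return the jobid claimed for pid, or None if no unclaimed job is left."""
    while True:
        # Step 1: find a candidate. Another worker may grab it before step 2.
        row = conn.execute(
            "SELECT jobid FROM joblist WHERE pid IS NULL ORDER BY jobid LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # Step 2: guarded claim; changes a row only if the job is still free.
        cur = conn.execute(
            "UPDATE joblist SET pid = ? WHERE jobid = ? AND pid IS NULL",
            (pid, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # rowcount == 0: we lost the race, so loop back and pick another job.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joblist (jobid INTEGER PRIMARY KEY, pid INTEGER)")
conn.executemany("INSERT INTO joblist (jobid, pid) VALUES (?, NULL)", [(1,), (2,)])
conn.commit()

claimed = [claim_next_job(conn, pid) for pid in (1001, 1002, 1003)]
```

With two jobs and three workers, the first two workers each get a distinct jobid and the third gets None. The SELECT is only a hint about which job to try; ownership is decided by the UPDATE's row count alone.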

        If I'm understanding this correctly, I may need to go back to my own SQL code to see if this is needed in the CB stats or some such :-) Thanks, tye. And thanks marto for mentioning it in the CB causing me to go looking at it.

      Well how 'bout locking/unlocking those tables to prevent this from happening ?

      Dunno. Maybe that'd work. But if a module's users have to even consider adding such things, then it doesn't bode well for ...

      ... that it probably is a good module.

      A pretty basic requirement of a "reliable job queue", is that once you take a job out of the queue, nobody else will be able to.

      Even ignoring the complexity problems of the implementation, I think that using an RDBMS as the basis of a distributed queue is fraught with problems architecturally speaking. RDBMSs are designed to be servers in a client-server world; supreme masters of the data they control; responsible only for ensuring the total coherency of that data at all times.

      Whilst the big boys -- Oracle, IBM, MS et al -- have add-ons for running their RDBMSs on clusters, they only achieve reliability by throwing multiply redundant high-availability hardware at the problem. Running one of the lesser-mortal free RDBMSs on commodity hardware is never going to achieve reliability.

