http://www.perlmonks.org?node_id=466491

Marcello has asked for the wisdom of the Perl Monks concerning the following question:

A large Perl application currently queries a certain MySQL table for records. As soon as records are found, they are processed by the application and then DELETEd from the database. So, basically, the table acts as a queue.
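The SELECT / process / DELETE cycle described above can be sketched as follows. This is only a minimal illustration in Python with an in-memory SQLite database, since the post does not show the actual schema; the `jobs` table, its columns, and the `process` function are all hypothetical stand-ins.

```python
import sqlite3

# Hypothetical schema; the real table and columns are not shown in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO jobs (payload) VALUES (?)",
                 [("a",), ("b",), ("c",)])

def process(payload):
    return payload.upper()  # stand-in for the real work

# SELECT a record, process it, then DELETE it: the table acts as a queue.
results = []
while True:
    row = conn.execute(
        "SELECT id, payload FROM jobs ORDER BY id LIMIT 1").fetchone()
    if row is None:
        break
    job_id, payload = row
    results.append(process(payload))
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
    conn.commit()

print(results)  # -> ['A', 'B', 'C']
```

With a single worker this loop drains the queue cleanly; the complications described below only appear once several workers run it concurrently.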

This process needs to be as fast as possible. Therefore, the idea is to use multiple Perl processes (2 or more) to query this database and process the records.

The requirements are as follows:
  • Process the records as fast as possible
  • Avoid different processes handling the same record

Sometimes a process cannot handle records because its connection to a remote server is lost (which can take hours to resolve). At that point, the other processes should take over its records; otherwise those records would stay in the queue for too long.
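One common way to express this take-over requirement is to record which worker claimed a record and when, so that other workers can reclaim records whose claim has gone stale. A minimal sketch, again in Python/SQLite for illustration; the `claimed_by` and `claimed_at` columns and the one-hour threshold are assumptions, not part of the original schema:

```python
import sqlite3, time

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    payload TEXT,
    claimed_by TEXT,   -- hypothetical: which worker holds the record
    claimed_at REAL)   -- hypothetical: when it was claimed (epoch seconds)
""")
# A record claimed two hours ago by a worker that never finished it.
conn.execute(
    "INSERT INTO jobs (payload, claimed_by, claimed_at) VALUES ('x', 'worker-1', ?)",
    (time.time() - 7200,))

STALE_AFTER = 3600  # seconds; after this, a claim is considered abandoned

# Another worker takes over any records whose claim has gone stale.
reclaimed = conn.execute(
    "UPDATE jobs SET claimed_by = 'worker-2', claimed_at = ? "
    "WHERE claimed_at < ?",
    (time.time(), time.time() - STALE_AFTER)).rowcount
print(reclaimed)  # -> 1
```

The threshold trades latency against safety: too short and a slow-but-alive worker loses its records, too long and stuck records sit in the queue.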

I am not sure what kind of logic to implement to meet the requirements above.

For two processes, one could handle the records with odd IDs and the other those with even IDs, but the second requirement is still not met when one process cannot handle its records.
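The odd/even split mentioned above amounts to partitioning on `id % 2`. A small sketch of how each worker would restrict its SELECT (Python/SQLite for illustration; the `jobs` table is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO jobs (id, payload) VALUES (?, ?)",
                 [(i, f"job-{i}") for i in range(1, 7)])

def fetch_partition(parity):
    # parity 1 -> odd IDs, parity 0 -> even IDs
    return [r[0] for r in conn.execute(
        "SELECT id FROM jobs WHERE id % 2 = ? ORDER BY id", (parity,))]

print(fetch_partition(1))  # -> [1, 3, 5]
print(fetch_partition(0))  # -> [2, 4, 6]
```

The partitions never overlap, which satisfies the second requirement, but exactly as the post says: if the odd-ID worker dies, nothing ever drains its half of the queue.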

If the processes SELECT the same records, there is significant overhead, because 50% or more of the records are already being handled by other processes and should not be handled again.
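One way to avoid that overlap is to claim records atomically: each worker first tags a batch with a unique token in a single UPDATE, then reads back only the rows its token actually won, so two workers can never process the same record. A sketch in Python/SQLite; the `claimed_by` column and `claim_batch` helper are hypothetical:

```python
import sqlite3, uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, claimed_by TEXT)")
conn.executemany("INSERT INTO jobs (payload) VALUES (?)",
                 [("a",), ("b",), ("c",)])

def claim_batch(conn, limit=2):
    """Atomically mark a batch of unclaimed rows with this worker's token,
    then read back only the rows the token actually won."""
    token = uuid.uuid4().hex
    conn.execute(
        "UPDATE jobs SET claimed_by = ? WHERE id IN "
        "(SELECT id FROM jobs WHERE claimed_by IS NULL ORDER BY id LIMIT ?)",
        (token, limit))
    conn.commit()
    return conn.execute(
        "SELECT id, payload FROM jobs WHERE claimed_by = ? ORDER BY id",
        (token,)).fetchall()

first = claim_batch(conn)   # one worker's batch
second = claim_batch(conn)  # a second worker gets only the remainder
print(first)   # -> [(1, 'a'), (2, 'b')]
print(second)  # -> [(3, 'c')]
```

Because the UPDATE is a single statement, the database serializes competing claims; each worker then deletes only the rows it claimed after processing them.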

What is the best approach I can use to meet these requirements?

TIA!