You'll need to specify in more detail what you mean by accuracy. I assume that "accuracy" relates to the workflow(s) in progress, that is, the current state of each state machine. At a minimum, I would write a continuous log of every state change of every machine, so that a crashed session can easily be replayed and restored. I'm also really fond of pushing the problem of keeping (shared) state to a database, so I would store the current state, and possibly also the log of transitions, of each state machine in database tables.
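As a rough sketch of what I mean (not tied to your setup): the Python snippet below uses SQLite with two tables, one holding the current state of each machine and one holding an append-only log of transitions. The table names, column names, and the functions `open_store`, `record_transition`, and `replay` are all made up for illustration, and I'm assuming each machine is identified by a string id.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the store. All table and column names are hypothetical."""
    db = sqlite3.connect(path)
    # Append-only log: one row per state change, ordered by seq.
    db.execute(
        """CREATE TABLE IF NOT EXISTS transition_log (
               seq        INTEGER PRIMARY KEY AUTOINCREMENT,
               machine_id TEXT NOT NULL,
               from_state TEXT,
               to_state   TEXT NOT NULL)"""
    )
    # Snapshot: the current state of each machine.
    db.execute(
        """CREATE TABLE IF NOT EXISTS machine_state (
               machine_id TEXT PRIMARY KEY,
               state      TEXT NOT NULL)"""
    )
    return db


def record_transition(db, machine_id, to_state):
    """Log the change and update the snapshot in a single transaction."""
    with db:  # commits on success, rolls back if an exception is raised
        row = db.execute(
            "SELECT state FROM machine_state WHERE machine_id = ?",
            (machine_id,),
        ).fetchone()
        from_state = row[0] if row else None
        db.execute(
            "INSERT INTO transition_log (machine_id, from_state, to_state) "
            "VALUES (?, ?, ?)",
            (machine_id, from_state, to_state),
        )
        db.execute(
            "INSERT OR REPLACE INTO machine_state (machine_id, state) "
            "VALUES (?, ?)",
            (machine_id, to_state),
        )


def replay(db):
    """Rebuild the state of every machine from the log alone."""
    states = {}
    query = "SELECT machine_id, to_state FROM transition_log ORDER BY seq"
    for machine_id, to_state in db.execute(query):
        states[machine_id] = to_state
    return states
```

After a crash, something like `replay(open_store("sessions.db"))` (the filename is a placeholder) would give you the last known state per machine. Writing the log row and the snapshot row in the same transaction means the two tables cannot disagree with each other.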
One system that sounds a bit like what you're doing is Deliantra, an MMORPG written by Marc Lehmann, the author of AnyEvent (among other things). I believe it supports a fairly large number of clients, so its overall architecture is likely worth investigating.