My first guess is that no, it wouldn't help. It would greatly reduce memory usage on the system. If that then led to making the cache size a lot bigger, then that would help. It would have other benefits as well.
But it would also thwart my plan for greatly reducing the impact of the race condition. Right now, you get a node via the cache, then you make updates to it, then you request that the updates be saved to the database (via the cache). Between "get node" and "save node", there is a pretty high likelihood that someone else also started making changes to the node. And we are stuck with optimistic locking and ignoring failures (because of the nature of the site), so the best we can do is have each process only update the fields that it changed. So we have to track which fields this process changed. And there are two ways to do that: 1) note each change, i.e. tie, i.e. slow things down too much; 2) save a copy of the original and compare (my plan).
So how do we save a copy of the original? In a perfect world, there would be an "I plan to change this node" version of the "get node" function and you wouldn't be allowed to save changes to nodes that you didn't get via that function. Getting from the PM world to that world would require huge numbers of changes all over the place. So I'm not shooting for that.
So I need to have the "get node" function save a copy of the node, separate from the one that gets handed to everyone and that might get changed. Well, the node cache is the perfect place for this. However, if the node cache is shared between processes, then the copy saved in the node cache is no longer going to be "what this process originally got", so it isn't useful for making that comparison to reduce the race condition.
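A minimal sketch of the "save a copy and compare" idea, just to make it concrete. Everything here is made up for illustration: `get_node`, `changed_fields`, the `%original` store, and the field names are hypothetical and are not the real PM code, and it assumes a node is a flat hash of scalar fields.

```perl
use strict;
use warnings;

# Hypothetical per-process store of pristine copies, keyed by node id.
my %original;

# Hypothetical stand-in for the real "get node" function. $fetch is a
# code ref that returns a flat hash ref of fields for the given id.
sub get_node {
    my ($id, $fetch) = @_;
    my $node = $fetch->($id);
    # Shallow copy; good enough only because we assume scalar field values.
    $original{$id} = { %$node };
    return $node;
}

# Return a hash ref holding only the fields that differ from the copy
# saved by get_node(), i.e. the fields this process changed.
sub changed_fields {
    my ($id, $node) = @_;
    my $orig = $original{$id} || {};
    my %diff;
    for my $field (keys %$node) {
        my ($old, $new) = ($orig->{$field}, $node->{$field});
        next if !defined $old && !defined $new;
        next if defined $old && defined $new && $old eq $new;
        $diff{$field} = $new;
    }
    return \%diff;
}

# Made-up node data for the example.
my $node = get_node(42, sub {
    return { title => 'Old title', doctext => 'body', reputation => 5 };
});
$node->{title} = 'New title';

# Only the title was touched, so only the title would be written back.
my $diff = changed_fields(42, $node);
```

The point of keeping `%original` private to the process is exactly the one made above: if another process could overwrite the saved copy, the comparison would no longer show what this process changed.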
So I don't want an inter-process node cache. Not until we have some other way to track changes to fields in nodes. That is a long way off.
What I'd love is an inter-process cache for nodelets and miscellaneous data (such as chatter in a few forms, approval status, etc.). Nodelets are currently cached and only updated if they are more than N seconds old (where N is configured per nodelet). But each process has a separate cache, so most nodelets are updated P times every N seconds (where P is the number of httpd/mod_perl processes in use). So a shared cache reduces the load for many of the nodelets by about a factor of P (dividing by even 16 is a very nice thing to do to system load).
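To put numbers on the factor of P, here is a rough simulation. It is only a sketch: `make_cache` and `count_rebuilds` are hypothetical names, the nodelet name is a placeholder, and the "shared" cache is faked by handing the same in-memory cache to every simulated process, which says nothing about how real inter-process sharing would be done.

```perl
use strict;
use warnings;

# Hypothetical cache: regenerate an entry only when it is missing or
# more than $ttl seconds old.
sub make_cache {
    my %entry;    # name => { value => ..., stamp => ... }
    return sub {
        my ($name, $ttl, $now, $generate) = @_;
        my $e = $entry{$name};
        if (!$e || $now - $e->{stamp} > $ttl) {
            $e = $entry{$name} = { value => $generate->(), stamp => $now };
        }
        return $e->{value};
    };
}

# Every simulated process asks for the nodelet once per second; count
# how many times the nodelet actually has to be regenerated.
sub count_rebuilds {
    my ($caches, $seconds, $ttl) = @_;
    my $rebuilds = 0;
    for my $now (0 .. $seconds - 1) {
        $_->('some nodelet', $ttl, $now, sub { $rebuilds++; "html at $now" })
            for @$caches;
    }
    return $rebuilds;
}

my ($P, $N) = (16, 10);    # 16 processes, nodelet good for 10 seconds

# One private cache per process.
my $per_process = count_rebuilds([ map { make_cache() } 1 .. $P ], 60, $N);

# The same cache handed to all processes, standing in for a shared one.
my $one    = make_cache();
my $shared = count_rebuilds([ ($one) x $P ], 60, $N);

print "per-process caches: $per_process rebuilds\n";    # 96
print "one shared cache:   $shared rebuilds\n";         # 6
```

Over 60 simulated seconds the private caches rebuild the nodelet 96 times and the shared one 6 times, which is the factor of P = 16.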
Negotiating updates in such a cache gets rather complicated and varies by what you want to cache, and this all gets more complicated because you want to see updates from other computers also.
- tye