PerlMonks  

Last checked flag not updating?

by Aristotle (Chancellor)
on Sep 14, 2002 at 04:16 UTC (#197786=monkdiscuss)

Lately I've had a problem with this: clicking "I've checked all of these" fails randomly, maybe 1 time in 10-15. It seems to work, since the Newest Nodes page I get back after the click is indeed empty (modulo new posts), as it should be. However, refreshing Newest Nodes reveals that the flag wasn't updated: all the old nodes that shouldn't be there reappear.

Very irritating if it's the last thing I did and I then come back a day later to find all of the old nodes still on the list, with no clue at which point to check in. :-/

As "I've checked all of these" works (when it does) across several different browsers/computers that don't share cookies, I have to assume the problem is server side. Or is it? Maybe it is really a browser issue? Has someone else had the same problem and fixed it?

FWIW, I'm using Mozilla 1.0 RC2, build 2002051620, on a Linux 2.2.20 box.

Makeshifts last the longest.

(tye)Re: Last checked flag not updating?
by tye (Cardinal) on Sep 14, 2002 at 05:59 UTC

    Yep. Standard race condition in the base Everything node cache. If someone votes for one of your nodes at the same time as you click "I've checked all of these", then you can have two separate mod_perl processes trying to update the Aristotle node at the same time.

    Even though the other process only wants to change one field in the Aristotle node (your XP), the way the Everything node cache works, that process tells MySQL to update every single field of your node (including rewriting the contents of your home node, your scratch pad, and all of your user settings). So your change to the timestamp gets overwritten if the other process rewrites your user settings record after your mod_perl process has already rewritten it (with the new Newest Nodes timestamp), provided the other process started its update before your mod_perl process had updated the 'version' number for your node.
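
To make the sequence above concrete, here is a minimal sketch of the lost update. It is not the Everything engine's code: the in-memory %db_row, the field name lastviewed, and both helper subs are assumptions made up for illustration, and the two "processes" are simply run one after the other in the unlucky order.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the database row of one user node.
my %db_row = ( experience => 100, lastviewed => 'old', version => 1 );

# Each "process" takes a full private copy, as a whole-node cache would.
sub fetch_copy { return { %db_row } }

# Whole-record write: every field is sent back, changed or not.
sub write_all_fields {
    my ($copy) = @_;
    %db_row = ( %$copy, version => $db_row{version} + 1 );
    return;
}

my $voter_proc = fetch_copy();   # about to credit XP for a vote
my $owner_proc = fetch_copy();   # about to record "I've checked all of these"

$owner_proc->{lastviewed} = 'new';
write_all_fields($owner_proc);   # the row now holds 'new'

$voter_proc->{experience}++;
write_all_fields($voter_proc);   # the stale copy puts 'old' back

print "lastviewed=$db_row{lastviewed} experience=$db_row{experience}\n";
```

The vote is recorded, but the timestamp silently reverts to its old value, which matches the symptom described in the root node.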

    Note that this design also degrades site performance. I have a (rough) design that improves performance quite a bit (adds one bit of load on the web server in exchange for reducing another bit of load on the web server along with reducing a lot of load on the network and the database server -- and we can add more web servers so this is a good trade-off). Exactly how to do the trade-offs for speed, memory consumption, etc. in the details of this design will be a bit tricky. And it won't eliminate the race condition. It will just greatly reduce the odds of the race condition happening since it will instead require that two separate mod_perl processes try to change the same field of the same node at nearly the same time before the problem can appear.

    But I doubt I'll have the time to even start working on this change for quite a while. So try not to write nodes that people are likely to vote on if this type of thing bothers you a lot. ;)   /:

    Other possibilities include separating out the types of things other people change about you from things you change about yourself so that the impact of the race condition is greatly lessened. It would be a bit strange to have your XP stored in a node separate from your user node, but that would probably be a very good idea (you, the user, wouldn't really notice much difference, of course). But that probably wouldn't save enough work to warrant doing things in such a strange way (because we'd also need to separate out the reputation of a node into a separate node, which would have a huge impact on the database).

    Those wanting to work on this problem in my stead need to study Everything/NodeCache.pm (this link will only work for people who are already members of pmdev -- others would need to download v0.8 of the Everything engine to see the contents of that file). The basic idea is to store an extra copy of each node in the cache. Then updates only send (to the database) the fields that changed between the original, unchanged (hidden) node in the cache and the changed copy that was previously fetched from the cache. For large fields like node text, something tricky should probably be done to reduce the extra memory required. I have a few ideas here but none that I'm sure will work well.
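
The "extra copy" idea can be sketched roughly as follows. This is only an illustration under assumptions, not what Everything/NodeCache.pm does: the subs changed_fields and build_update, the table name user, and the column names are all hypothetical. It compares the untouched copy with the modified one and builds an UPDATE that names only the fields that differ.

```perl
use strict;
use warnings;

# Return only the fields whose values differ between the pristine cached
# copy and the working copy the caller modified.
sub changed_fields {
    my ($pristine, $working) = @_;
    my %diff;
    for my $field (keys %$working) {
        my ($old, $new) = ($pristine->{$field}, $working->{$field});
        next if !defined $old && !defined $new;
        next if defined $old && defined $new && $old eq $new;
        $diff{$field} = $new;
    }
    return \%diff;
}

# Build a parameterized UPDATE touching only the changed columns.
# Table and column names here are hypothetical.
sub build_update {
    my ($table, $id_column, $id, $diff) = @_;
    my @cols = sort keys %$diff;
    return unless @cols;    # nothing changed, nothing to send
    my $sql = "UPDATE $table SET "
            . join(', ', map { "$_ = ?" } @cols)
            . " WHERE $id_column = ?";
    return ($sql, @{$diff}{@cols}, $id);
}

my %pristine = ( experience => 100, lastviewed => 'old', doctext => 'home node text' );
my %working  = %pristine;
$working{lastviewed} = 'new';

my ($sql, @bind) = build_update( 'user', 'user_id', 42,
                                 changed_fields( \%pristine, \%working ) );
print "$sql\n";
print join(', ', @bind), "\n";
```

With only lastviewed changed, the large doctext field never travels to the database, which is where the savings in network and database load would come from.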

            - tye (vote early, vote often, it keeps you in top race condition)

      Thanks for the detailed explanation, that makes sense.

      I guess I'll take your advice and not post voteworthy nodes from now on. ;-)

      Say, is any spot open among the pmdevils? I've been thinking that I'd like to contribute to the monastery in more ways than just posting. Though I'm not sure pmdevil is what I really want to do; any suggestions/offers gladly accepted, if I have your (as in the powers that be in general) trust.

      Makeshifts last the longest.

      "the way the Everything node cache works, that process tells MySQL to update every single field of your node (including rewriting the contents of your home node, your scratch pad, and all of your user settings)."
      Eh... why?

      The way I see it, you should be the only user that ever gets to update your entire own user node. A vote for a post of yours should only update that field, while your own update of your user data should update anything but that field, as you can't vote for yourself. I have no access to the source, I can only assume this is an SQL query definition thing, in which case it should be doable...

      Either that, or put a lock on your user record, so that the second process has to wait until the first one has finished before reading as well as writing this record.

        The node cache does not know the purpose of any fields nor anything about "users" and so has no way of knowing that "reputation" of "my" nodes nor "experience" of "me" are things that "I" should never need to update. Plus, I do update my own experience because I can get experience points when I vote for other people's nodes.

        Locking a record doesn't do any good unless you read the record when you obtain the lock. This requires that either the record remain locked while it is being viewed ("pessimistic locking", certainly unacceptable in our environment) or that the locked record be compared against the record originally viewed and have the update fail (not be performed) if the comparison fails ("optimistic locking").
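
For readers who haven't met optimistic locking, a minimal sketch of the compare-then-update step follows. Everything in it is an assumption for illustration (the in-memory %row, the field names, the sub update_if_unchanged); it is not how the site stores nodes.

```perl
use strict;
use warnings;

# Hypothetical in-memory row; 'version' is bumped on every successful write.
my %row = ( lastviewed => 'old', experience => 100, version => 7 );

sub read_row { return { %row } }

# Apply the changes only if nobody has written since $seen_version was read.
# Returns 1 on success and 0 when the caller's view was stale.
sub update_if_unchanged {
    my ($seen_version, %changes) = @_;
    return 0 if $row{version} != $seen_version;
    @row{ keys %changes } = values %changes;
    $row{version}++;
    return 1;
}

my $owner = read_row();
my $voter = read_row();

my $owner_ok = update_if_unchanged( $owner->{version}, lastviewed => 'new' );
my $voter_ok = update_if_unchanged( $voter->{version},
                                    experience => $voter->{experience} + 1 );

print "owner_ok=$owner_ok voter_ok=$voter_ok lastviewed=$row{lastviewed}\n";
```

The second writer is refused instead of clobbering the first. Nothing is lost silently, but the refused caller now has to be told about the failure or has to retry.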

        So we'd have to provide an interface for notifying a user that whatever they last attempted has failed (you probably think this would be trivial). Since what the user attempted might apply to more than one node (up voting should increment the node's reputation, decrement the voter's number of votes, and might increment the author's experience), we'd have to have "transactions" so that a failure would attempt to undo the previously successful updates. MySQL doesn't support transactions nor client control over locking. So this would be a huge amount of work for little gain.

        It is easy to say "oh, just lock it" and not really think about the details of the problem, eh?

        We actually have locking for wiki nodes and updates made by editors, but neither of those designs would apply well to these types of situations.

        A much better solution is to not do increment and decrement updates via the normal means of:

            my $NODE= getNode( ... );
            $NODE->{reputation}++;
            updateNode( $NODE );

        but instead provide a function just for these types of updates so that they can be done atomically so that no locking is required:

            my $NODE= getNode( ... );
            addToField( $NODE, reputation=>1 );
            my $USER= getNode( ... );
            addToField( $USER, votesleft=>-1, experience=>1 );

        This will probably cover the vast majority of cases of simultaneous updates to the same field.
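
One way such a function might work underneath is to let the database do the arithmetic in a single relative UPDATE. The sketch below only builds the statement; the sub add_to_field_sql, the table name user, and the id column are hypothetical, and addToField itself is a proposal in this thread, not an existing engine call.

```perl
use strict;
use warnings;

# Hypothetical helper: turn field => delta pairs into one relative UPDATE,
# so the database adds the delta to whatever value is currently stored
# instead of writing back a number computed from a cached copy.
sub add_to_field_sql {
    my ($table, $id_column, $id, %delta) = @_;
    my @cols = sort keys %delta;
    die "no fields given\n" unless @cols;
    my $sql = "UPDATE $table SET "
            . join(', ', map { "$_ = $_ + ?" } @cols)
            . " WHERE $id_column = ?";
    return ($sql, @delta{@cols}, $id);
}

my ($sql, @bind) = add_to_field_sql( 'user', 'user_id', 42,
                                     votesleft => -1, experience => 1 );
print "$sql\n";
print "@bind\n";
```

Because the statement never mentions an absolute value, two such updates arriving at nearly the same time both take effect regardless of their order.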

        But the original scheme I described is still needed so that the node cache knows which fields are being updated so that it can deal with the race conditions intelligently (as well as making many operations faster).

                - tye (but my friends call me "Tye")
Re: Last checked flag not updating?
by mojotoad (Monsignor) on Sep 14, 2002 at 06:23 UTC
    I'm glad you posted this; I thought I was imagining things. In the past I've usually chalked it up to the beer.

    Thanks for the detailed writeup, tye, it's always interesting to hear about the guts of the monastery.

    Matt
