PerlMonks  

Re^4: avoiding a race (much ado)

by tye (Cardinal)
on Sep 29, 2010 at 19:05 UTC ( #862683=note )


in reply to Re^3: avoiding a race ("No extra", "no-user" locking--miniscule race of no importance)
in thread avoiding a race

Actually, the Linux kernel holds a mutex (exclusive not shared lock) against the directory during readdir (for example) so re-using the kernel locking means that the processes must do their searching for a matching error (file) in single-file. This is probably part of why directories containing a large number of files are notoriously extremely slow in Unix. (And Perl's glob / readdir is notorious for at least sometimes being pathologically slow under Windows but I don't know what locking Windows uses for directory operations.)

And so the "no extra locking" claim is completely bogus. The extra locking case is "a ton of kernel mutex uses" not the "small number of kernel mutex uses to open a file and then get one shared lock".

I don't see how you justify that the race doesn't result in extra e-mails. I believe your analysis is mistaken there. And you may not care about a few extra e-mails (or tons of extra e-mails when the directory gets bloated and pathologically slow to use) but the person asking the "avoiding a race" question probably does. I can certainly understand caring about my boss getting duplicate e-mails after he assigned me the task of making sure we don't get duplicate e-mails.

But given 300 processes, I'd probably go with a separate, single process that de-dups errors rather than having 300 processes fighting over the list of errors (whether stored as lines in a file or files in a directory).
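
A minimal sketch of what the core of such a single de-duplicating process might look like. The helper name `should_mail`, the in-memory hash, and the choice to restart the clock after each mail are assumptions made for illustration; the one-hour rule is taken from the original question as quoted later in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: record one sighting of an error and report whether
# the admin should be mailed now. %$seen maps error text => timestamp.
sub should_mail {
    my ($seen, $error, $now) = @_;
    my $stamp = $seen->{$error};

    # First sighting: remember when it happened, send nothing yet.
    unless (defined $stamp) {
        $seen->{$error} = $now;
        return 0;
    }

    # Seen before, but the recorded stamp is not yet more than an hour old.
    return 0 if $now - $stamp <= 3600;

    # Stamp is more than an hour old: mail, and (assumption) restart the clock.
    $seen->{$error} = $now;
    return 1;
}
```

Because only one process ever touches the hash, no locking is needed; the 300 workers would just hand their error text to it over whatever channel is convenient (a pipe or socket, say).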

- tye        


Re^5: avoiding a race (the ado, you do, so well:)
by BrowserUk (Pope) on Sep 29, 2010 at 19:56 UTC
    I don't see how you justify that the race doesn't result in extra e-mails. I believe your analysis is mistaken there.

    Hm. The window your "still racy" referred to was the time between a failed stat and the open, i.e.:

    open ERROR, '>', $file unless -e $file; close ERROR;

    And even in a full directory and on a loaded system, that time is going to be measured--assuming you can actually measure it at all--in low milliseconds at the most.

    Now, what does that actually mean?

    It means that one of the other processes encountered the same error as you, and succeeded in creating the error file within those few milliseconds. So what?

    You then immediately overwrote it with a later time-stamp. But still, so what?

    Nothing! Because the error file got created. Nothing is going to take any action--like sending emails--as a result of that file's creation for another hour. From the OP:

    if it has been encountered before and the time stamp is greater than an hour ago it will mail the admin

    So the very worst effect of some other process creating the file instead of you is that the sending of the email is delayed by the difference between the original time-stamp and the new one. And that's just a few milliseconds at most.

    #! perl -slw
    use Time::HiRes qw[ time ];
    use threads qw[ stack_size 4096 ];

    my $file = 'theFile';

    async{ 1 until -e $file; }->detach for 1 .. 300;
    sleep 3;

    my @times = time;
    unless( -e $file ) {
        push @times, time;
        open FILE, '>', $file or die "$file : $!";
        push @times, time;
        close FILE;
    }
    push @times, time;
    print for @times;
    unlink $file;

    __END__
    [20:51:05.40] C:\test>junk49
    1285789900.703
    1285789900.87509
    1285789901.02394
    1285789901.324

    With 300 clients stating theFile (in a directory containing 1000 files), the window of opportunity for this irrelevant race condition is all of 300 milliseconds.

    And then only if the time-stamp resolution of the file-system is sufficient to actually discern the difference, which is unlikely.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      And even in a full directory and on a loaded system, that time is going to be measured--assuming you can actually measure it at all--in low milliseconds at the most.

      No, not at all. Opening the file requires finding the file, which requires traversing the (possibly long) directory contents yet again (and thus contending with all of the mutex contention again also). With NTFS or a newer Linux file system (with the proper options enabled), the directory won't be stored as a simple list and the performance is probably not as easily pathological. A few months ago I again ran into a directory with way too many files in it and it took many seconds, even minutes, to open a file (or to remove one). I haven't tried to replicate the problem on a more modern filesystem to see how well it scales. But I suspect there are plenty of file systems left in the world that were built without hash/tree directories.

      And then only if the time-stamp resolution of the file-system is sufficient to actually discern the difference, which is unlikely.

      And there you have your broken analysis, again. If X and Y fail to find 'file1' and then both create it, then the fact that the timestamp is not changed by whichever attempt is second has no bearing on the fact that both X and Y will then go on to send an e-mail. (Or, you can remove the race.)
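
One common way to remove this kind of check-then-create race, sketched under assumptions: ask the kernel to do "create only if absent" as a single step with `sysopen` and `O_CREAT | O_EXCL`, instead of a separate `-e` test followed by `open`. The helper name `claim_error_file` is hypothetical, and this is not necessarily the approach proposed elsewhere in the thread.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use Errno qw(EEXIST);

# Hypothetical helper: returns 1 if this process created the marker file,
# 0 if some other process had already created it.
sub claim_error_file {
    my ($file) = @_;

    # O_CREAT|O_EXCL fails with EEXIST if the file is already there, so the
    # existence test and the creation cannot be separated by another process.
    if (sysopen(my $fh, $file, O_WRONLY | O_CREAT | O_EXCL)) {
        close $fh or die "close $file: $!";
        return 1;
    }
    return 0 if $! == EEXIST;
    die "sysopen $file: $!";
}
```

With this, exactly one of X and Y gets a true return value for 'file1', and only that one needs to act on having been first.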

      - tye        

        both X and Y will then go on to send an e-mail.

        No. They won't. Because the email isn't sent until an hour later.

        Are your eyes okay?


        I again ran into a directory with way too many files in it

        So, don't put time-critical files in huge directories.
