 
PerlMonks  

Re^2: avoiding a race (does lock, still racy)

by tye (Cardinal)
on Sep 28, 2010 at 16:31 UTC (#862463=note)


in reply to Re: avoiding a race
in thread avoiding a race

That is not a non-locking mechanism. It just hands off the locking to the kernel which locks the directory when it reads from it or writes to it. It has the advantage of the kernel locking implementation being very well tested.

Of course, errors might not all have such nice, unique, numeric identifiers so the files might have to be named more like "ERROR.Error inserting record into table WIDGET, unique key violation on column UPC". And even that won't work if the comparison for "same error" isn't easily reduced to "string equality".

But, most importantly, your solution (as described) has a race condition between stat and creating a file. You can probably fix that a couple of different ways.
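One conventional way to close that stat-then-create window (a sketch only; the marker and lock file names here are hypothetical, and this is not necessarily the fix tye had in mind) is to serialize the check-and-create under an flock on a single sentinel file:

```perl
use strict;
use warnings;
use Fcntl qw( :flock );

my $file = 'ERROR.12345';    # hypothetical error-marker name

# Hold one exclusive lock around the whole test-and-create, so no
# other process can slip in between the -e check and the open.
open my $lock, '>>', 'errors.lock' or die "errors.lock: $!";
flock $lock, LOCK_EX or die "flock: $!";

unless ( -e $file ) {
    open my $fh, '>', $file or die "$file: $!";
    close $fh;
}

flock $lock, LOCK_UN;
close $lock;
```

The lock file is never removed; it exists only to be locked, which sidesteps the unlink-versus-lock races that plague schemes locking the marker files themselves.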

- tye        


Re^3: avoiding a race ("No extra", "no-user" locking--miniscule race of no importance)
by BrowserUk (Pope) on Sep 28, 2010 at 17:28 UTC
    That is not a non-locking mechanism. It just hands off the locking to the kernel which locks the directory when it reads from it or writes to it. It has the advantage of the kernel locking implementation being very well tested.

    The kernel is going to do its locking whatever file operations you do. Re-using it is good.

    So, I guess you could call it a "no-extra, no-effort (or risk of getting it wrong)" locking mechanism.

    Of course, errors might not all have such nice, unique, numeric identifiers ...

    If you can't reduce the errors to something easily comparable in the filesystem, you'll have similar problems locating similar errors in the file itself. And globbing is capable of much more than just "string equality".
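As a sketch of that point (the marker file names are invented for illustration), glob patterns can select whole classes of error files rather than one exact name:

```perl
use strict;
use warnings;

# Hypothetical error-marker files, as described earlier in the thread:
my @markers = qw(
    ERROR.WIDGET.upc-violation
    ERROR.WIDGET.fk-violation
    ERROR.GADGET.upc-violation
);
for my $m (@markers) {
    open my $fh, '>', $m or die "$m: $!";
    close $fh;
}

# A glob pattern matches a class of errors, not just one string:
my @widget_errors = glob 'ERROR.WIDGET.*';        # every WIDGET error
my @upc_errors    = glob 'ERROR.*.upc-violation'; # UPC errors on any table

print scalar @widget_errors, "\n";   # 2
print scalar @upc_errors,    "\n";   # 2

unlink @markers;
```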

    But, most importantly, your solution (as described) has a race condition between stat and creating a file. You can probably fix that a couple of different ways.

    If I knew how to do open(CREATE_NEW(*)) in Perl, I would suggest that. If the open() fails, it must have 'just' been created, so there's nothing else to do, so you just move on anyway.

    But realistically, it's probably a "problem" not worth the effort of solving. The idea is to avoid 300 emails. Getting 2 or even 3 shouldn't be a problem.

    Update: The "race condition", in which some other process creates the new file for you a few milliseconds before you do, doesn't trigger extra emails.

    Nor does it delay their being sent at the appropriate time. The time window is probably less than the resolution of the file system timestamps. So: NO race condition!

    Very simple. Very effective. Perfection is the enemy of "good enough".

    (*) I.e. create a new file; fail if it already exists.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      If I knew how to do open(CREATE_NEW(*)) in Perl,

      Assuming NFS isn't involved, you could do that with sysopen (or IO::File):

      use Fcntl; # O_* constants
      sysopen(FH, $file, O_CREAT|O_EXCL);

          --k.


      Actually, the Linux kernel holds a mutex (an exclusive lock, not a shared one) against the directory during readdir (for example), so re-using the kernel's locking means that the processes must do their searching for a matching error file in single file. This is probably part of why directories containing a large number of files are notoriously slow on Unix. (And Perl's glob / readdir is notorious for at least sometimes being pathologically slow under Windows, but I don't know what locking Windows uses for directory operations.)

      And so the "no extra locking" claim is completely bogus. The directory approach costs "a ton of kernel mutex uses", not the "small number of kernel mutex uses to open a file and then get one shared lock".

      I don't see how you justify that the race doesn't result in extra e-mails. I believe your analysis is mistaken there. And you may not care about a few extra e-mails (or tons of extra e-mails when the directory gets bloated and pathologically slow to use) but the person asking the "avoiding a race" question probably does. I can certainly understand caring about my boss getting duplicate e-mails after he assigned me the task of making sure we don't get duplicate e-mails.

      But given 300 processes, I'd probably go with a separate, single process that de-dups errors rather than having 300 processes fighting over the list of errors (whether stored as lines in a file or files in a directory).
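A minimal sketch of that single de-duplicating process (the log file name, quiet period, and mailer hook are all illustrative, not from the thread): the 300 workers just append raw error lines to one log, and this one process owns de-duplication, so nobody races over the error list.

```perl
use strict;
use warnings;

# Sample log the workers would have appended to (illustrative data):
open my $out, '>', 'errors.log' or die "errors.log: $!";
print $out "unique key violation on UPC\n" for 1 .. 5;
print $out "deadlock on table GADGET\n";
close $out;

my %last_mailed;            # error string => epoch of last mail
my $quiet_period = 3600;    # one hour, per the OP

open my $log, '<', 'errors.log' or die "errors.log: $!";
my $mails = 0;
while ( my $line = <$log> ) {
    chomp $line;
    my $now = time;
    next if exists $last_mailed{$line}
        and $now - $last_mailed{$line} < $quiet_period;
    $last_mailed{$line} = $now;
    $mails++;               # stand-in for a hypothetical mail_admin($line)
}
close $log;

print "$mails\n";           # 2: the five duplicates collapse to one mail
unlink 'errors.log';
```

In production this loop would tail the log continuously; being the only reader, it needs no locking against its 300 writers beyond the kernel's own append semantics.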

      - tye        

        I don't see how you justify that the race doesn't result in extra e-mails. I believe your analysis is mistaken there.

        Hm. The window your "still racy" referred to was the time between a failed stat and the open. I.e.:

        open ERROR, '>', $file unless -e $file;
        close ERROR;

        And even in a full directory on a loaded system, that time is going to be measured (assuming you can actually measure it at all) in low milliseconds at most.

        Now, what does that actually mean?

        It means that one of the other processes encountered the same error as you, and succeeded in creating the error file within those few milliseconds. So what?

        You then immediately overwrote it with a later time-stamp. But still, so what?

        Nothing! Because the error file got created. Nothing is going to take any action, like sending emails, as a result of that file's creation for another hour. From the OP:

        if it has been encountered before and the time stamp is greater than an hour ago it will mail the admin

        So the very worst effect of some other process creating the file instead of you is that the sending of the email is delayed by the difference between the original time-stamp and the new one. And that's just a few milliseconds at most.
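The OP's rule quoted above can be sketched with a plain mtime check (the marker file name and the mailer hook are hypothetical, and the marker is created here just so the sketch runs standalone):

```perl
use strict;
use warnings;

my $file = 'ERROR.12345';    # hypothetical error-marker file
open my $fh, '>', $file or die "$file: $!";   # simulate an existing marker
close $fh;

my $hour = 3600;

# The OP's rule: if the marker exists and its timestamp is more than an
# hour old, mail the admin and refresh the timestamp.
my $age = time - ( stat $file )[9];           # [9] is mtime
if ( $age > $hour ) {
    # mail_admin(...);                        # hypothetical mailer hook
    utime undef, undef, $file;                # reset the one-hour clock
}
print $age > $hour ? "mailed\n" : "quiet\n";  # quiet: marker is fresh
unlink $file;
```

A racing writer who re-creates the marker a few milliseconds later only nudges the mtime forward by those few milliseconds, which is the delay the post above is dismissing.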

        #! perl -slw
        use Time::HiRes qw[ time ];
        use threads qw[ stack_size 4096 ];

        my $file = 'theFile';

        async{ 1 until -e $file; }->detach for 1 .. 300;
        sleep 3;

        my @times = time;
        unless( -e $file ) {
            push @times, time;
            open FILE, '>', $file or die "$file : $!";
            push @times, time;
            close FILE;
        }
        push @times, time;
        print for @times;
        unlink $file;
        __END__
        [20:51:05.40] C:\test>junk49
        1285789900.703
        1285789900.87509
        1285789901.02394
        1285789901.324

        With 300 clients stat-ing theFile (in a directory containing 1000 files), the window of opportunity for this irrelevant race condition is all of 300 milliseconds.

        And then only if the time-stamp resolution of the file-system is sufficient to actually discern the difference, which is unlikely.


