PerlMonks  

Re: avoiding a race

by BrowserUk (Pope)
on Sep 28, 2010 at 16:10 UTC (#862459)


in reply to avoiding a race

A simpler, non-locking mechanism would be:

  1. You receive error 123. You stat for a file named ERROR.123.
  2. If the file doesn't exist you create it (empty), and move on.
  3. If the file does exist, you check the time stamp.
    1. If it is older than 1 hour: you delete the file; send an email; then move on.
    2. If it is less than 1 hour old, you just move on.
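The steps above could be sketched roughly as follows. This is only an illustration: the `note_error` helper, its return values, the marker directory and the `$send_email` callback are all hypothetical names, not part of any existing code. It follows the steps literally, so the stat and the create are still two separate calls. It needs Perl 5.10 or later for `//=`.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 'created', 'sent' or 'skipped'.
# $dir, $code and the $send_email callback are assumptions for illustration.
sub note_error {
    my ($dir, $code, $send_email, $max_age) = @_;
    $max_age //= 3600;                  # 1 hour, in seconds
    my $marker = "$dir/ERROR.$code";

    my @st = stat $marker;              # empty list if the file is missing
    if (!@st) {
        # Step 2: no marker yet; create it (empty) and move on.
        open my $fh, '>', $marker or die "Cannot create $marker: $!";
        close $fh;
        return 'created';
    }
    if (time - $st[9] > $max_age) {     # $st[9] is the mtime
        # Step 3.1: marker is older than the window; delete, email, move on.
        unlink $marker or die "Cannot delete $marker: $!";
        $send_email->($code);
        return 'sent';
    }
    return 'skipped';                   # Step 3.2: recent marker, do nothing
}
```

A caller might use it as `note_error('/var/tmp/errors', 123, sub { ... })`, where the directory and the mail routine are whatever suits your setup.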

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: avoiding a race (does lock, still racy)
by tye (Cardinal) on Sep 28, 2010 at 16:31 UTC

    That is not a non-locking mechanism. It just hands off the locking to the kernel which locks the directory when it reads from it or writes to it. It has the advantage of the kernel locking implementation being very well tested.

    Of course, errors might not all have such nice, unique, numeric identifiers so the files might have to be named more like "ERROR.Error inserting record into table WIDGET, unique key violation on column UPC". And even that won't work if the comparison for "same error" isn't easily reduced to "string equality".

    But, most importantly, your solution (as described) has a race condition between stat and creating a file. You can probably fix that a couple of different ways.

    - tye        

      That is not a non-locking mechanism. It just hands off the locking to the kernel which locks the directory when it reads from it or writes to it. It has the advantage of the kernel locking implementation being very well tested.

      The kernel is going to do its locking whatever file operations you do. Re-using it is good.

      So, I guess you could call it a "no-extra, no-effort (or risk of getting it wrong)" locking mechanism.

      Of course, errors might not all have such nice, unique, numeric identifiers ...

      If you can't reduce the errors to something easily comparable in the filesystem, you'll have similar problems locating similar errors in the file itself. And globbing is capable of much more than just "string equality".

      But, most importantly, your solution (as described) has a race condition between stat and creating a file. You can probably fix that a couple of different ways.

      If I knew how to do open(CREATE_NEW(*)) in Perl, I would suggest that. If the open() fails, it must have 'just' been created, so there's nothing else to do, so you just move on anyway.

      But realistically, it's probably a "problem" not worth the effort of solving. The idea is to avoid 300 emails. Getting 2 or even 3 shouldn't be a problem.

      Update: The "race condition" (whether this process creates the new file, or some other process does it for you a few milliseconds before you do) doesn't trigger extra emails.

      Nor does it delay their being sent at the appropriate time. The time window is probably less than the resolution of the file system timestamps. So, NO race condition!

      Very simple. Very effective. Perfection is the enemy of "good enough".

      (*)Ie. Create a new file; fail if it already exists.


        If I knew how to do open(CREATE_NEW(*)) in Perl,

        Assuming NFS isn't involved, you could do that with sysopen (or IO::File):-

        use Fcntl; # O_* constants
        sysopen(FH, $file, O_CREAT|O_EXCL);
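A slightly fuller sketch of the same idea, with the failure case checked. The `create_new` name and its return convention are made up for illustration. It treats "file already exists" as the expected, harmless outcome and dies on any other error. The NFS caveat above still applies.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use Errno qw(EEXIST);

# Hypothetical helper: returns 1 if this process created $file,
# 0 if the file was already there (someone else got in first).
sub create_new {
    my ($file) = @_;
    # O_CREAT|O_EXCL asks for "create, and fail if it already exists"
    # as a single operation, so there is no separate existence check.
    if (sysopen(my $fh, $file, O_WRONLY | O_CREAT | O_EXCL)) {
        close $fh;
        return 1;
    }
    return 0 if $! == EEXIST;
    die "Cannot create $file: $!";
}
```

With this, a return of 0 means the marker must have 'just' been created by another process, so the caller simply moves on.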

            --k.


        Actually, the Linux kernel holds a mutex (exclusive not shared lock) against the directory during readdir (for example) so re-using the kernel locking means that the processes must do their searching for a matching error (file) in single-file. This is probably part of why directories containing a large number of files are notoriously extremely slow in Unix. (And Perl's glob / readdir is notorious for at least sometimes being pathologically slow under Windows but I don't know what locking Windows uses for directory operations.)

        And so the "no extra locking" claim is completely bogus. The extra locking case is "a ton of kernel mutex uses" not the "small number of kernel mutex uses to open a file and then get one shared lock".

        I don't see how you justify that the race doesn't result in extra e-mails. I believe your analysis is mistaken there. And you may not care about a few extra e-mails (or tons of extra e-mails when the directory gets bloated and pathologically slow to use) but the person asking the "avoiding a race" question probably does. I can certainly understand caring about my boss getting duplicate e-mails after he assigned me the task of making sure we don't get duplicate e-mails.

        But given 300 processes, I'd probably go with a separate, single process that de-dups errors rather than having 300 processes fighting over the list of errors (whether stored as lines in a file or files in a directory).
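The core of such a single de-dup process could be as small as this. It is only a sketch under assumptions: `should_send`, the `%last_sent` table and the one-hour window are hypothetical, and how the 300 processes hand their errors to this one (pipe, socket, queue) is left open. Note that it mails the first occurrence of an error and then suppresses repeats inside the window, which is a policy choice of this sketch.

```perl
use strict;
use warnings;

# Maps an error key to the epoch time of the last email sent for it.
# Lives in one process only, so no locking is needed around it.
my %last_sent;

# Hypothetical helper: returns 1 if an email should go out for $key
# at time $now, 0 if one was already sent within $window seconds.
sub should_send {
    my ($key, $now, $window) = @_;
    $window //= 3600;                   # assumed 1-hour window
    my $prev = $last_sent{$key};
    return 0 if defined $prev && $now - $prev < $window;
    $last_sent{$key} = $now;
    return 1;
}
```

Because the key is just a hash key, "same error" can be whatever normalised string the collector decides on.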

        - tye        
