Re: Strange IO + concurrency issue

by davido (Archbishop)
on Sep 28, 2013 at 16:39 UTC


in reply to [SOLVED] Strange IO + concurrency issue

Look more closely at how you're obtaining a lock:

  1. You obtain a lock on a semaphore file (lock.tmp).
  2. Once that lock is obtained, you open an output file (somefile$_.tmp).
  3. You unlock/close your semaphore file.
  4. You write to somefile$_.tmp.
  5. You close somefile$_.tmp.
  6. You copy from your tmp file to a new file.

Steps 4, 5, and 6 are unprotected.

You should not release your semaphore-file lock until after you're done writing to somefile$_.tmp, and in fact probably not until after the copy operation. You also probably shouldn't use LOCK_UN as a matter of habit: simply closing a filehandle releases the lock, and explicitly unlocking before closing can create a race condition (though not in this usage).
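
A minimal sketch of that ordering, using flock on the same lock.tmp semaphore file and File::Copy for the copy step; $n and $data are placeholders for this sketch, not variables from the original post:

    use Fcntl qw(LOCK_EX);
    use File::Copy qw(copy);

    my ($n, $data) = (1, 'x' x 100);    # placeholders for this sketch

    open my $lock, '>', 'lock.tmp'      or die "open lock.tmp: $!";
    flock $lock, LOCK_EX                or die "flock: $!";       # step 1

    open my $out, '>', "somefile$n.tmp" or die "open: $!";        # step 2
    print {$out} $data;                                           # step 4
    close $out                          or die "close: $!";       # step 5

    copy("somefile$n.tmp", "newfile$n.txt") or die "copy: $!";    # step 6

    close $lock;   # step 3 moved to the end; closing the handle releases the lock

With this ordering, any process that acquires the lock sees the previous holder's output file complete, for both the write and the copy.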

You may already know about this set of slides from Mark Jason Dominus, but just in case: File Locking Tricks and Traps.


Dave


Re^2: Strange IO + concurrency issue
by vsespb (Hermit) on Sep 28, 2013 at 17:11 UTC
    1. You obtain a lock on a semaphore file (lock.tmp).
    2. Once that lock is obtained, you open an output file (somefile$_.tmp).
    3. You unlock/close your semaphore file.
    4. You write to somefile$_.tmp.
    5. You close somefile$_.tmp.
    6. You copy from your tmp file to a new file.
    Steps 4, 5, and 6 are unprotected.
    I knew that if the lock was extended over the other steps it worked, but I could not understand why.

    The thing is: when I open the file in step (2), I write to it in step (4). But steps (4) and (5) are actually protected: they are performed only if the process was the one that created the file; otherwise they are skipped. (This was intentional; I was trying to minimize lock time, which was important.)

    Notice "if ($f)"
    if ($f) { print ($f "x") for (1..40_000_000); close $f; }
    The problem was that step (6) was unprotected. And this assertion was wrong:
    die if -s $filename != -s $newfilename;
    The correct assertion would be:
    die if 40_000_000 != -s $newfilename;
    i.e., at the moment the data is copied, the source file can still be in the middle of being written by another process. So in the end I see:
    a correct size for $filename
    a wrong size for $newfilename
    and yet the assertion passed, because at the time -s $filename != -s $newfilename was checked, both files had the same partial size
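
    To make that failure mode concrete, here is a standalone sketch of the race (my own illustration with made-up filenames, not the original code): one process writes slowly while another copies mid-write and runs the size comparison.

    use strict;
    use warnings;
    use File::Copy qw(copy);
    use Time::HiRes qw(sleep);

    my ($src, $dst) = ('race_src.tmp', 'race_dst.tmp');   # made-up names

    if (my $pid = fork()) {
        # Writer: emit 1_000_000 bytes slowly, so the copier catches it mid-write.
        open my $f, '>', $src or die $!;
        for (1 .. 1000) {
            print {$f} 'x' x 1000;
            sleep 0.001;
        }
        close $f;
        waitpid $pid, 0;
    }
    else {
        sleep 0.2;                        # let the writer get partway through
        copy($src, $dst) or die $!;
        # The flawed check: it can pass here, because source and copy
        # currently have the same partial size...
        print "sizes equal right after copy\n" if -s $src == -s $dst;
        sleep 2;                          # ...but once the writer finishes:
        printf "final sizes: src=%d dst=%d\n", -s $src, -s $dst;
        exit 0;
    }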

    Thank you! I'll mark the post as SOLVED.
      Notice "if ($f)"

      I suspect this test does not do what you think it does. In particular, consider the case that one process opens the file and writes to it, then another process opens the same file and, perhaps, writes to it, then the if ($f) test executes in the first process. Is the result any different because of what the other process did? I think you will find it is not and, therefore, that the statements in the if block are not as protected as you think they are.

      Whether your problem is solved depends on what you are trying to do, which you don't say, but what you are calling a solution seems strange to me. I suggest you reconsider.

      Running your program on Windows several times, on each trial I get a variable subset of the 25 potential output files with names beginning with 'c', and several of them have fewer than 40_000_000 characters in them.

      1. file size of some c_* files is less than 40000000, however this should not happen, because of line: die if -s $filename != -s $newfilename;

      The die will not prevent the production of files with less than 40_000_000 characters because the test is performed after the copy, not before. Even if you moved the test before the copy, because these statements are not protected by the lock file, the test could pass but another process could modify the source file before the copy executes, resulting in a file of different length (and, perhaps, content if in your real program the different processes write different content).

      Changing the test on the die to 40_000_000 != -s $newfilename makes little difference to the outcome: there is still a variable subset of the possible 25 copied files and there are still several of them with fewer than 40_000_000 characters. Of course, it makes a difference as, when the test runs, the size of $filename might be other than 40_000_000. This will be somewhat random, depending on how execution of the various processes is interleaved. But, perhaps this is exactly as you intend and I am worrying for nothing.

        I suspect this test does not do what you think it does. In particular, consider the case that one process opens the file and writes to it, then another process opens the same file and, perhaps, writes to it, then the if ($f) test executes in the first process. Is the result any different because of what the other process did? I think you will find it is not and, therefore, that the statements in the if block are not as protected as you think they are.
        Notice also "my $f = undef" and "unless (-e $filename)"
        my $f = undef;
        getlock sub {
            unless (-e $filename) {
                open ($f, ">", $filename) or confess;
                binmode $f;
            }
        };
        if ($f) {
        (I admit that this code is unclear.)
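
        For reference, getlock itself is not shown anywhere in the thread; a minimal flock-based implementation consistent with how it is called might look like this (a guess, not the actual code):

        use Fcntl qw(LOCK_EX);

        # Guessed reconstruction: take an exclusive lock on the lock.tmp
        # semaphore file, run the callback, then release the lock by
        # closing the handle.
        sub getlock {
            my ($code) = @_;
            open my $lock, '>', 'lock.tmp' or die "open lock.tmp: $!";
            flock $lock, LOCK_EX or die "flock: $!";
            $code->();
            close $lock;    # closing the handle releases the lock
        }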
        Whether your problem is solved depends on what you are trying to do, which you don't say, but what you are calling a solution seems strange to me.
        That was proof-of-concept code, and it contained a bug. The bug was found, thus it is solved.
        what you are trying to do, which you don't say
        This was proof-of-concept code, i.e. I simplified my 1000-line program down to this. What I am actually trying to do is behind the scenes; we should focus only on the technical part of the problem, not the business requirements.
        If I posted my original code, it would take anyone trying to reproduce the problem at least a couple of hours to set things up.
        the test could pass but another process could modify the source file before the copy executes
        Yes, but, by the way, there are no concurrent writes to the same file.
        Changing the test on the die to 40_000_000 != -s $newfilename makes little difference to the outcome: there is still a variable subset of the possible 25 copied files and there are still several of them with fewer than 40_000_000 characters.
        Well, it indeed does not make the program output correct files, but it causes some processes to die, thus catching the bug. See the assertions.

        The actual fix is to extend the lock over the whole program, until the copy is done (as other posters suggested), or even just to the point where the write to the source file is done and the file is closed.
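
        A sketch of that fix, with the whole sequence held under the lock; it reuses the guessed getlock wrapper above and the thread's $filename/$newfilename, with die standing in for the original assertions:

        use File::Copy qw(copy);

        getlock sub {
            unless (-e $filename) {
                open my $f, '>', $filename or die "open: $!";
                binmode $f;
                print {$f} 'x' for 1 .. 40_000_000;
                close $f or die "close: $!";
            }
            copy($filename, $newfilename) or die "copy: $!";
            die "short copy" if 40_000_000 != -s $newfilename;
        };

        Because the creator finishes writing before releasing the lock, a process that did not create the file can never copy it mid-write.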
