
Re: Safe to open+close for every write, without locking?

by graff (Chancellor)
on Dec 21, 2012 at 06:20 UTC


in reply to Safe to open+close for every write, without locking?

If I understand the OP code correctly, the lines being written to the one output file are all quite short: just two numbers, for a maximum of 10 characters per line counting the final line feed (11 characters if you're on a CRLF system).

The problem I would want to check for is whether multiple competing processes, writing to the same output file, might interrupt each other if one or more of them were trying to write relatively long lines. I've seen this happen, and it makes the resulting file incomprehensible and unparsable.

I'd be inclined to go with something like BrowserUK's suggestion, but if you want to pursue the OP strategy, you should test again by writing, say, 130 characters or more per line; follow the same tactic of starting each line with a token that is different for each process, and see whether you get the expected number of lines starting with those tokens, as opposed to things like this:

PROC.1 This line is being written by process #1.
PROC.2 This line is being writtPROC.1 This line is being written by process #1.
en by process #2.
PROC.3 This line is beinPROC.4 This line is being written by process #4.
g written by pPROC.1 This line is being written by process #1.
rocess #3.
(... and so on) I'm using fewer than 130 characters per line there, but I hope you get what I'm talking about.
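The check described above can be sketched in Perl. This is only an illustration: the helper name check_lines and the exact line format are assumptions taken from the sample output above, not from the OP's code. A line counts as intact only if it consists of exactly one token followed by the expected text; everything else is reported as mangled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: classify each line of a shared log file.
# $expected_re must capture the per-process token in $1.
sub check_lines {
    my ($expected_re, @lines) = @_;
    my (%intact, @mangled);
    for my $line (@lines) {
        chomp $line;
        if ($line =~ $expected_re) {
            $intact{$1}++;          # one more whole line for this token
        } else {
            push @mangled, $line;   # interleaved or truncated output
        }
    }
    return (\%intact, \@mangled);
}

# Assumed line format, copied from the sample above:
# "PROC.<n> This line is being written by process #<n>."
my $re = qr/\A(PROC\.(\d+)) This line is being written by process #\2\.\z/;

# Run against a log file only when a path is given on the command line.
if (@ARGV) {
    my $path = shift @ARGV;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my ($intact, $mangled) = check_lines($re, <$fh>);
    close $fh;
    printf "%s: %d intact lines\n", $_, $intact->{$_}
        for sort keys %$intact;
    printf "%d mangled lines\n", scalar @$mangled;
}
```

If every process writes the same number of lines, each token should show that count and the mangled total should be zero.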

Replies are listed 'Best First'.
Re^2: Safe to open+close for every write, without locking?
by sedusedan (Monk) on Dec 21, 2012 at 08:06 UTC

    I don't see how writing 10-11 bytes twelve times won't cause roughly as much clobbering as writing 130 bytes once, but I did the test anyway.

    Here is a new script, write2.pl, that covers several mechanisms:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use autodie;
    use Bench;
    use Fcntl qw(:flock);

    unless (@ARGV == 3) {
        die "Usage: $0 <method> <path> <str>\n";
    }
    my ($method, $file, $str) = @ARGV;

    bench sub {
        if ($method eq 'print') {
            open my($fh), ">>", $file;
            for (1..100_000) {
                print $fh "$str: $_\n";
            }
            close $fh;
        } elsif ($method eq 'open+print+close') {
            for (1..100_000) {
                open my($fh), ">>", $file;
                print $fh "$str: $_\n";
                close $fh;
            }
        } elsif ($method eq 'seek+print') {
            open my($fh), ">>", $file;
            for (1..100_000) {
                seek $fh, 0, 2;
                print $fh "$str: $_\n";
            }
            close $fh;
        } elsif ($method eq 'flock+print') {
            open my($fh), ">>", $file;
            for (1..100_000) {
                flock $fh, LOCK_EX;
                print $fh "$str: $_\n";
                flock $fh, LOCK_UN;
            }
            close $fh;
        }
    };

    Here's what the lines that get written should look like for process number 1:

    0010010010010010010010010010010010010010010010010010010010010010010010 +01001001001001001001001001001001001001001001001001001001001001: 1 0010010010010010010010010010010010010010010010010010010010010010010010 +01001001001001001001001001001001001001001001001001001001001001: 2 ...

    For process number 2:

    0020020020020020020020020020020020020020020020020020020020020020020020 +02002002002002002002002002002002002002002002002002002002002002: 1 0020020020020020020020020020020020020020020020020020020020020020020020 +02002002002002002002002002002002002002002002002002002002002002: 2 ...

    $ for i in `seq -f "%03g" 1 100`;do ( ./write2.pl print log $i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$$i$i$i$i & );done

    Clobbering. Each process finishes in roughly 7-9 seconds.

    $ for i in `seq -f "%03g" 1 100`;do ( ./write2.pl open+print+close log $i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$$i$i$i$i & );done

    No clobbering. Each process finishes in roughly 75 seconds.

    $ for i in `seq -f "%03g" 1 100`;do ( ./write2.pl seek+print log $i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$$i$i$i$i & );done

    No clobbering. Each process finishes in roughly 30 seconds.

    $ for i in `seq -f "%03g" 1 100`;do ( ./write2.pl flock+print log $i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$$i$i$i$i & );done

    No clobbering. Each process finishes in roughly 450 seconds.

    I'm guessing that flock+print is the safest and most portable method, but its performance gets worse as the number of concurrent writers grows. For now I'm leaning towards the safest method, but will look into other possibilities in the future. What I would like to know is how safe the other methods (open+print+close and seek+print) are in avoiding clobbering.
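One possible reading of these results, offered as an assumption rather than something established in this thread: plain buffered print hands data to the operating system whenever the buffer fills, so a single write can end in the middle of a line, while the open+print+close and seek+print variants likely flush after every line. If that is right, what matters is that each complete line reaches the file in one write on a handle opened for append. The sketch below makes that explicit with sysopen and syswrite. The helper names open_append and write_line are hypothetical, and the approach assumes a local POSIX filesystem; it has not been tested here against network filesystems such as NFS, where append semantics may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);

# Hypothetical helper: open a file so that every write goes to the
# current end of file (O_APPEND), bypassing Perl's output buffering.
sub open_append {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_APPEND | O_CREAT, 0644)
        or die "Cannot open $path: $!";
    return $fh;
}

# Hypothetical helper: hand one complete line to the OS in a single
# syswrite call. Assumes $line is a byte string (no wide characters).
sub write_line {
    my ($fh, $line) = @_;
    my $written = syswrite($fh, $line);
    die "Write failed: $!" unless defined $written;
    die "Short write: $written of " . length($line) . " bytes"
        unless $written == length $line;
    return $written;
}

# Usage, mirroring write2.pl's arguments, when a path and a string
# are given on the command line.
if (@ARGV == 2) {
    my ($file, $str) = @ARGV;
    my $fh = open_append($file);
    write_line($fh, "$str: $_\n") for 1 .. 100_000;
    close $fh or die "Cannot close $file: $!";
}
```

This is not a replacement for flock when portability matters. It only isolates the one-line-per-write property so that it can be tested with the same 100-process loop used above.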
