open for append

perldoc -f flock recommends (indirectly, in its code example) that files that are opened-for-append should be flock()ed before being written to. A few years ago, I was horrified to see a peer writing to a log file (which was opened for append), without bothering to lock it first. It was especially troubling, knowing that the log file in question was to be later used to audit the effectiveness of the program.

Strangely, my fellow programmer was not moved. I tried writing some test code to demonstrate the need for concern. I wasn't able to simulate a problem. I wrote to comp.lang.perl.misc, to seek advice. I learned that the POSIX C library guarantees that, when multiple processes are writing to the same file, and all of them have the file opened for appending, data shall not be overwritten.

Recently I saw Stas Bekman give his mod_perl 2.0 by example talk at SPUG. I saw a similar code example in one of his slides, flock()ing a filehandle before writing to it, when it had been opened for append.

This got me thinking again about it, about about how Perl interacts with libc. I read in the SuSE Linux man page for open(2) that trying this trick on an NFS mounted file system may lead to corruption. Fare enough.

I wrote a test program that takes three arguments, the total number of children to run, the number of lines each child should write to the logfile, and the number of children to run concurrently. It forks a bunch of children, which open a log file for appending, and write a bunch of lines to it. Then the parent reaps the kids, and counts what it finds in the log file. There are three simple tests: that the number of lines is what's expected, that the number of bytes is what's expected, and that the lines each have the expected number of bytes on them.

Under SuSE linux, one cpu (but a multithreaded pentium4, for whatever that's worth), I could make the last test fail, if I left Perl up to its usual IO buffering tricks. But if I turned on autoflush, then I could not make that test fail. I wonder if that test will fail on multi cpu systems. I'll give it a try tomorrow.

I'm curious to see results from other systems. I wonder what other sorts of things might cause corruption, even if it's just messing up the lines. I wonder how does this works on other OSs, like Windows or Cygwin.

It's also quite interesting to compare the execution time for different numbers of concurrent children, and the difference in speed behavior when the children have autoflush turned on or not.

-Colin.

Update: changed "data shall not be lost" to "data shall not be overwritten".

#!/usr/local/bin/perl

use strict;
use warnings;
use POSIX;
use Test::More tests => 3;


#---------------------------
# behavioral parameters and globals


use constant DEBUG              => 0;


use constant FILE_NAME          => '/tmp/perl_append_test';
use constant CHILD_REPORT_POINT => 100;
use constant LOG_FORMAT         => "[%6d] %6d %6d\n";


# try changing this to zero to see line size discrepansies
use constant FLUSH_BUFFER => 1;


my $NUM_CHILDREN = shift || 801;
my $NUM_LINES    = shift || 511;
my $MAX_CHILDREN = shift || 100;
my %kids;
my $child_count = 0;

#---------------------------
$|++;

unlink FILE_NAME;

print "# children: $NUM_CHILDREN\n" if DEBUG;
print "# lines:    $NUM_LINES\n\n" if DEBUG;

print "# spawning ... \n" if DEBUG;
print "# parent pid: $$\n" if DEBUG;

spawn_kids();

clean_up_after_kids();


#---------------------------
# make a report
#---------------------------

my $expected_line_length = length( sprintf( LOG_FORMAT, 0, 0, 0 ) );

print "\n# counting money ...\n\n" if DEBUG;
my ($lines, $bytes);
my $discrepansies = 0;

open my $log, '<', FILE_NAME
  or die "$$ couldn't open log: $!";

while ( <$log> ) {
  $lines++;
  $bytes += my $l = length;
  $discrepansies++ if $l != $expected_line_length;
}

close $log;

is( $lines, $NUM_CHILDREN * $NUM_LINES,
    'number of lines in log matches expectation' );
is( $bytes, $NUM_CHILDREN * $NUM_LINES * $expected_line_length,
    'bytes in logfile matches expectation' );
is( $discrepansies, 0, 'no line size discrepansies' );

if ( DEBUG ) {
  print "\n";
  print "# lines: $lines\n";
  print "# bytes: $bytes\n";
  print "# discrepansies: $discrepansies\n";
  print "\n";
}

exit;

#
#-----------------------------------------------------------

sub spawn_kids {
  for my $child_num ( 1..$NUM_CHILDREN ) {
    ++$child_count;

    print "\r# concurrent children: $child_count        " if DEBUG;
    print "\n# child count: $child_num\n"
      if !($child_num % CHILD_REPORT_POINT) && DEBUG;

    reap_some();
    while ( $child_count >= $MAX_CHILDREN ) {
      reap_some();
      sleep 1
    }

    my $p = fork();
    die "couldn't fork: $!" unless defined $p;
    if ( $p ) {
      $kids{ $p }++;
    }
    else {
      # I'm the child
      append_like_hell( $child_num );
      exit;
    }
  }
  print "\n" if DEBUG;
}
sub clean_up_after_kids {
  print "# reaping ... \n" if DEBUG;

  while ( keys %kids ) {
    reap_some();
    print "\r# concurrent children: $child_count        " if DEBUG;
    sleep 1;
  }
  print "\n" if DEBUG;
}


#---------------------------
# find dead kids
#---------------------------
sub reap_some {
  while(( my $pid = waitpid(-1, POSIX::WNOHANG)) > 0) {
    --$child_count;
    delete $kids{ $pid };
  }
}


#---------------------------
# this is what the kids do
#---------------------------
sub append_like_hell {
  my $id = shift;
  open my $log, '>>', FILE_NAME
    or die "[$id] $$ couldn't open log: $!";

  select( ( select( $log ), $|++ )[0] ) if FLUSH_BUFFER;

  for ( 1..$NUM_LINES ) {
    print $log sprintf(LOG_FORMAT, $id, $$, $_)
      or die "[$id] $$ couldn't print line $_: $!";
  }
  close $log or die "[$id] $$ couldn't close log: $!";
}
[download]

WHITEPAGES.COM | INC

Back to Meditations