Dear Monks,
I have a program that's meant to stress the BerkeleyDB::Hash module to see how well it can perform with multiple, concurrent readers/writers. I am new to Perl _and_ BerkeleyDB, so go easy on me ;-)
Running the program below mostly yields the following errors:
db_put failed: DB_RUNRECOVERY: Fatal error, run database recovery ( /
+Successful return: 0)
main::spawn_child('BerkeleyDB::Env=ARRAY(0x82e5248)','BerkeleyDB::
+Hash=ARRAY(0x80639b4)') called at douji.pl line 115
main::spawn_child('BerkeleyDB::Env=ARRAY(0x82e5248)','BerkeleyDB::
+Hash=ARRAY(0x80639b4)') called at douji.pl line 115
One instance of the above for each child process. Sometimes I get some errors like this, too:
db_put failed: Cannot allocate memory (Cannot allocate memory / reallo
+c: Cannot allocate memory: 4294967190)
main::spawn_child('BerkeleyDB::Env=ARRAY(0x82e5248)','Berkeley
+DB::Hash=ARRAY(0x80639b4)') called at douji.pl line 115
Running db_recover does not allow it to then finish correctly the next time, however.
Does anyone have any ideas what I'm doing wrong? I was under the impression that using DB_INIT_CDB was enough to protect against concurrent writes, but do I need transactions instead? I tried using BerkeleyDB's transaction stuff earlier and that failed with the same error messages.
TIA :)
P.S. I am using BerkeleyDB CPAN module version 0.28 on RedHat 9 using the db4 4.0.14 packages.
#!/usr/bin/env perl
# vim:set ts=4 sw=4 ai:
use strict;
use warnings;
use Carp;
use POSIX ":sys_wait_h";
use Data::Dumper;
use lib '/junk/lib';
use BerkeleyDB;
use BerkeleyDB::Hash;
use Time::HiRes qw(gettimeofday tv_interval);
use constant RUNS => 1000000;
use constant DBPATH => '/junk/test.db';
use constant NUM_CHILDREN => 10;
$|++;
# print backtrace info
sub bt { croak "DIED ($.)\n"; }
# generate random integer
sub rand_num { 4278190080 + int(rand(1048576)); }
# read an entry from a BDB
sub read_db {
my ($db, $key) = @_;
my $val;
$db->db_get($key, $val);
return (defined($val)) ? eval($val) : {};
}
# increment an entry in a BDB, inserting if not there
sub write_db {
my ($db, $key, $stuff) = @_;
my $val = read_db($db, $key);
$val->{timestamp} = time;
$val->{events}++;
$val->{stuff}++ if ($stuff);
my $str = Dumper(\$val) or croak "failed to serialize row data: $!
+\n";
return $db->db_put($key, $str);
}
# fork off a worker process that will mutate the BDB independently of
# the others
sub spawn_child {
my ($env, $db) = @_;
my $pid = fork;
croak "Could not fork: '$!'\n" unless (defined($pid));
return unless ($pid == 0);
# child process starts here
my ($start, $finish, $key, $ip, $rv, $val, $reads, $writes, $delet
+es);
print "Staring worker $$...\n";
# perform a number of operations using BDB transactions
$start = [ gettimeofday ];
for (1..RUNS) {
$key = int(rand(101));
$ip = rand_num();
if ($key <= 49) {
# do random update/insert
$rv = write_db($db, $ip, ($key % 2 == 0));
croak "db_put failed: $rv ($! / $BerkeleyDB::Error)\n"
unless ($rv == 0);
$reads++; $writes++;
} elsif ($key <= 98) {
# do random read
read_db($db, $ip); $reads++;
} else {
# do random delete
$db->db_del($key); $deletes++;
}
}
$finish = [ gettimeofday ];
# print summary info for timing
my $sum = ($reads || 0) + ($writes || 0) + ($deletes || 0);
my $elapsed = tv_interval($start, $finish);
printf "$$: %d reads, %d writes, %d deletes (%d total) in %0.2f se
+conds: %0.2f ops/sec\n",
($reads || 0),
($writes || 0),
($deletes || 0),
$sum,
$elapsed,
$sum / $elapsed;
exit 0;
}
# install signal handlers
$SIG{__WARN__} = \&bt;
# create BDB environment so we can enable transactions
my $env = new BerkeleyDB::Env(
-Cachesize => 4 * 1024 * 1024,
-Flags => (DB_CREATE() |
DB_INIT_CDB() |
DB_INIT_MPOOL()),
-SetFlags => DB_CDB_ALLDB())
or croak "Failed to create DB env: '$!' ($BerkeleyDB::Error)\n";
# open database
my $db = new BerkeleyDB::Hash(
-Filename => DBPATH,
-Env => $env,
-Flags => DB_CREATE())
or croak "Failed to open DBPATH: '$!' ($BerkeleyDB::Error)\n";
# spawn concurrent DB mutators and just wait for them to finish
for (1..NUM_CHILDREN) { spawn_child($env, $db); }
for (1..NUM_CHILDREN) { my $pid = wait; }