DBM problem

by frechettes (Initiate)
on Feb 15, 2001 at 19:06 UTC ( #58615=perlquestion )

frechettes has asked for the wisdom of the Perl Monks concerning the following question:

Each time I do a delete from a dbm, the size of the file doesn't change. This is very troublesome. I can read and write keys/vals from the script via a tied hash with no problem. It's after a delete of the keys that the file size is not reduced accordingly. When I do a cat on the file, I can verify that there are in fact no keys/vals after the delete (this is good). However, what is being left behind? What's in the file? Is it a permissions issue? Any help would be greatly appreciated to at least point me in the right direction.

Replies are listed 'Best First'.
Re: DBM problem
by chipmunk (Parson) on Feb 15, 2001 at 19:27 UTC
    All of your questions are answered in tied hashed and deleting keys and their valeus (sic), a recent thread on this behavior of DBM files.

    To summarize: It is an optimization that DBM files do not shrink when keys are deleted. Shuffling the bits around on disk every time a key is deleted would be too slow. Instead, the space is reused the next time a key is inserted. Read the aforementioned thread for more details.
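The behavior summarized above can be observed directly. What follows is a minimal sketch, assuming SDBM_File (bundled with Perl) as the DBM flavor and a throwaway temp directory; the database name `demo` and the helper `dbm_size` are made up for illustration, and other DBM libraries may lay out their files differently.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;                   # assumption: SDBM as the DBM flavor
use File::Temp qw(tempdir);

# Combined size in bytes of the two files SDBM keeps per database.
# dbm_size is a hypothetical helper, not part of any module.
sub dbm_size {
    my ($base) = @_;
    my $total = 0;
    for my $file (grep { -e } map { "$base$_" } qw(.dir .pag)) {
        $total += (stat $file)[7];
    }
    return $total;
}

my $dir  = tempdir(CLEANUP => 1);
my $base = "$dir/demo";          # hypothetical database name

tie my %db, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $base: $!";

# Fill the database, then measure it.
$db{"key$_"} = "x" x 100 for 1 .. 1000;
my $full = dbm_size($base);

# Delete everything and measure again: the files keep their size.
delete $db{"key$_"} for 1 .. 1000;
my $after_delete = dbm_size($base);

# Insert the same keys again and measure once more.
$db{"key$_"} = "x" x 100 for 1 .. 1000;
my $after_reinsert = dbm_size($base);

untie %db;

printf "full: %d  after delete: %d  after reinsert: %d\n",
    $full, $after_delete, $after_reinsert;
```

If the second number is no smaller than the first, the deleted keys have left free space behind in the file rather than giving it back to the filesystem.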

Re: DBM problem
by merlyn (Sage) on Feb 15, 2001 at 19:24 UTC
    DBMs don't, in general, shrink. It's very difficult to give "this block" back to the O/S... only the blocks at the end. (Very similar in fact to the allocation of memory in traditional Unix.)

    If you're concerned, you can simply create a new DBM. If you've got enough memory, pull the hash into memory, close the DBM, delete the files, and reopen it and re-store, like:

    dbmopen %FOO, "my_db", 0666 or die;
    # ... populate %FOO ...
    # ... delete some stuff from %FOO ...
    {
        my %TEMP = %FOO;          # cache it in memory
        dbmclose %FOO;
        unlink <my_db*>;          # danger, but general enough {grin}
        dbmopen %FOO, "my_db", 0666 or die;
        %FOO = %TEMP;
    }

    -- Randal L. Schwartz, Perl hacker

      If you're concerned about disc space, though, I'm going to assume that the files are very large, ergo, unlikely to fit in memory... you'd likely want to do something like:

      dbmopen %ORIG, "original", 0666 or die;
      dbmopen %NOVA, "new", 0666 or die;
      # -- race condition start
      for my $key (keys %ORIG) {
          $NOVA{$key} = $ORIG{$key};
      }
      dbmclose %ORIG;
      dbmclose %NOVA;
      unlink "original";
      link "new", "original";
      # race condition end
      unlink "new";

      So long as no-one else needs to get at your DBM while you're "shrinking" it, the above will work.

        But keys() will still build quite a large list. You may prefer while (my($k,$v) = each %hash) instead.
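A minimal sketch of the each-based copy, assuming SDBM_File so that the example is self-contained; `copy_pairs` and the sample data are made up for illustration, and the database names simply mirror the "original" and "new" used earlier in the thread.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # assumption: SDBM as the DBM flavor
use File::Temp qw(tempdir);

# Copy every pair from one (possibly tied) hash to another, one pair at a
# time. each() fetches a single pair per call, so the full key list is
# never built in memory the way keys() would build it.
# copy_pairs is a hypothetical helper name.
sub copy_pairs {
    my ($from, $to) = @_;
    my $count = 0;
    while (my ($key, $value) = each %$from) {
        $to->{$key} = $value;   # only the destination is modified
        $count++;
    }
    return $count;
}

my $dir = tempdir(CLEANUP => 1);
tie my %orig, 'SDBM_File', "$dir/original", O_RDWR | O_CREAT, 0666
    or die "Cannot tie original: $!";
tie my %nova, 'SDBM_File', "$dir/new", O_RDWR | O_CREAT, 0666
    or die "Cannot tie new: $!";

# Sample data, for illustration only.
$orig{apple}  = 1;
$orig{banana} = 2;
$orig{cherry} = 3;

my $copied = copy_pairs(\%orig, \%nova);
print "copied $copied pairs\n";

untie %orig;
untie %nova;
```

The source hash should not be added to or deleted from while the loop is running, since that can confuse the iterator that each() relies on.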

        japhy -- Perl and Regex Hacker

        That won't work unless you know how to turn "new" into the actual list of files involved for that database. Perhaps "new.db", or even "new.dir" and "".

        -- Randal L. Schwartz, Perl hacker

      I can't quote an exact answer or URL to give you, but I experienced the same thing when I started using hash databases for tasks. This is the PROPER behavior for a DBM, so if you're worried about disk usage, go back to a flat text file or move to something more robust like MySQL.
Re: DBM problem
by dws (Chancellor) on Feb 15, 2001 at 23:43 UTC
    If you need your DBM to shrink, you'll need to recreate it. Write a script that pulls the (key,value) pairs out of the current DBM and inserts them into a fresh one. Invoke the script periodically, with appropriate locking.

    I used this technique on a low-traffic Wiki clone. A once-a-week scheduled rebuild usually shrank the DBM by 60%.
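A periodic rebuild along these lines might look like the sketch below. It is not the script used on that Wiki; it assumes SDBM_File, whose databases always consist of a `.dir` and a `.pag` file, which sidesteps the question of which files belong to the database. The names `rebuild_dbm`, `.rebuild`, and `.lock` are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);   # O_* and LOCK_* constants
use SDBM_File;                   # assumption: SDBM as the DBM flavor

# Rebuild the SDBM database at $base into a fresh copy and swap it in.
# Returns the number of pairs copied. rebuild_dbm is a hypothetical name.
sub rebuild_dbm {
    my ($base) = @_;
    my $tmp = "$base.rebuild";   # hypothetical scratch database name

    # Advisory lock on a separate file. It only protects the database if
    # every other program that touches it takes the same lock first.
    open my $lock, '>>', "$base.lock"
        or die "Cannot open $base.lock: $!";
    flock($lock, LOCK_EX)
        or die "Cannot lock $base.lock: $!";

    # Clear leftovers from an earlier run that died halfway.
    unlink "$tmp.dir", "$tmp.pag";

    tie my %old, 'SDBM_File', $base, O_RDONLY, 0666
        or die "Cannot tie $base: $!";
    tie my %new, 'SDBM_File', $tmp, O_RDWR | O_CREAT, 0666
        or die "Cannot tie $tmp: $!";

    # Copy pair by pair so the key list is never held in memory.
    my $count = 0;
    while (my ($key, $value) = each %old) {
        $new{$key} = $value;
        $count++;
    }

    untie %old;
    untie %new;

    # Swap the fresh files into place. The two renames are not atomic as
    # a pair, which is why the lock is held until both are done.
    for my $ext (qw(.dir .pag)) {
        rename "$tmp$ext", "$base$ext"
            or die "Cannot rename $tmp$ext to $base$ext: $!";
    }

    close $lock;                 # closing the handle releases the lock
    return $count;
}
```

Calling `rebuild_dbm("/path/to/my_db")` from a scheduled job would then correspond to the once-a-week rebuild described above; the path here is only a placeholder.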
