How's MLDBM with concurrency?

I'm using MLDBM to store the results from a web crawler bot. These are all HoH (hash of hashes) values, not too complicated. I'm fairly certain that if I run just one bot, all should be well, though I'm watching out for the data-structure size restriction mentioned in the BUGS section of the MLDBM docs and hoping it won't be a problem.

But what happens if I run multiple crawlers, all writing their data to the same file? Assuming no data conflicts -- such as updating the same hash key with different data -- am I going to be okay?

If there is a data conflict, does MLDBM tell you that there was a conflict, or does it just update with the first value and then the second, or what?

Data-structure-wise, I don't think I need an RDBMS. But concurrency-wise?

If it seems MLDBM is going to give me trouble, can someone suggest an alternative that handles concurrency with hash serialization better?

UPDATE: Followup-ish post at Can I serialize an object and have it remember it's type?

    As I understand it, MLDBM is just a way of invoking a serializer; it has nothing to do with concurrency. Using flock() or some other locking scheme on the file seems to be the way to go. One option is DBD::DBM (part of the DBI distribution), which provides a DBI interface to MLDBM on top of BerkeleyDB or another DBM file, and it provides flock()-based locking (on a separate lockfile, not on the DBM file) if you want it.
by perrin (Chancellor) on Mar 15, 2005 at 18:10 UTC
by merlyn (Sage) on Mar 15, 2005 at 20:23 UTC
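
A minimal sketch of the lockfile approach suggested above, not a definitive implementation. It assumes SDBM_File as the backend and Storable as the serializer (neither is named in the thread). The file names and the helpers with_db, store_result, update_result and fetch_result are hypothetical, made up for illustration. Each crawler takes a flock() on a separate lockfile before it ties the DBM file and releases it after untie, so writers never overlap.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_CREAT O_RDWR);
use MLDBM qw(SDBM_File Storable);   # assumption: SDBM backend, Storable serializer

# Hypothetical file names, for illustration only.
my $db_file   = 'crawl_results';
my $lock_file = "$db_file.lock";

# Run $code with the DBM tied while holding a lock on the separate lockfile.
sub with_db {
    my ($lock_mode, $code) = @_;

    open(my $lock, '>>', $lock_file) or die "Cannot open $lock_file: $!";
    flock($lock, $lock_mode)         or die "Cannot lock $lock_file: $!";

    # Tie only after the lock is held, so no stale state is cached.
    tie(my %db, 'MLDBM', $db_file, O_CREAT | O_RDWR, 0640)
        or die "Cannot tie $db_file: $!";

    my $result = $code->(\%db);

    untie %db;      # flush to disk before giving up the lock
    close $lock;    # closing the handle releases the lock
    return $result;
}

# Replace the whole record stored under $url.
sub store_result {
    my ($url, $record) = @_;
    with_db(LOCK_EX, sub { $_[0]{$url} = $record; 1 });
}

# Change one field. MLDBM needs fetch, modify, store on the top-level
# value, so the whole read-modify-write happens under one exclusive lock.
sub update_result {
    my ($url, $field, $value) = @_;
    with_db(LOCK_EX, sub {
        my $db     = shift;
        my $record = $db->{$url} || {};
        $record->{$field} = $value;
        $db->{$url} = $record;
        1;
    });
}

# Read a record under a shared lock.
sub fetch_result {
    my ($url) = @_;
    return with_db(LOCK_SH, sub { $_[0]{$url} });
}
```

With this arrangement two crawlers that write the same key do not get a conflict report. The second writer simply waits for the lock and then overwrites, so the last write wins. If that is not acceptable, the check has to go inside the locked section, as update_result does for a single field.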

Node Type: perlquestion [id://439668]
Approved by thor
