My expectation is that most databases would use a well-known data structure (such as a BTree) to store this kind of data. That avoids a million directory entries and also allows for variable-length data. I admit that an RDBMS might get this wrong, but I'd expect most of them to get it right on the first try. Certainly BerkeleyDB will.
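As a rough illustration of the "one B-tree file instead of a million directory entries" idea, here is a minimal sketch. It assumes Python's standard `sqlite3` module as a stand-in for BerkeleyDB (SQLite also keeps its tables and indexes in B-trees inside a single file); the table name `records` and the helpers `open_store`, `put`, and `get` are hypothetical names chosen for the example.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a single-file key/value store.

    Assumption: SQLite stands in for BerkeleyDB here; both keep
    the data in B-tree pages rather than one file per record.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "key TEXT PRIMARY KEY, value BLOB)"
    )
    return conn

def put(conn, key, value):
    # Keys are arbitrary strings, not numeric offsets,
    # and values may be of any length.
    conn.execute(
        "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
        (key, value),
    )

def get(conn, key):
    row = conn.execute(
        "SELECT value FROM records WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]

conn = open_store()
put(conn, "user:42", b"short")
put(conn, "user:43", b"x" * 10_000)  # much longer record, same store
```

Both records live in the same store regardless of their length, and a lookup for a key that was never written simply returns `None`.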

As for the "file with big holes" approach, only some filesystems implement sparse files. Furthermore, depending on how Perl was compiled and what OS you're on, you may have a fixed 2 GB limit on file sizes. With a database that stores only the real data, that is a barrier that you're probably not going to hit. With your approach, the file's size will always be the worst case. (And if your assumption about the size of a record is violated, you'll be in trouble - you've recreated the second of the problems that you complained about in point 1.)
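The two objections above (worst-case file size and the fixed record size) can be seen in a small sketch of the "file with holes" layout. This assumes a hypothetical fixed slot size `RECORD_SIZE` and hypothetical helpers `write_record` and `read_record`; whether the hole actually occupies disk space depends on the filesystem.

```python
import os
import tempfile

RECORD_SIZE = 64  # assumed fixed slot size in bytes (hypothetical)

def write_record(f, index, data):
    """Store data in slot `index`; refuse records that do not fit."""
    if len(data) > RECORD_SIZE:
        raise ValueError(
            f"record of {len(data)} bytes exceeds {RECORD_SIZE}-byte slot"
        )
    f.seek(index * RECORD_SIZE)
    f.write(data.ljust(RECORD_SIZE, b"\0"))  # pad the slot with NULs

def read_record(f, index):
    f.seek(index * RECORD_SIZE)
    # Caveat: stripping the padding also strips NULs that were
    # genuinely part of the data.
    return f.read(RECORD_SIZE).rstrip(b"\0")

def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "r+b") as f:
            write_record(f, 0, b"first")
            write_record(f, 1_000_000, b"last")  # leaves a large hole
            f.flush()
            logical_size = os.path.getsize(path)
            got = read_record(f, 1_000_000)
            hole = read_record(f, 500_000)  # never written: reads as zeros
        return logical_size, got, hole
    finally:
        os.remove(path)
```

With only two records stored, the logical file size is already `(1_000_000 + 1) * RECORD_SIZE` bytes, because it is driven by the highest index rather than by the amount of real data. On a filesystem without sparse-file support that whole range is physically allocated.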

I'd also be curious to see the relative performance with real data between, say, BerkeleyDB and "big file with holes". I could see it coming out either way. However, I'd prefer BerkeleyDB because I'm more confident that it will work on any platform, because it is more flexible (you aren't limited to numerical offsets), and because it doesn't have the record-size limitation.

In reply to Re^2: Combining Ultra-Dynamic Files to Avoid Clustering (Ideas?)( A DB won't help) by tilly
in thread Combining Ultra-Dynamic Files to Avoid Clustering (Ideas?) by rjahrman
