Interesting thoughts, Xanatax.

From an academic standpoint (I'm writing this program as much to learn as to solve a problem, so the academic side is definitely valuable to me), I like the idea of playing with different data structures--even hugely different ones, such as an array in memory indexed by part of the MD5. I can see at least two problems with this approach (I'm not wholly illiterate on data structures, indexing, and other fun things--I am a Squid nerd, after all), and I'd welcome thoughts or pointers to docs on solving them:

We can't have an array entry for every possible MD5 (32 hex characters, i.e. 128 bits), so we'd index on part of the MD5. Even so, to make collisions acceptably rare we need at least a few bytes--which leads to an extremely large but very sparsely populated array. Not ideal, even if memory is not an issue--and I think memory would become an issue with an array /that/ big. So how does one handle sparse data structures of this sort?
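
One common answer to the sparse-array question is to let a hash table stand in for the array, so only occupied slots cost memory. The following is a minimal sketch in Python (not Perl, purely for illustration). The class name `PrefixIndex`, the 3-byte prefix, and the bucket layout are all assumptions, not anything from the original program. Each bucket keeps the full digest so that two keys sharing a prefix do not overwrite each other.

```python
import hashlib
from collections import defaultdict

PREFIX_BYTES = 3  # hypothetical choice: index on the first 3 bytes of the digest


class PrefixIndex:
    """Sparse index keyed by an MD5 prefix; only occupied slots use memory."""

    def __init__(self):
        # prefix -> {full_digest: value}; the inner dict resolves prefix collisions
        self._buckets = defaultdict(dict)

    def put(self, key, value):
        digest = hashlib.md5(key).digest()
        self._buckets[digest[:PREFIX_BYTES]][digest] = value

    def get(self, key, default=None):
        digest = hashlib.md5(key).digest()
        # .get() on the outer dict avoids creating an empty bucket on a miss
        bucket = self._buckets.get(digest[:PREFIX_BYTES])
        if bucket is None:
            return default
        return bucket.get(digest, default)

    def occupied_slots(self):
        return len(self._buckets)
```

Keys are expected to be byte strings, for example `idx.put(b"http://example.com/a", "obj-a")`. A full 3-byte array would reserve about 16.7 million slots up front, whereas here memory grows with the number of stored entries.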

Persistence and concurrency. We have to be able to read the data from another process (the indexing is not the end in itself, merely the means). So how do we dump this out, in near real time, in a form that is quickly accessible to another process? The other process doesn't need to modify the data, just read it.
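
For the single-writer, read-only-reader case, one simple pattern is to write a complete snapshot to a temporary file and rename it over the published file, so a reader sees either the old version or the new one, never a partial write. This is a hedged sketch in Python, not the approach used in the original program. The function names and the JSON format are assumptions chosen for brevity.

```python
import json
import os
import tempfile


def publish_snapshot(index, path):
    """Write the index to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(index, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)
        raise


def read_snapshot(path):
    """Reader side: load whichever complete snapshot is currently published."""
    with open(path) as handle:
        return json.load(handle)
```

The trade-off is that every publish rewrites the whole file, so this suits small indexes or infrequent updates. For a large index updated continuously, a database that supports concurrent readers is likely the better fit.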

As for the no-FETCH, STORE-everything proposal...I don't get it. If I understand you, I think it is missing the vital parent->child relationship component. I have considered deleting the parent each time and rewriting it with the new child, but I suspect that is an optimization BDB already handles (and probably better than I can in Perl, since it is probably smart enough to know when it can insert into the existing 4 KB space and when it will have to expand it).
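
To make the parent->child point concrete, here is a rough sketch of the FETCH-then-STORE pattern, written in Python for illustration. A plain dict stands in for the tied BDB hash, and the record format (newline-separated child keys) and the name `add_child` are hypothetical. The point is that the existing children have to be read before the parent can be rewritten, which a STORE-only scheme would skip.

```python
def add_child(db, parent_key, child_key):
    """FETCH the parent's record, append one child, STORE it back."""
    existing = db.get(parent_key, b"")          # FETCH (empty if parent is new)
    children = existing.split(b"\n") if existing else []
    if child_key not in children:
        children.append(child_key)
        db[parent_key] = b"\n".join(children)   # STORE the grown record
    return children
```

Calling `add_child(db, b"parent", b"child1")` twice leaves a single `child1` entry, so repeated indexing of the same object does not grow the parent record.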

In reply to Re: Re: Performance quandary by SwellJoe
in thread Performance quandary by SwellJoe
