PerlMonks  

Store a huge amount of data on disk

by Sewi (Friar)
on Oct 18, 2011 at 15:31 UTC ( [id://932175] )

Sewi has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I need to store a huge amount of data having a fixed structure:

  • Each item has a unique (alphanumeric, 7-bit ASCII) id
  • A fixed number of "meta" information fields contain numbers or text data up to 100 bytes (worst case, usually <30 bytes)
  • meta information won't change once the item has been created
  • Each item has two text parts, usually 2-16 KB in size, sometimes a few MB, but sizes up to 2 GB have to be supported
  • The text parts are delivered in blocks up to a predefined size limit (currently about 16 MB, though it could be changed to anything down to ~1 KB if storage requires it); a typical block is currently about 1900 bytes
  • The final text part size is unknown, as is the number of blocks
  • The blocks may not arrive in sequential order, but each carries a sequence number starting from zero per item, and every sequence number is used
  • Up to 10 million items should be stored at the same time, maybe more in the future
  • About 90% of the items may be deleted some weeks after they were created
  • Some of the remaining items are deleted later; a few are kept forever
  • Each item must be accessible quickly by unique item id
  • Deletion of items may be really slow
  • I considered using MongoDB, but it becomes slow at 15+ million items and has a 16 MB limit per item. MySQL can't handle this amount either. I'd like to store the data in files, but avoid one file per item, as that many files are hard for filesystems to handle.

    I considered tie and GDBM_File, which is rock solid for reading: I could store many items in one file, delete them, and append/insert text blocks as they arrive. But GDBM is problematic when more than one process writes to the same file, and I can't guarantee that no two processes will ever write to the same file while new text blocks are arriving for different messages.
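    For reference, a minimal sketch of the tie/GDBM_File approach I was considering (the file path and the key naming scheme are just placeholders, not an existing schema):

      use strict;
      use warnings;
      use GDBM_File;

      # Single-writer sketch; concurrent writers are exactly the open concern above.
      tie my %store, 'GDBM_File', '/data/items.gdbm', &GDBM_WRCREAT, 0640
          or die "Cannot tie GDBM file: $!";

      my $id = 'a1b2c3d4e5f6a7b8';    # placeholder item id

      # Meta data is written once when the item is created.
      $store{"$id.meta"} = join "\t", 'field1', 'field2';

      # Text blocks are stored under their sequence number as they arrive,
      # not necessarily in order.
      $store{"$id.text1.0"} = 'first block of text part 1';
      $store{"$id.text1.2"} = 'block 2 may arrive before block 1';

      untie %store;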

    Any suggestions?

    Replies are listed 'Best First'.
    Re: Store a huge amount of data on disk
    by erix (Prior) on Oct 18, 2011 at 15:43 UTC

      You omit a crucial piece of information: what is the typical query? What (how much) does it retrieve and how fast does it need to be? ('quickly' is not very informative...)

      I don't see anything in the specification that rules out the most obvious solution, PostgreSQL.

      Update: text values in postgres do have a limit of 1 GB (see the manual).

        Typical query is "one item by id"; no queries other than "by id" are required.

        The deletion cronjob may crawl through all objects to find deletion candidates.

        Do you think Postgres would handle that amount of data? I used it recently for an analysis of a few million shorter records (MySQL profiling data :-) ) and felt like it got slower when importing/dealing with a large number of rows. I'll try...

          Whether it is fast enough depends, I think, as much on the disks on your system as on the software that you'll use to write to them.

          From what you mentioned I suppose the total size to be something like 300 GB? It's probably useful/necessary (for Postgres, or any other RDBMS) to have some criterion (date, perhaps) by which to partition.

          (FWIW, a 40 GB table that we use intensively, accessed by unique id, gives access times of less than 100 ms. The system has 32 GB of RAM and an 8-disk RAID 10 array.)

          Btw, PostgreSQL *does* have a limit for text column values (1 GB, where you need 2 GB), but I suppose that could be avoided by splitting the value or something like that.
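          For the "one item by id" access itself, plain DBI with DBD::Pg is all you need; a minimal sketch (the items table and its column names here are just placeholders):

            use strict;
            use warnings;
            use DBI;

            # Example schema: items(id, meta, text1, text2) -- names are illustrative only.
            my $dbh = DBI->connect('dbi:Pg:dbname=items', 'user', 'secret',
                { RaiseError => 1, AutoCommit => 1 });

            my $sth = $dbh->prepare(
                'SELECT meta, text1, text2 FROM items WHERE id = ?');

            # Fetch one item by its unique id; returns undef if it is not there.
            sub fetch_item {
                my ($id) = @_;
                $sth->execute($id);
                return $sth->fetchrow_hashref;
            }

            my $item = fetch_item('8fbe7eb8c04c744406cca0aeb67e4f7f');
            print $item->{meta}, "\n" if $item;

            $dbh->disconnect;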

    Re: Store a huge amount of data on disk
    by BrowserUk (Patriarch) on Oct 18, 2011 at 15:37 UTC
      Each item has a unique (alphanumeric, 7-bit ASCII) id

      How long? (I.e., what range?)

        About 16 to 32 bytes; any limit >= 16 bytes would be OK and could still be applied.

        I should be able to switch this to a 64-bit integer if required, but I prefer the current alphanumeric ids.

          Sounds like you're indexing your data by a hex-encoded digest?

          Given that you have 3 variable and possibly huge chunks -- which most RDBMSs handle by writing to the filesystem anyway -- associated with each index key, and your selection criteria are both fixed and simple, I'd use the filesystem.

          Subdivide the key into chunks that make individual directories contain at most a reasonable number of entries and then store the 3 sections in files at the deepest level.

          By splitting a 32-byte hex digest into 4-char chunks, no directory has more than 65,536 (16^4) entries, and in practice directories below the first level or two will hold far fewer. The file-system cache will cache the lower levels and the upper levels will be both fast to read from disk and quick to search. Especially if your file-system hashes its directory entries.

          I'd write the individual chunks of the two text parts in separate files unless they will always be loaded as a single entity, in which case it might be slightly faster to concatenate them.

          Overall, given a digest of 8fbe7eb8c04c744406cca0aeb67e4f7f, I'd lay the directory structure out like this:

          /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/meta.txt
          /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text1.000
          /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text1.001
          /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text1.002
          /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text1....
          /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text2.000
          /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text2.001
          /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text2....
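          A minimal Perl sketch of that layout (the /data root, the text<part>.<seq> naming and the helper names are only for illustration):

            use strict;
            use warnings;
            use File::Path qw(make_path);

            # Split a 32-char hex digest into 4-char chunks and join them into a path.
            sub item_dir {
                my ($digest) = @_;
                my @chunks = $digest =~ /(.{4})/g;
                return join '/', '/data', @chunks;
            }

            # Store one incoming block of text part $part (1 or 2) with
            # zero-based sequence number $seq in its own file.
            sub store_block {
                my ($digest, $part, $seq, $data) = @_;
                my $dir = item_dir($digest);
                make_path($dir) unless -d $dir;
                my $file = sprintf '%s/text%d.%03d', $dir, $part, $seq;
                open my $fh, '>', $file or die "Cannot write $file: $!";
                binmode $fh;
                print {$fh} $data;
                close $fh or die "Cannot close $file: $!";
            }

            store_block('8fbe7eb8c04c744406cca0aeb67e4f7f', 1, 0, 'first block');

          Deleting an item is then just File::Path::remove_tree on its directory, which can happen as slowly as you like.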

