http://www.perlmonks.org?node_id=932181


in reply to Store a huge amount of data on disk

You omit a crucial piece of information: what is the typical query? What (how much) does it retrieve and how fast does it need to be? ('quickly' is not very informative...)

I don't see anything in the specification that rules out the most obvious solution, PostgreSQL.

Update: text values in postgres do have a limit of 1 GB (see the manual).
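As a rough sketch only (table, column and connection names below are made up, not anything from the original post), the straightforward approach from Perl via DBI/DBD::Pg could look like this:

    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection settings; adjust dbname/user/password.
    my $dbh = DBI->connect('dbi:Pg:dbname=itemstore', 'someuser', 'somepass',
                           { RaiseError => 1, AutoCommit => 1 });

    # One row per item; 'body' is an ordinary text column and therefore
    # subject to the 1 GB limit mentioned in the update above.
    $dbh->do(q{
        CREATE TABLE items (
            id         bigint      PRIMARY KEY,
            created_at timestamptz NOT NULL DEFAULT now(),
            body       text        NOT NULL
        )
    });

    my $ins = $dbh->prepare('INSERT INTO items (id, body) VALUES (?, ?)');
    $ins->execute(42, 'some large document ...');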

Re^2: Store a huge amount of data on disk
by Sewi (Friar) on Oct 18, 2011 at 15:52 UTC

    The typical query is "one item by id"; no queries other than "by id" are required.

    The deletion cronjob may crawl through all objects to find deletion candidates.
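    For example (just a sketch; the items table, the created_at column and the 90-day cutoff below are assumptions, not part of the real schema), both access patterns map to trivial statements:

        use strict;
        use warnings;
        use DBI;

        my $dbh = DBI->connect('dbi:Pg:dbname=itemstore', 'someuser', 'somepass',
                               { RaiseError => 1, AutoCommit => 1 });

        # The only query the application needs: fetch one item by id.
        my $item_id = 42;
        my ($body) = $dbh->selectrow_array(
            'SELECT body FROM items WHERE id = ?', undef, $item_id);

        # Deletion cronjob: pick expiry candidates by age (assuming a
        # created_at column) instead of crawling every object.
        $dbh->do(q{
            DELETE FROM items WHERE created_at < now() - interval '90 days'
        });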

    Do you think Postgres would handle that amount of data? I used it lately for an analysis of a few million shorter records (MySQL profiling data :-) ) and it felt like it got slower when importing and dealing with a large number of rows. I'll try...

      Whether it is fast enough depends, I think, as much on the disks on your system as on the software that you'll use to write to them.

      From what you mentioned I suppose the total size is something like 300 GB? It's probably useful, or even necessary (for Postgres, or any other RDBMS), to have some criterion (date, perhaps) by which to partition.
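      For illustration only (all names made up, assuming a parent items table with a created_at column), the inheritance-based partitioning scheme from the manual could look roughly like this:

          use strict;
          use warnings;
          use DBI;

          my $dbh = DBI->connect('dbi:Pg:dbname=itemstore', 'someuser', 'somepass',
                                 { RaiseError => 1, AutoCommit => 1 });

          # One child table per month; the CHECK constraint lets the
          # planner skip partitions that cannot match a dated query,
          # and expired data can be removed by dropping a whole child.
          $dbh->do(q{
              CREATE TABLE items_2011_10 (
                  CHECK (created_at >= '2011-10-01' AND created_at < '2011-11-01')
              ) INHERITS (items)
          });
          $dbh->do(q{ CREATE INDEX items_2011_10_id_idx ON items_2011_10 (id) });

      Routing inserts to the right child table (in the application, or via a trigger) is the price of that scheme; newer PostgreSQL releases (10 and up) offer declarative PARTITION BY RANGE for the same purpose.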

      (FWIW, a 40 GB table that we use intensively, accessed by unique id, gives access times of less than 100 ms. The system has 32 GB of RAM and an 8-disk RAID 10 array.)

      Btw, PostgreSQL *does* have a limit on text column values (1 GB, where you need 2 GB), but I suppose that could be worked around by splitting the value or something like that.
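      A minimal sketch of such a split, assuming a made-up item_chunks side table and a fixed chunk size well under the limit:

          use strict;
          use warnings;
          use DBI;

          my $dbh = DBI->connect('dbi:Pg:dbname=itemstore', 'someuser', 'somepass',
                                 { RaiseError => 1, AutoCommit => 1 });

          my $CHUNK = 256 * 1024 * 1024;   # 256 MB per row, far below 1 GB

          $dbh->do(q{
              CREATE TABLE item_chunks (
                  item_id  bigint  NOT NULL,
                  chunk_no integer NOT NULL,
                  data     text    NOT NULL,
                  PRIMARY KEY (item_id, chunk_no)
              )
          });

          # Write a value of arbitrary size as numbered slices ...
          sub store_item {
              my ($item_id, $value) = @_;
              my $ins = $dbh->prepare(
                  'INSERT INTO item_chunks (item_id, chunk_no, data) VALUES (?, ?, ?)');
              my $n = 0;
              for (my $off = 0; $off < length $value; $off += $CHUNK) {
                  $ins->execute($item_id, $n++, substr($value, $off, $CHUNK));
              }
          }

          # ... and glue the slices back together on read.
          sub fetch_item {
              my ($item_id) = @_;
              my $chunks = $dbh->selectcol_arrayref(
                  'SELECT data FROM item_chunks WHERE item_id = ? ORDER BY chunk_no',
                  undef, $item_id);
              return join '', @$chunks;
          }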

        Thank you for those numbers. A 1 GB upper limit would be OK, too; we don't want to reach that limit, but it might happen. I expect I'll need to split at some high limit anyway, so whether it is 1 GB or 2 GB doesn't matter.