
Re: Large chunks of text - database or filesystem?

by jhourcle (Prior)
on Mar 19, 2005 at 15:34 UTC ( #440935=note )

in reply to Large chunks of text - database or filesystem?

There is still too much information missing to make a decision:

  1. What is the maximum posts per minute you expect?
  2. What is the maximum reads per minute?
  3. How many overall posts per day? (estimated growth rate)
  4. How large do you expect the messages to be?
  5. Is it a write once system, or will there be re-editing of messages?
  6. What is your hardware budget for the project, or is there fixed hardware?
  7. What is the required uptime?
  8. Are you going to have an internal search engine?
  9. If so, what sort of information are you going to search on? (metadata, or the message itself?)
  10. What are your disaster recovery requirements?
  11. Do you need to support transactional concurrency?
  12. What are your time constraints?
  13. Do you already have a database to use for this purpose?
  14. Do you already have experience with databases?

Moving lots of files around is not a problem. Tar and rsync are your friends. The only problem with files comes when you're trying to work with more files at the same time than your OS supports. Databases for file storage are basically just ways of getting around those problems, and keeping extra metadata catalogs on hand to find the required information in a more efficient manner.

Depending on just what the requirements are, I might go with the file system for the message bodies, and a database for the metadata (posting time, who posted, thread tracking, etc). I might also go with a hierarchical database, rather than a relational database, if that fits well with the anticipated characteristics. I might also look at repurposing an NNTP server, rather than starting from scratch.
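A rough sketch of the split described above (bodies on the file system, metadata in a database), written in Python with SQLite purely for illustration. The table layout, the `bodies/` directory, and the function names `open_store`, `save_post`, and `load_post` are all assumptions of this sketch, not anything the original poster specified. Bodies go into two-character subdirectories so that no single directory collects every file.

```python
import sqlite3
import time
import uuid
from pathlib import Path


def open_store(root):
    """Create the body directory and the metadata table if they are missing."""
    root = Path(root)
    (root / "bodies").mkdir(parents=True, exist_ok=True)
    db = sqlite3.connect(str(root / "meta.db"))
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " id INTEGER PRIMARY KEY,"
        " author TEXT NOT NULL,"
        " parent_id INTEGER,"      # thread tracking: NULL for a top-level post
        " posted_at REAL NOT NULL,"
        " body_path TEXT NOT NULL)"
    )
    return root, db


def save_post(root, db, author, body, parent_id=None):
    """Write the body to its own file, then record the metadata row."""
    name = uuid.uuid4().hex
    # Shard by the first two hex characters to keep directories small.
    rel = Path("bodies") / name[:2] / (name + ".txt")
    path = root / rel
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body, encoding="utf-8")
    cur = db.execute(
        "INSERT INTO posts (author, parent_id, posted_at, body_path)"
        " VALUES (?, ?, ?, ?)",
        (author, parent_id, time.time(), str(rel)),
    )
    db.commit()
    return cur.lastrowid


def load_post(root, db, post_id):
    """Look up the metadata row, then read the body from disk."""
    row = db.execute(
        "SELECT author, parent_id, body_path FROM posts WHERE id = ?",
        (post_id,),
    ).fetchone()
    if row is None:
        return None
    author, parent_id, body_path = row
    body = (root / body_path).read_text(encoding="utf-8")
    return {"author": author, "parent_id": parent_id, "body": body}
```

With this layout, listing a thread or searching on metadata only touches the database, and the body files are read only when a post is actually displayed.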

Personally, I wouldn't optimize for storage space, unless you're not expecting anyone to read the posts. I'd optimize for reads/writes. Depending on the nature of the forum, I might have an aging system that moves entries from a read- or write-optimized system to an alternate storage mechanism for long-term storage.
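One way such an aging system might look, as a minimal Python sketch. It assumes posts are plain files in a "hot" directory and that age is judged by file modification time; the gzip archive format and the names `age_out` and `read_post` are illustrative choices, not part of the original suggestion.

```python
import gzip
import shutil
import time
from pathlib import Path


def age_out(hot_dir, archive_dir, max_age_days, now=None):
    """Compress files older than max_age_days into archive_dir.

    Returns the names of the files that were moved.
    """
    hot_dir, archive_dir = Path(hot_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    moved = []
    for path in sorted(hot_dir.iterdir()):
        if not path.is_file() or path.stat().st_mtime >= cutoff:
            continue  # skip directories and anything still recent
        target = archive_dir / (path.name + ".gz")
        with open(path, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        path.unlink()  # drop the original only after the archive copy exists
        moved.append(path.name)
    return moved


def read_post(name, hot_dir, archive_dir):
    """Read from the fast store first, falling back to the archive."""
    hot = Path(hot_dir) / name
    if hot.exists():
        return hot.read_text(encoding="utf-8")
    archived = Path(archive_dir) / (name + ".gz")
    with gzip.open(archived, "rt", encoding="utf-8") as fh:
        return fh.read()
```

Recent posts stay uncompressed and cheap to read, while old ones trade a slower read for less space. A job like `age_out` would typically run on a schedule rather than on every request.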
