PerlMonks  


You are still missing too much information to make a decision:

  1. What is the maximum posts per minute you expect?
  2. What is the maximum reads per minute?
  3. How many overall posts per day? (estimated growth rate)
  4. How large do you expect the messages to be?
  5. Is it a write-once system, or will messages be re-edited?
  6. What is your hardware budget for the project, or is there fixed hardware?
  7. What is the required uptime?
  8. Are you going to have an internal search engine?
  9. If so, what sort of information are you going to search on? (metadata, or the message itself?)
  10. What are your disaster recovery requirements?
  11. Do you need to support transactional concurrency?
  12. What are your time constraints?
  13. Do you already have a database to use for this purpose?
  14. Do you already have experience with databases?
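
Once you have answers to questions 3 and 4, a rough storage estimate is simple arithmetic. The sketch below is only an illustration: the function name, its parameters, and the compounding daily growth model are assumptions of mine, not something from the original question.

```python
def estimate_storage_bytes(posts_per_day, avg_message_bytes, days,
                           daily_growth_rate=0.0):
    """Rough total bytes of message bodies written over `days` days.

    Assumes (hypothetically) that the posting rate compounds by
    `daily_growth_rate` each day; 0.0 means a flat posting rate.
    """
    total = 0.0
    rate = float(posts_per_day)
    for _ in range(days):
        total += rate * avg_message_bytes   # bytes written on this day
        rate *= 1.0 + daily_growth_rate     # tomorrow's posting rate
    return int(total)
```

Plug in your own numbers; the point is that the answer can differ by orders of magnitude depending on the answers, which changes which storage options are even worth considering.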

Moving lots of files around is not a problem; tar and rsync are your friends. Files only become a problem when you try to work with more of them at once than your OS supports. Databases used for file storage are basically ways of getting around those limits, while keeping extra metadata catalogs on hand so the required information can be found more efficiently.
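
One common way to stay under per-directory limits with plain files is to spread them across subdirectories keyed by a hash of the message id. This is a minimal sketch of that idea, assuming the limit you hit is the number of entries in a single directory; `sharded_path`, the `.txt` suffix, and the two-level layout are hypothetical choices for illustration.

```python
import hashlib
from pathlib import Path


def sharded_path(root, message_id, levels=2, width=2):
    """Map a message id to root/ab/cd/<id>.txt using a hash of the id.

    With the defaults there are 256 * 256 leaf directories, so each
    directory stays small even with millions of messages.
    """
    digest = hashlib.sha1(str(message_id).encode("utf-8")).hexdigest()
    # Take `levels` consecutive chunks of `width` hex characters.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, f"{message_id}.txt")
```

The mapping is deterministic, so no catalog is needed to find a message by id; you only need a catalog for lookups by anything else.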

Depending on just what the requirements are, I might go with the file system for the message bodies, and a database for the metadata (posting time, who posted, thread tracking, etc.). I might also go with a hierarchical database, rather than a relational database, if that fit well with the anticipated characteristics. I might also look at repurposing an NNTP server, rather than starting from scratch.
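
To make the split between bodies and metadata concrete, here is a rough sketch using SQLite for the metadata and plain files for the bodies. The table layout, column names, and the functions `open_store`, `save_post`, and `load_thread` are all assumptions for illustration; a real forum would need more (edit history, permissions, cleanup of orphaned files).

```python
import sqlite3
from pathlib import Path

SCHEMA = """
CREATE TABLE IF NOT EXISTS posts (
    id        INTEGER PRIMARY KEY,
    thread_id INTEGER NOT NULL,
    author    TEXT NOT NULL,
    posted_at TEXT NOT NULL,   -- ISO 8601 timestamp, sorts as text
    body_path TEXT NOT NULL    -- where the message body lives on disk
)
"""


def open_store(db_path):
    """Open (or create) the metadata database."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def save_post(conn, body_root, thread_id, author, posted_at, body):
    """Write the body to a file and record its metadata in one transaction."""
    body_root = Path(body_root)
    body_root.mkdir(parents=True, exist_ok=True)
    with conn:  # commits on success, rolls back the row on error
        cur = conn.execute(
            "INSERT INTO posts (thread_id, author, posted_at, body_path) "
            "VALUES (?, ?, ?, '')",
            (thread_id, author, posted_at),
        )
        post_id = cur.lastrowid
        path = body_root / f"{post_id}.txt"
        path.write_text(body, encoding="utf-8")
        conn.execute(
            "UPDATE posts SET body_path = ? WHERE id = ?",
            (str(path), post_id),
        )
    return post_id


def load_thread(conn, thread_id):
    """Return (author, posted_at, body) tuples for a thread, oldest first."""
    rows = conn.execute(
        "SELECT author, posted_at, body_path FROM posts "
        "WHERE thread_id = ? ORDER BY posted_at, id",
        (thread_id,),
    ).fetchall()
    return [(a, t, Path(p).read_text(encoding="utf-8")) for a, t, p in rows]
```

The database only ever holds small, fixed-shape rows, so thread listings and searches on metadata stay cheap, and the bodies can still be moved around with tar and rsync.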

Personally, I wouldn't optimize for storage space unless you're not expecting anyone to read the posts; I'd optimize for reads and writes. Depending on the nature of the forum, I might have an aging system that moves entries from the read- or write-optimized system to an alternate mechanism for long-term storage.
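
An aging system can be as simple as a periodic job that moves old files out of the hot directory. The following is a minimal sketch under the assumption that bodies are plain files and that file modification time is a good enough proxy for age; `archive_old_posts` and its parameters are hypothetical names.

```python
import shutil
import time
from pathlib import Path


def archive_old_posts(hot_dir, archive_dir, max_age_days, now=None):
    """Move files older than `max_age_days` from hot_dir to archive_dir.

    Age is judged by modification time. Returns the names of the files
    that were moved, in sorted order.
    """
    hot_dir, archive_dir = Path(hot_dir), Path(archive_dir)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(hot_dir.iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            shutil.move(str(path), str(archive_dir / path.name))
            moved.append(path.name)
    return moved
```

If the metadata lives in a database, the same job would also need to update the stored location of each body it moves; that step is left out here.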


In reply to Re: Large chunks of text - database or filesystem? by jhourcle
in thread Large chunks of text - database or filesystem? by TedPride

As of 2022-05-20 23:32 GMT