PerlMonks  


Once I went part-way to a database, I would go all the way. There is really no point in having the overhead of multiple data interfaces.

Yes, a database is usually slower than the filesystem. The difference is that a database is scalable: it handles concurrency for you automatically. (Well, I'm used to commercial DBs. I'm not sure where MySQL or PostgreSQL stack up here; that's not a flame, I honestly don't know.)

Let's say your website gets really, really popular and you want to handle the load better. Upgrading hardware is one option: it means you need to back up and restore to the new machine, and you also have to make sure you get all your extra files across. Instead, you may just want to add a second machine to the fray and use IP round-robin (or any other method) to spread the load. I'm also presuming you spring for gigabit ethernet to connect the boxes to each other on a private network; your regular internet connection should not see any traffic between your machines.
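To make "round-robin" concrete, here is a minimal sketch of handing requests to backends in strict rotation. The addresses and the `make_picker` helper are made up for illustration; in practice DNS or a load balancer does the rotation for you, not your application code.

```python
from itertools import cycle

# Hypothetical private-network addresses for the two web boxes.
BACKENDS = ["10.0.0.1", "10.0.0.2"]


def make_picker(backends):
    """Return a function that hands out backends in strict rotation."""
    rotation = cycle(backends)
    return lambda: next(rotation)


pick = make_picker(BACKENDS)

# Four consecutive requests alternate between the two machines.
first_four = [pick() for _ in range(4)]
print(first_four)
```

Note that plain rotation knows nothing about how busy each box is, which is why "any other method" may end up being the better choice.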

DB & filesystem

You're looking at a number of options. I'm going to deal with the filesystem first, because the DB will be dealt with in the DB-only section.

  1. Replication of files from one node to another. This may be done via rsync, but it means some files won't be visible to one node until the next rsync. Risky.
  2. Share via NFS. NFS isn't exactly the most reliable software out there, but then again, this is HTTP we're serving over anyway. However, the NFS server is going to get hit hard. Every read, every write, goes over NFS to the server. The speed is likely to be comparable to the database server now.
  3. Share via NAS or SAN. Both machines access the files directly. I'm not entirely sure how locking works on these ... presumably it works the same as if it were local. At this point, you'd put your database on the NAS or SAN, too. Expensive, though.
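The risk in option 1 can be shown with a toy model: two temporary directories stand in for the two nodes, and a hypothetical `sync` function stands in for a periodic rsync run (it is a plain file copy, not rsync itself). A file written on one node simply isn't there on the other until the next sync.

```python
import shutil
import tempfile
from pathlib import Path


def sync(src: Path, dst: Path) -> None:
    """Copy every file from src to dst (a crude stand-in for an rsync run)."""
    for f in src.iterdir():
        if f.is_file():
            shutil.copy2(f, dst / f.name)


with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    node_a, node_b = Path(a), Path(b)

    # A request handled by node A writes a new file.
    (node_a / "post-42.txt").write_text("a large chunk of text")

    # A request that lands on node B before the next sync cannot see it.
    visible_before = (node_b / "post-42.txt").exists()

    sync(node_a, node_b)
    visible_after = (node_b / "post-42.txt").exists()

print(visible_before, visible_after)
```

With round-robin in front, the very next request from the same user may well land on the node that doesn't have the file yet.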

Upgrading again, you may make one machine both an NFS server and a DB server, with two other machines acting as web/CGI servers. That may mean more moving things around, and you're now running two servers (NFS and DB) on that one machine. It's not really sounding compelling to me.

DB-only

Put everything on the db. One machine acts as DB server, the other as a client, both as web servers. You can scale this as much as you want - create a cluster of DB servers that act as a single server, and a cluster of web servers that talk to the DB server cluster. Nearly unlimited scalability here. Put your DB on big iron if you want/need. Secure the whole thing by closing down unneeded ports, including NFS.
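As a rough sketch of what "everything on the db" looks like from the application side, the large text becomes just another column. This uses Python's built-in `sqlite3` with an in-memory database purely so the example is self-contained; SQLite is not a client/server database, so it is only a stand-in for whatever DB server you actually run. The `posts` table and the `save_post`/`load_post` helpers are made up for illustration.

```python
import sqlite3

# In-memory SQLite as a stand-in for a real client/server database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")


def save_post(conn, body):
    """Store a chunk of text and return its generated id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO posts (body) VALUES (?)", (body,))
    return cur.lastrowid


def load_post(conn, post_id):
    """Fetch the text for post_id, or None if there is no such row."""
    row = conn.execute(
        "SELECT body FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return row[0] if row else None


big_text = "lorem ipsum " * 10_000  # roughly 120 KB of text
post_id = save_post(conn, big_text)
print(len(load_post(conn, post_id)))
```

Every web server goes through the same single interface, so there are no extra files to copy, share, or lock separately.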

You can control the web servers completely independently of each other, and the DB server(s) completely independently of the web servers, even if they're on the same machine.

To me, the scalability of the database makes it the clear winner. Against short-term efficiency, it's not even a contest.

Disclaimer: I don't do this for a living :-)


In reply to Re: Large chunks of text: in-database or in-filesystem? by Tanktalus
in thread Large chunks of text: in-database or in-filesystem? by BrentDax
