Someone once said that a web application is truly scalable if all one needs to do is buy more machines. This is very appealing to me. Imagine you start a website with one box containing both the web server and the database server; then, as your user base grows, you just add servers and load balancers; and eventually you add data centers (hosting on both the West and East Coast, in Europe, etc.). Ideally, this should be done in a high-availability fashion.
My question is: what are the main design decisions needed to accomplish this? For example, should nothing be stored in sessions (although load balancers can be sticky)? Should there be clusters of web servers and database servers? How does database replication/synchronization work across data centers (how fast and how expensive is it)? How do the big boys (Google, Yahoo, eBay, Amazon, etc.) accomplish their scalability?
Is there anything special about Perl/mod_perl that one can use or needs to be careful about? Thanks.