PerlMonks  


Let us consider caching. The front page of Google is 300k, but 270k of that is cached: 30k a hit. Getting a search result is about 130k, 80k of which is cached. In addition, with a site that large, many of your requests will be AJAX calls returning small bits of JSON and XML. We're talking an order of magnitude below the 200k estimate.
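To make the caching arithmetic concrete, here is a minimal sketch in Python. It uses only the page sizes quoted above and assumes a repeat visitor whose cache is already warm; the names `PAGES` and `transferred_kb` are made up for illustration.

```python
# Page sizes in kilobytes, taken from the figures quoted above.
PAGES = {
    "front page": {"total_kb": 300, "cached_kb": 270},
    "search result": {"total_kb": 130, "cached_kb": 80},
}


def transferred_kb(total_kb, cached_kb):
    """Kilobytes actually sent per hit when the cacheable part is already cached.

    Assumption: the visitor's cache is warm, so cached bytes cost nothing.
    """
    return total_kb - cached_kb


for name, size in PAGES.items():
    sent = transferred_kb(size["total_kb"], size["cached_kb"])
    share = sent / size["total_kb"]
    print(f"{name}: {sent}k sent of {size['total_kb']}k ({share:.0%})")
```

A first-time visitor with a cold cache pays the full size, so the real average sits somewhere between the two numbers, and only measurement will tell you where.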

Now consider the Twitter problem. You need to efficiently pipe 140 characters to just the followers of each user in real time and you have millions of users constantly sending messages. The problem seems simple, and the payload is small, but it is an extremely expensive calculation. Social networks at the scale you're discussing are CPU intensive.
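A rough sketch of why the small payload is misleading, assuming a naive "fan-out on write" design where each message is copied to every follower's timeline. This is an illustration of the cost, not a description of how Twitter actually does it; the follower graph and the `fan_out` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical follower graph: author -> set of followers.
followers = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice"},
    "carol": set(),
}


def fan_out(author, text, followers, timelines):
    """Copy one message to each follower's timeline; return the number of writes."""
    if len(text) > 140:
        raise ValueError("message exceeds 140 characters")
    recipients = followers.get(author, set())
    for user in recipients:
        timelines[user].append((author, text))
    return len(recipients)


timelines = defaultdict(list)
writes = fan_out("alice", "hello", followers, timelines)
writes += fan_out("bob", "hi", followers, timelines)
print(f"2 messages caused {writes} timeline writes")
```

The work per message grows with the follower count, not with the 140 characters, so the load depends on the shape of the graph, which you cannot know before you have users.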

This brings us to the problem with your calculations: they're wrong. They're wrong because they are premature optimization. Or in this case, premature sub-optimization. The evil of premature optimization is not so much the optimization, it's thinking you can predict performance. It's thinking you can predict the future. It's thinking you know everything you need to know about how a system will be used and react before it actually happens. In any sufficiently complex system the best performance prediction is this: your prediction is wrong. You simply do not have the data. I don't either. Most of us do not.

A site with a million hits a day doesn't just appear out of thin air. Nobody should sit down and try to design a site that big unless they already have a slightly smaller one doing the same thing. That's the only way to get real experience and data to plug into the equations to find the real bottlenecks. Nobody should start by buying the sort of hardware you're talking about. You should start by knowing you will be wrong, plan accordingly, gather metrics and optimize on that. Be modest in your performance expectations until you have something to profile.

You can think of it like the Drake Equation. As a thought experiment and discussion piece about the probability of alien life, it's fantastically focusing. As a practical predictor it's meaningless. Most of the numbers we plug in are sheer speculation. There are so many variables with such high variability being multiplied that the end result swings by orders of magnitude depending on which valid-seeming estimates you plug in. Errors multiply. You can get anything from millions to just 1, and you'll tweak the results to your liking (nothing personal, just human nature). It's seductive to treat that meaningless number as proof that you should act.
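Here is a small sketch of how the errors multiply. Every factor and range below is invented purely for illustration; none of them come from the thread. Each range looks defensible on its own, yet the product spans four orders of magnitude.

```python
import math

# Hypothetical factors, each with a (low, high) estimate that seems plausible.
factors = {
    "visitors per day": (10_000, 1_000_000),
    "pages per visit": (2, 20),
    "kilobytes per page": (20, 200),
}

# Multiply all the low guesses together, then all the high guesses.
low = math.prod(lo for lo, hi in factors.values())
high = math.prod(hi for lo, hi in factors.values())

# Orders of magnitude between the two "reasonable" answers.
spread = math.log10(high / low)

print(f"low estimate:  {low:,} KB/day")
print(f"high estimate: {high:,} KB/day")
print(f"spread: {spread:.1f} orders of magnitude")
```

Add a fourth or fifth guessed factor and the spread widens again, which is why a number produced this way is a conversation starter and not a hardware budget.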


In reply to Re: How fast is fast? by schwern
in thread How fast is fast? by Logicus
