Re: Re: Re: Re: Re: A (memory) poor man's hash by tilly (Archbishop)
on Nov 25, 2003 at 03:51 UTC
With cache optimization, we need to specify our goals first. If your goal is to achieve a universal large win, or to achieve any kind of ideal optimization, then optimizing cache coherency is an impossible ideal. But that isn't my goal. My goal would be to have data structures which will tend to perform better. And that is quite possible.
Sure, Judy arrays perform best in code that is limited to just working with the array in question. They will also perform better on chips that they have been tuned for, and tuning them for different chips is a never-ending process. However, the fact that you have paid attention to how cache-friendly your data structure is can still win. Perhaps when you are working simultaneously with 3 Judy arrays you have each one ruining the cache coherency of the others. But what that means is that you are ageing data out of caches earlier than expected. So instead of finding something in level 1 cache, you might find it in level 2 cache. This is still a win over using a hash with its tendency to blow all caches, all the time.
Similarly, it is clear that code that was tuned to one set of cache sizes won't work as well with a different set of cache sizes. However, it is still a win. Furthermore, my claim is that the different exponents in Moore's law for improvements in chip speed versus improvements in the rate of data transfer mean that what was a win will tend to become a bigger win as time goes by. Sure, something else might now be optimal. But pretending that access times are flat will lose by more in the future than it does now.
On relational databases, I think that you are missing the boat. Sure, a relational database pushes programmers to restructure their data structures and push logic to the database. But the reason why it does that is that when you take that step, the database on the fly figures out far better algorithms than programmers normally would manage to figure out for themselves. Sure, there are a smidgeon of programmers who can beat the relational database. But nobody can beat it consistently without doing a lot of work. (Just consider the amount of reprogramming that you have to do to match what a DBA accomplishes by adding the right index to speed up existing queries.)
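To make the point about indexes concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for whatever database you actually run, and the table, column, and index names are made up for illustration. The query text never changes; only the plan the database picks for it does.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"cust{i % 100}", float(i)) for i in range(1000)],
)

# The application's query. It is never edited below.
QUERY = "SELECT total FROM orders WHERE customer = ?"


def plan(connection):
    """Return the planner's description of how it would run QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("cust7",)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


plan_before = plan(conn)
result_before = sorted(conn.execute(QUERY, ("cust7",)).fetchall())

# The "DBA step": add an index. No application code changes.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

plan_after = plan(conn)
result_after = sorted(conn.execute(QUERY, ("cust7",)).fetchall())

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should be a full table scan and the second a search through the new index, with identical results. Getting the same change out of a hand-rolled data structure would mean rewriting the lookup code.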
And in addition to the speed win, nobody in their right mind thinks that they can match the transactional mechanics that a database provides by rolling their own at the application level.
Your dream of effectively utilizing the performance of our processors is the diametric opposite of my dream, and runs counter to every trend in programming.
In various aspects of programming (and life) we make many trade-offs. As the cost of any particular factor drops relative to the others, the natural tendency is to be willing to waste more of that factor to save on some or all of the others. If you like, you can think of this as a result of some sort of generalized Le Chatelier's Principle. I certainly do.
This can be carried to amazing lengths. For instance, a merchant from 200 years ago would be astounded at the speed with which we routinely ship small parcels from New York City to Boston. Said merchant would also be flabbergasted at the idea of possibly routing said packages through Baltimore. But it makes sense to do so today, since the incremental costs of transporting things farther have fallen to a point where we are willing to trade absurd amounts of transportation for the efficiencies of centralized sorting and routing.
In programming this means that the natural response to having more processor speed to work with is not to figure out how to squeeze more speed out to achieve some ideal level of performance. Rather it is to view CPU performance as cheap and start trading it away for everything else that we can.
If we could get performance and everything else that we might want, that would be perfect. But your dream runs aground on the limits of the Halting Problem: you simply cannot compute a perfect static analysis. Oh, you can do heuristics (every optimizer does), and you can do better heuristics with runtime data. (Transmeta attempts to do so with their code-morphing software.) But those attempts add latency issues (if only for the latency to realize when the usage pattern changes), and will work worse and worse as you move to more and more dynamic techniques.
Well, that isn't entirely true. There are ways to program that allow optimization to be done on the fly (if only to pieces of your program) to make things run really well. Of course, making that work means that programmers have to jump through a few hoops and hand off problems to the computer for it to take final control over. That can work well for both sides: the computer gets well-defined problems that it can work with, and humans get to define the hard bits.
I'm not sure that you would like that programming direction though. The most successful example of that method of cooperation is relational databases. Which you don't really like. But they do exactly what you want, right down to automatically generating logging and tracing to allow you to monitor and tune the database. In many cases people have the database set up to recalculate statistics periodically and improve its execution paths based on the current datasets and usage. (Freeing humans to worry less about the best algorithms, and allowing us to focus on higher order pieces.)
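As a rough illustration of "recalculating statistics", here is a small sketch, again using SQLite through Python's sqlite3 module as a stand-in. The events table and its index are hypothetical. SQLite's ANALYZE command gathers statistics about the data and stores them in the sqlite_stat1 table, which its query planner can consult when choosing between execution paths. Other databases have their own equivalents, usually run on a schedule.

```python
import sqlite3

# Hypothetical table with a deliberately skewed column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")
conn.execute("CREATE INDEX idx_events_kind ON events (kind)")
conn.executemany(
    "INSERT INTO events (kind) VALUES (?)",
    [("purchase" if i % 10 == 0 else "click",) for i in range(500)],
)

# Ask the database to re-measure its own data.
conn.execute("ANALYZE")

# The gathered statistics are ordinary rows the planner can read.
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
for tbl, idx, stat in stats:
    print(tbl, idx, stat)
```

The first number in each stat string is the row count the planner believes the index covers. Rerunning ANALYZE after the data changes is what keeps those beliefs, and therefore the chosen plans, in line with reality.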