Re^5: An exploration of higher-level-language approaches to the "First all-zero row in an NxN matrix" problem (goto Considered Harmful)

by BrowserUk (Pope)
on Sep 26, 2011 at 04:05 UTC


in reply to Re^4: An exploration of higher-level-language approaches to the "First all-zero row in an NxN matrix" problem (goto Considered Harmful)
in thread An exploration of higher-level-language approaches to the "First all-zero row in an NxN matrix" problem (goto Considered Harmful)

"Cache misses is another big factor that's usually ignored in theory."

Hm. Actually, I think caching has to be ignored. There is simply no way to account for it.

On single-core processors (without hyper-threading), with single-level caching and fixed processor clock speeds, you can measure the differences between cache-friendly & cache-hostile algorithms -- under very controlled conditions; say, a single-tasking OS or an otherwise quiescent system -- but you cannot predict them, much less formalise them.
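
By way of illustration, a minimal sketch of the kind of controlled measurement meant here (a hypothetical benchmark of my own, not code from anywhere in this thread): sum the same NxN array in row order and in column order. Even from Perl, where every element is a boxed scalar, the row-order walk usually comes out ahead on an otherwise quiet machine, purely from locality:

    use strict;
    use warnings;
    use Benchmark qw( cmpthese );

    my $n = 1_000;
    my @m = map { [ (0) x $n ] } 1 .. $n;    # NxN matrix of zeros

    cmpthese( -3, {
        # Cache-friendly: consecutive elements of one row at a time.
        row_major => sub {
            my $sum = 0;
            for my $r ( 0 .. $n - 1 ) {
                $sum += $m[$r][$_] for 0 .. $n - 1;
            }
        },
        # Cache-hostile: stride down a column at a time.
        col_major => sub {
            my $sum = 0;
            for my $c ( 0 .. $n - 1 ) {
                $sum += $m[$_][$c] for 0 .. $n - 1;
            }
        },
    } );

Which is exactly the point: run the same script under load, or on a box with a different cache hierarchy, and the ratio moves.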

But once you have multiple, multi-threaded cores; 2 or 3 levels of caching, some per-core, others shared, some split between instructions and data, others mixed; dynamically varying clock speeds; variably-timed instructions (think FPU & GPU interactions); out-of-order execution and pipelining; and multi-tasking OSs with dynamic priorities and a dynamic mix of hardware and software interrupts; there is simply no way to formalise algorithms to account for any of those sources of non-determinism. Much less all of them.

And it is getting harder, given the rise of virtualisation throwing more and more independent workloads onto each box. Add to that the rise of clustering, further increasing the probability of cache misses, and the benefit of increasing cache sizes starts to fall off very rapidly indeed.

Even just measuring performance -- when a printer or cooling fan operating on the same desk can jiggle an otherwise static optical mouse enough for the OS to give a foreground application a priority boost -- is an exercise in futility.
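
You can see that jitter for yourself with a minimal sketch (the workload and repeat count here are arbitrary): time the identical loop ten times and look at the spread. On a typical desktop, the run-to-run variation is plainly visible, and it easily swamps the small effects you are trying to measure:

    use strict;
    use warnings;
    use Time::HiRes qw( time );

    for my $run ( 1 .. 10 ) {
        my $start = time;
        my $sum = 0;
        $sum += $_ for 1 .. 1_000_000;    # identical work every run
        printf "run %2d: %.4f seconds\n", $run, time - $start;
    }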

Whilst it is possible to demonstrate that, under ideal conditions, a structure like Judy arrays can -- at the cost of huge complexity -- derive some level of advantage from data locality, relying upon that locality when so many other random events can sweep the cache out from under you is naive.

Personally, I think that the days of trading complexity for speed are long over, at which point the fundamentals of efficient algorithms come back to exactly the kind of count-the-opcodes instrumentation that Knuth has exemplified all these years.
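
For this thread's problem, that kind of instrumentation might look like the following sketch (first_zero_row is my own hypothetical example, not code from upthread): count element inspections -- a machine-independent cost -- instead of wall-clock time, and the early-exit behaviour of the scan falls straight out of the numbers:

    use strict;
    use warnings;

    # Return the index of the first all-zero row (or -1), plus the
    # number of element inspections it took to find it.
    sub first_zero_row {
        my ($m) = @_;
        my $inspections = 0;
        ROW: for my $r ( 0 .. $#$m ) {
            for my $cell ( @{ $m->[$r] } ) {
                ++$inspections;
                next ROW if $cell;        # first non-zero ends this row
            }
            return ( $r, $inspections );  # every cell in row $r was zero
        }
        return ( -1, $inspections );
    }

    my @m = ( [1,0,0], [0,1,0], [0,0,0] );
    my ( $row, $cost ) = first_zero_row( \@m );
    print "first all-zero row: $row, found after $cost inspections\n";

For the 3x3 matrix above, that prints row 2 after 6 inspections, and the count is the same on every machine, every run -- which is rather the point.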


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

