Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Re^6: The 10**21 Problem (Part 3)

by eyepopslikeamosquito (Canon)
on May 17, 2014 at 23:13 UTC ( #1086471=note: print w/ replies, xml ) Need Help??


in reply to Re^5: The 10**21 Problem (Part 3)
in thread The 10**21 Problem (Part 3)

There's no need to align the prefetch pointer.
Whoops, yes you are right. I made a silly mistake in my original test; with that blunder fixed, with your version, there is no difference in running speed (both run in 38 seconds).

Curiously, my version runs in 37 seconds with:

_mm_prefetch(&bytevecM[(unsigned int)m7 & 0xffffff80], _MM_HINT_T0); _mm_prefetch(&bytevecM[64+((unsigned int)m7 & 0xffffff80)], _MM_HINT_T +0);
versus 40 seconds with:
_mm_prefetch(&bytevecM[(unsigned int)m7], _MM_HINT_T0); _mm_prefetch(&bytevecM[(unsigned int)m7 ^ 64], _MM_HINT_T0);
I have no explanation for that, unless perhaps the prefetcher likes to prefetch in order (?).

Thanks for the other tips. Optimizing for the prefetcher seems to be something of a dark art -- if you know of any cool links on that, please let me know.


Comment on Re^6: The 10**21 Problem (Part 3)
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1086471]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (14)
As of 2015-07-02 08:43 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (31 votes), past polls