I really have to go with chip on this one. I played around with optimization for about 1/2 an hour and found that nothing I could do affected the performance by more than 1% or so.
I did not have as much time as I would have liked to spend on it, so I did not run profiling tools on that section, nor look at the actual code generation. I focused on that small section because I have a lot of experience with C, and frequently have to do optimizations of that nature because of the microcontrollers I work with (microcontroller compilers typically produce far worse code than GCC, since GCC benefits from a wide developer base, and most microcontroller compilers are 1 to 5 person shops).
My other interest in speeding it up was a trade off with making it fewer lines. Those 12 or so lines just seemed nasty. If it can be reduced to fewer lines, it often becomes more maintainable. Given that, I have the following replacement:
Normally, Inline::C uses -O2 optimization, which is fairly agressive about fragments like that. It's smart enough to recognize that memcpy()s can be replaced with several instructions if the length to move reduces to a compile time constant.
One would question if the above is actually more maintainable, since the byte copy of new_date = date; is more indicative of what the operation is intending to do. If one is aiming for speed and readability, the new_date = date; relies more on the working knowledge of the compiler.
No matter what the case, this isn't really the area that needs work to speed this puppy up.
In reply to (jcwren) Re: Premature Optimization is the Root of All Evil