This is a very interesting point, and one well worth considering. In Perl, of course, the difference would not be worth mentioning, but in C, my unrolled-loop version is much faster than a version with a loop. I guessed that it would be twice as fast, but when I measured it, it was five times as fast.
in reply to Re^3: 5x6-bit values into/out of a 32-bit word
Now, of course, it was microseconds vs. microseconds, but why are we packing 5 numbers into one 32-bit word? Presumably we care about the space usage, which would only matter if we are using a lot of them, probably millions, so all those microseconds can add up.
The maintenance concerns are, in this specific case, probably not valid. You can't pack 6x6-bit values (36 bits) in a 32-bit word, nor 5x7-bit values (35 bits). If either the count or the width changed, we would have to change the algorithm anyway. And, if you actually write out the loop version, you've probably only saved one line of code.
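For readers who want to see the two shapes side by side, here is a minimal sketch in C. It is not the code from earlier in the thread. The function names (pack5_unrolled, pack5_loop, unpack5_unrolled) and the bit layout (value 0 in the lowest 6 bits) are assumptions made for illustration. It also shows why the sizes are fixed: 5 x 6 = 30 bits fits in 32, while 6 x 6 = 36 and 5 x 7 = 35 do not.

```c
#include <stdint.h>

/* Hypothetical layout: v[0] occupies bits 0..5, v[1] bits 6..11, and so on.
   Each input is masked to 6 bits, so only the low 30 bits of the word are used. */

/* Unrolled version: one expression, no loop counter, no branch. */
static uint32_t pack5_unrolled(const uint8_t v[5])
{
    return  ((uint32_t)(v[0] & 0x3F))
          | ((uint32_t)(v[1] & 0x3F) << 6)
          | ((uint32_t)(v[2] & 0x3F) << 12)
          | ((uint32_t)(v[3] & 0x3F) << 18)
          | ((uint32_t)(v[4] & 0x3F) << 24);
}

/* Loop version: same result, and only a line or so shorter. */
static uint32_t pack5_loop(const uint8_t v[5])
{
    uint32_t w = 0;
    for (int i = 0; i < 5; i++)
        w |= (uint32_t)(v[i] & 0x3F) << (6 * i);
    return w;
}

/* Unrolled unpack: shift each field down and mask off 6 bits. */
static void unpack5_unrolled(uint32_t w, uint8_t v[5])
{
    v[0] = (uint8_t)( w        & 0x3F);
    v[1] = (uint8_t)((w >> 6)  & 0x3F);
    v[2] = (uint8_t)((w >> 12) & 0x3F);
    v[3] = (uint8_t)((w >> 18) & 0x3F);
    v[4] = (uint8_t)((w >> 24) & 0x3F);
}
```

Whether the unrolled form is actually faster depends on the compiler and optimization flags, since many compilers will unroll a fixed five-iteration loop themselves, so it is worth measuring on your own setup.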
I'm not disagreeing with your principles, but I think that in this case I would probably go with my version.
There's a very good essay, The Fallacy of Premature Optimization. One snippet:
Note, however, that Hoare did not say, "Forget about
small efficiencies all of the time." Instead, he said "about 97% of the
time." This means that about 3% of the time we really should worry about
small efficiencies. That may not sound like much, but consider that this
is 1 line of source code out of every 33. How many programmers worry
about the small efficiencies even this often? Premature optimization is
always bad, but the truth is that some concern about small efficiencies
during program development is not premature.