In Perl, of course, the difference would not be worth mentioning, but in C, my unrolled-loop version is much faster than a version with a loop.

Yes, but this is Perl we are talking about, not C: a language where a cosine is really no more expensive than an addition. Perl is about flexibility; C is about speed.

if we are using a lot of them, probably millions

If you have millions of them, then Perl is the wrong language to use.

All I know is that seeing those multiples of 6 raises a red flag, at least as far as Perl is concerned. That code strikes me as fragile, in that it does not adapt to changing requirements readily (which is one of the reasons I code in Perl). Remember the cardinal virtue of Laziness.

Also, while I don't know why BrowserUK wants to do this, I maintain my reasoning is as valid as yours. There are dozens of ways of packing small bitmaps into a 32-bit quantity, and far more ways of packing them into a 64-bit quantity, which matters more as 64-bit CPUs become prevalent.
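
To make the fragility point concrete, here is a minimal sketch of what I mean. It assumes the multiples of 6 are shift amounts for 6-bit fields packed into one integer; that is my guess, not something taken from the code under discussion. The names pack_fields and unpack_fields are made up for illustration. The field width is a parameter, so a change in requirements means changing one argument rather than rewriting a row of hard-coded shifts.

```perl
use strict;
use warnings;

# Hypothetical helper: pack small values into one integer,
# $bits bits per field, first value in the lowest bits.
sub pack_fields {
    my ($bits, @values) = @_;
    my $mask   = (1 << $bits) - 1;    # e.g. 0x3F for 6-bit fields
    my $packed = 0;
    for my $i (0 .. $#values) {
        # Mask each value to the field width, then shift it into place.
        $packed |= ($values[$i] & $mask) << ($i * $bits);
    }
    return $packed;
}

# Hypothetical helper: recover $count fields of $bits bits each.
sub unpack_fields {
    my ($bits, $count, $packed) = @_;
    my $mask = (1 << $bits) - 1;
    return map { ($packed >> ($_ * $bits)) & $mask } 0 .. $count - 1;
}

# Five 6-bit fields use 30 bits, so they fit in a 32-bit quantity.
my $packed = pack_fields(6, 1, 2, 3, 4, 5);
my @fields = unpack_fields(6, 5, $packed);
print "@fields\n";    # prints "1 2 3 4 5"
```

Switching to eight 4-bit fields is just pack_fields(4, ...) and unpack_fields(4, 8, ...). The caller still has to make sure the number of fields times the width does not exceed the integer size of the perl in use.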

