In the pursuit of efficiency, I have come up with a slightly different implementation: Carl1.
In Carl2, I tried to optimize for speed at the expense of a few more characters.
On my WinNT PIII 800 MHz machine with ActiveState 522, both of these are faster than ase's Bit Vector implementation.
One thing that I found interesting is that for n less than about 10,000, Carl1 is significantly faster than Carl2.
The loop control overhead in Carl2 must be pretty significant at those sizes.