That's why Davido's module uses XS and C++. An optimised Perl implementation would have a challenge to re-invent the processor in some ways, whereas in C++, structs can be used to map one datatype over another without any conversion, bit-shifting or reassignment making it easier to simulate a 2^N bit machine, where N is arbitrary.
I think you've missed the point. You said it would be more space efficient to use a bit array vs. a hash. That's right if your inputs are small, as a bit array (in Perl or C) takes only only N bits (N being the max entry), vs. a hash which in Perl is quite large per entry. Storing the result of every prime below 10M takes about 67MB as a hash, while only 1.5MB as a bit array (could be even less with some optimizations). But if I store the result of 10k random 32-bit numbers, then the hash wins by a huge amount.
My example was 1e18. This one entry now takes 1e18 bits to store just the single result. You can't do that, whether it is C or C++ or Perl. You need a different way to think about it. Your first solution is also a non-starter, as you certainly don't want to generate or store the 2.47e16 primes under that value. Unfortunately Math::Prime::FastSieve doesn't work at these large sizes, while Math::Prime::XS and Math::Prime::Util (ntheory) do.
Primality is not the same problem as prime generation.