A problem rather similar to Comparing two arrays, don't you think? Sparse matrix again, this time 3k by 3M booleans, density 1/60. Exact same solution is viable, too: pack your int-vectors as "l/l", merge-sort n query vectors and scan. Perhaps an 8-core machine to achieve 4k queries per sec.
Speed vs memory trade-off is also possible. Pairwise intersections of your 3000 vectors amount to 4.5M vectors of size ~833. Lookup with small n==4 is 6 combinations ie merge 6*833 == 5k elements instead of 4*50k == 200k elements. About 30-fold speed-up at the cost of 14 GB of memory.
GPU-based solution would be quite interesting, but for that you really ought to ask another forum.