Yes, there are sometimes situations where you legitimately must hold millions of data points in memory and have instantaneous access to all of them. In those cases, you must have more than enough RAM, with uncontested access to it. Otherwise you will inevitably hit the "thrash point," and when that happens, the performance degradation is not linear: it is exponential. The curve looks like an up-turned elbow, and you "hit the wall." That is almost certainly what is happening to the OP.
BrowserUK's algorithm is of course more efficient, and he has the RAM. In the absence of that resource, no algorithm would do. (And in this case, the prerequisite of sufficient RAM is implicitly understood.) You can still see just how much time it takes merely to allocate that much data, even in the complete absence of paging contention. And the real work has not yet begun!
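To get a feel for that allocation cost on your own machine, a minimal sketch (Python here for illustration; the element count and machine are assumptions, so adjust to what your RAM can actually hold):

```python
import time

# Hypothetical size: ten million elements. Scale this down if your
# machine starts paging -- the whole point is to measure allocation
# in the ABSENCE of paging contention.
N = 10_000_000

start = time.perf_counter()
data = [0.0] * N  # allocate and zero-initialize N floats
elapsed = time.perf_counter() - start

print(f"Allocated {len(data):,} elements in {elapsed:.3f}s")
```

Even before any "real work" touches those elements, the allocation alone takes measurable wall-clock time, and it grows with N.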
Frequently, large arrays are "sparse," with large swaths of missing or known-default values. In those cases, a hash or another sparse data structure might be preferable. It might also be possible to solve the problem in sections across multiple runs. Benchmark your proposed approaches as early as possible, because with "big data," wrong is "big" wrong.
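The sparse idea can be sketched like this (Python rather than Perl, but the idea carries over directly: a Perl hash plays the same role as the dict here, and the default value is an assumption for illustration):

```python
# Dense representation: one slot per index, even for default values.
N = 1_000_000
dense = [0] * N
dense[42] = 7
dense[999_999] = 3

# Sparse representation: store ONLY the non-default entries.
# Memory use scales with the number of real values, not with N.
DEFAULT = 0
sparse = {42: 7, 999_999: 3}

def lookup(table, i):
    """Return the value at index i, falling back to the default."""
    return table.get(i, DEFAULT)

print(lookup(sparse, 42))      # a stored value
print(lookup(sparse, 12_345))  # a missing index yields the default
```

With two real values out of a million slots, the dict holds two entries where the list holds a million, which is exactly the trade-off that makes a hash attractive for sparse data.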