PerlMonks
Re^5: Out of Memory when generating large matrix (space complexity)
by Anonymous Monk on Mar 07, 2018 at 13:25 UTC
Well, I glimpsed "(space complexity)" in the title and thought there was a glimmer of hope for you yet. You have identified the problem area, but again deftly avoid seeing the light.

Sorting (or deduplicating) is a problem with O(n log n) time complexity. If you have a hash function that distributes the keys well, you can cut the problem down and move some of the complexity into the space domain. Hashes are O(n) in both time and space complexity (list insertion), while a streaming merge is O(1) in space complexity. Partial hashing is also possible: using a hash table of size k, you can modify the algorithm to achieve O(n log(n/k)) time complexity and O(k) space complexity. The k scales well until you break out of the CPU caches, after which it scales rather poorly.

I referenced another thread where someone ran into a brick wall trying to hash just 36M elements. Sort-uniq proved greatly superior in that case. So far, you have
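The partial-hashing idea can be sketched in Perl. This is a minimal toy, not anyone's production code: the bucket count k and the checksum used as a hash function are arbitrary choices for the demo, and within each bucket I dedup with a plain hash rather than a sort, which keeps the sketch short while still showing that peak memory is one bucket's worth of keys (roughly n/k) rather than all n at once.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Partial hashing sketch: spread the keys over k buckets with a cheap
# hash, then deduplicate each bucket on its own.  Only one bucket's
# %seen table lives in memory at a time (~n/k keys, not n).
sub bucket_uniq {
    my ($k, @keys) = @_;
    my @buckets = map { [] } 1 .. $k;
    for my $key (@keys) {
        my $b = unpack('%32W*', $key) % $k;   # trivial checksum as the hash
        push @{ $buckets[$b] }, $key;
    }
    my @unique;
    for my $bucket (@buckets) {
        my %seen;                             # scoped per bucket: O(n/k) space
        push @unique, grep { !$seen{$_}++ } @$bucket;
    }
    return @unique;
}

print join(' ', sort bucket_uniq(4, qw(foo bar baz foo qux bar foo))), "\n";
```

In a real out-of-memory scenario the buckets would be spill files on disk rather than in-memory arrays, and each pass would sort-merge one bucket at a time.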
By the way, I never argued that a hash count was unsuitable. By all means, ++$count{$key} if that works. But you chose to attack a broken clock, and forgot that even a broken clock is right twice a day.
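For the record, the hash-count idiom looks like this; memory grows with the number of *distinct* keys, which is exactly the trade-off discussed above:

```perl
use strict;
use warnings;

# Count occurrences of each key in a single pass over the data.
my %count;
++$count{$_} for qw(a b a c a b);

# Report counts in key order: a=3 b=2 c=1
printf "%s=%d\n", $_, $count{$_} for sort keys %count;
```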
In Section: Seekers of Perl Wisdom

