Re^3: Get unique fields from file

by LanX (Saint)
on Jan 08, 2022 at 15:27 UTC


in reply to Re^2: Get unique fields from file
in thread Get unique fields from file

> Depending upon the data of course, your HoH (hash of hash) structure could consume quite a bit more memory than the actual file size in MB.

This shouldn't be a problem if you apply a sliding window technique° and split the hashes into easily swappable chunks².

The trick is to balance time, space and disk access by minimizing the number of swaps.

This will scale well until you hit the limit set by available disk space.
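
To make the idea concrete, here is a minimal sketch of one way to read it, not a tested solution for the original file. Everything named here is my assumption: the chunk count, the window size, the checksum-based chunk_id(), and the choice of Storable for swapping chunks to disk. Lines are buffered per window, grouped by chunk, and each touched chunk is loaded and stored at most once per window, which is where the swap count gets bounded.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);
use File::Spec;

my $NUM_CHUNKS = 16;      # assumption: number of on-disk hash chunks
my $WINDOW     = 1000;    # assumption: lines buffered before a flush

# Hypothetical chunk selector: a cheap byte checksum of the value.
sub chunk_id {
    my ($value) = @_;
    return unpack('%32C*', $value) % $NUM_CHUNKS;
}

sub chunk_file {
    my ($dir, $id) = @_;
    return File::Spec->catfile($dir, "chunk_$id.sto");
}

# Merge one window of pending values into the on-disk chunks.
# Each touched chunk is swapped in and out exactly once per window.
sub flush_window {
    my ($dir, $pending) = @_;
    for my $id (keys %$pending) {
        my $file = chunk_file($dir, $id);
        my $seen = -e $file ? retrieve($file) : {};
        for my $field (keys %{ $pending->{$id} }) {
            $seen->{$field}{$_} = 1 for keys %{ $pending->{$id}{$field} };
        }
        store($seen, $file);
    }
    %$pending = ();
}

# Read the input window by window; structure is chunk -> field -> value.
sub collect_unique {
    my ($fh, $dir, $sep) = @_;
    my %pending;
    my $count = 0;
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\Q$sep\E/, $line, -1;
        for my $i (0 .. $#fields) {
            $pending{ chunk_id($fields[$i]) }{$i}{ $fields[$i] } = 1;
        }
        flush_window($dir, \%pending) if ++$count % $WINDOW == 0;
    }
    flush_window($dir, \%pending);    # flush the last partial window
}

# Count unique values per field index, one chunk in memory at a time.
sub unique_counts {
    my ($dir) = @_;
    my %counts;
    for my $id (0 .. $NUM_CHUNKS - 1) {
        my $file = chunk_file($dir, $id);
        next unless -e $file;
        my $seen = retrieve($file);
        $counts{$_} += keys %{ $seen->{$_} } for keys %$seen;
    }
    return \%counts;
}
```

Call collect_unique($fh, tempdir(CLEANUP => 1), ',') and then unique_counts() on the same directory. A larger window means fewer swaps but more memory for the pending buffer; more chunks means smaller loads but more files. Because a value always maps to the same chunk, the per-chunk counts can simply be summed.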

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

°) see

²) see
