Re: Alternatives to DB for comparable lists
by mxb (Monk) on May 16, 2018 at 10:04 UTC
If I understand correctly, you wish to obtain the following for each file:
Where the files are distributed over six servers.
This probably depends upon how you are planning to collect all the data, but my personal approach would be to have a small script running on each of the six servers performing the hashing and sending each result back to a common collector. This assumes network connectivity.
I think it would be relatively easy to have a script on each server calculate the tuple of five items for each file and send it over the network to a central collection script. All six servers can hash and send their results to the same collector simultaneously.
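As a rough sketch of the per-server side, something like the following could work. The five items are not spelled out here, so the fields below (host, path, size, mtime, SHA-256 digest) are placeholders, as are the tab-separated line format and the sub names. It uses only core modules and assumes paths contain no tabs or newlines.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA;
use File::Find;
use IO::Socket::INET;
use Sys::Hostname;

# Build one tab-separated record for a file.
# Placeholder fields: host, path, size, mtime, SHA-256 hex digest.
sub record_for {
    my ($path) = @_;
    my @st = stat($path) or return;
    my $digest = Digest::SHA->new(256)->addfile($path, 'b')->hexdigest;
    return join("\t", hostname(), $path, $st[7], $st[9], $digest);
}

# Walk a directory tree and print one record per regular file
# to the given handle (a socket to the collector, or any filehandle).
sub send_records {
    my ($root, $out) = @_;
    find({ no_chdir => 1, wanted => sub {
        return unless -f $_;
        # Skip files that cannot be read instead of aborting the walk.
        my $rec = eval { record_for($_) } or return;
        print {$out} "$rec\n";
    }}, $root);
}

# Usage (hypothetical host and port): agent.pl /data collector.example.com 9000
if (@ARGV == 3) {
    my ($root, $host, $port) = @ARGV;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host, PeerPort => $port, Proto => 'tcp',
    ) or die "cannot connect to $host:$port: $!\n";
    send_records($root, $sock);
    close $sock;
}
```

Since send_records only needs a filehandle, you can point it at STDOUT first to check the output before involving the network.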
While there may be a lot of data to hash, the actual results are going to be small. Therefore, as you know exactly what you are obtaining (the five items of data), I would just go the easiest route and throw them in a SQLite table via DBI and DBD::SQLite.
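A minimal sketch of the storage side might look like this. The table name, the columns and the tab-separated input format are all assumptions for illustration, since the five items are not listed here. It needs DBI and DBD::SQLite from CPAN.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Open (or create) the SQLite file and make sure the table exists.
# Hypothetical schema: one row per file per host.
sub open_db {
    my ($file) = @_;
    my $dbh = DBI->connect("dbi:SQLite:dbname=$file", "", "",
        { RaiseError => 1, AutoCommit => 1 });
    $dbh->do(q{
        CREATE TABLE IF NOT EXISTS files (
            host   TEXT NOT NULL,
            path   TEXT NOT NULL,
            size   INTEGER,
            mtime  INTEGER,
            digest TEXT NOT NULL,
            PRIMARY KEY (host, path)
        )
    });
    return $dbh;
}

# Read tab-separated records from a filehandle and insert them
# inside a single transaction; malformed lines are skipped.
sub store_records {
    my ($dbh, $in) = @_;
    my $sth = $dbh->prepare(
        'INSERT OR REPLACE INTO files (host, path, size, mtime, digest)
         VALUES (?, ?, ?, ?, ?)');
    $dbh->begin_work;
    while (my $line = <$in>) {
        chomp $line;
        my @f = split /\t/, $line, -1;
        next unless @f == 5;
        $sth->execute(@f);
    }
    $dbh->commit;
}

# Usage: collector.pl results.db < records.tsv
if (@ARGV) {
    store_records(open_db($ARGV[0]), \*STDIN);
}
```

Doing all the inserts in one transaction matters with SQLite, as committing every row separately is much slower. The same store_records sub would work on a socket handle if the collector accepts connections itself.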
Then, once you have all the data in your DB, you can perform offline analysis as much as you want, relatively cheaply.
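As one example of that kind of analysis, the sketch below lists every digest that shows up on more than one server. It assumes a hypothetical table files(host, path, size, mtime, digest) in the SQLite file; adjust the names to whatever you actually store.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Return an arrayref of [digest, host_count] rows for every digest
# that appears on more than one host (assumed table: files).
sub shared_digests {
    my ($dbh) = @_;
    return $dbh->selectall_arrayref(q{
        SELECT digest, COUNT(DISTINCT host) AS hosts
        FROM files
        GROUP BY digest
        HAVING COUNT(DISTINCT host) > 1
        ORDER BY digest
    });
}

# Usage: analyse.pl results.db
if (@ARGV) {
    my $dbh = DBI->connect("dbi:SQLite:dbname=$ARGV[0]", "", "",
        { RaiseError => 1 });
    printf "%s on %d hosts\n", @$_ for @{ shared_digests($dbh) };
}
```

Other questions (files missing from one server, same path with different digests) are just different GROUP BY or JOIN queries over the same table.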
As a side note, I'd probably go with SHA-256 rather than MD5: MD5 has known practical collision attacks while SHA-256 has none, and SHA-256 is not that much more computationally expensive.