
Re: Alternatives to DB for comparable lists

by Perlbotics (Chancellor)
on May 16, 2018 at 18:32 UTC (#1214680)

in reply to Alternatives to DB for comparable lists

One approach might be:

  • set up a DB server on your collection host
  • run your MD5 tool on each host and, depending on network availability:
    • with networking: connect to the DB and INSERT the new data on the fly (via the internal network or an SSH/VPN tunnel)
    • w/o networking: output the data line by line in a format that your DB supports for batch loading (store it in a file for offline transport)
  • run your tasks on the DB

Perhaps sending the batch lines to STDOUT is the easiest approach: the tool could then even be invoked via an ssh command issued from the collection host. That also eliminates the need for DB drivers on the hosts being scanned.

Use a header/trailer or a checksum to verify the completeness/integrity of each chunk of lines transmitted, and perhaps also add some useful metadata (creation time, IP, etc.).
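To make the header/trailer idea concrete, here is a minimal sketch using only the core module Digest::MD5. The wire format (the `#BEGIN`/`#END` markers, the tab-separated `md5<TAB>path` lines) and the sub names `format_batch` and `parse_batch` are my own assumptions for illustration, not something your DB or tool prescribes. It also assumes paths contain no tabs or newlines.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical wire format (an assumption for this sketch):
#   #BEGIN host=<name> created=<epoch>
#   <md5>\t<path>                 one line per scanned file
#   #END count=<n> md5=<hex digest over all data lines>

# Runs on the scanned host; print the result to STDOUT or a file.
sub format_batch {
    my ($host, $created, $entries) = @_;    # $entries: [ [md5, path], ... ]
    my @lines = map { join("\t", @$_) } @$entries;
    my $body  = join('', map { "$_\n" } @lines);
    return "#BEGIN host=$host created=$created\n"
         . $body
         . sprintf("#END count=%d md5=%s\n", scalar @lines, md5_hex($body));
}

# Runs on the collection host; dies if the chunk is incomplete or altered.
sub parse_batch {
    my ($text) = @_;
    my @lines   = split /\n/, $text;
    my $header  = shift(@lines) // '';
    my $trailer = pop(@lines)   // '';

    my ($host, $created) = $header =~ /^#BEGIN host=(\S+) created=(\d+)$/
        or die "missing or malformed header\n";
    my ($count, $sum) = $trailer =~ /^#END count=(\d+) md5=([0-9a-f]{32})$/
        or die "missing or malformed trailer (truncated transfer?)\n";

    my $body = join('', map { "$_\n" } @lines);
    die "line count mismatch\n" unless $count == @lines;
    die "checksum mismatch\n"   unless $sum eq md5_hex($body);

    return {
        host    => $host,
        created => $created,
        entries => [ map { [ split /\t/, $_, 2 ] } @lines ],
    };
}
```

On the scanned host, the tool would simply `print format_batch(...)`, and the collection host would call `parse_batch` on the captured output before loading anything. A transfer cut off mid-stream loses the `#END` line, so it is rejected instead of being loaded as a short but plausible-looking list.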


Oh, you asked for DB alternatives... Rough estimate: 750k entries with a mean entry size of ca. 500 bytes give a total size of approx. 375 MB. My experiment with Storable produced a 415 MB file. Reading/writing took ca. 2.0/3.5 s on a moderate PC (3 GHz, SSD).

Merging all data into a native Perl data structure and using Storable for persistence looks feasible. PRO: fast analytics; CON: none of the luxuries that come with a DB.
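A minimal sketch of the Storable route, assuming a hash-of-hashes layout `$db->{$host}{$path} = $md5`. The layout, the sub names `add_entries` and `diff_hosts`, and the sample hosts and paths are hypothetical; this is not the code used for the timing experiment above. `nstore` is used rather than `store` so the file can be read on a machine with a different byte order.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Assumed layout (for this sketch only): $db->{$host}{$path} = $md5

# Merge one host's list into the in-memory structure.
sub add_entries {
    my ($db, $host, $entries) = @_;         # $entries: [ [md5, path], ... ]
    $db->{$host}{ $_->[1] } = $_->[0] for @$entries;
    return $db;
}

# Paths whose digest differs between two hosts, or that exist on only one.
sub diff_hosts {
    my ($db, $h1, $h2) = @_;
    my $left  = $db->{$h1} || {};
    my $right = $db->{$h2} || {};
    my %paths = map { ($_ => 1) } keys %$left, keys %$right;
    my @diff  = grep { ($left->{$_} // '') ne ($right->{$_} // '') }
                sort keys %paths;
    return @diff;
}

my $db = {};
add_entries($db, 'hostA', [
    [ 'd41d8cd98f00b204e9800998ecf8427e', '/etc/motd'  ],
    [ '0cc175b9c0f1b6a831c399e269772661', '/etc/hosts' ],
]);
add_entries($db, 'hostB', [
    [ 'd41d8cd98f00b204e9800998ecf8427e', '/etc/motd'  ],
    [ '900150983cd24fb0d6963f7d28e17f72', '/etc/hosts' ],
    [ '0cc175b9c0f1b6a831c399e269772661', '/etc/extra' ],
]);

# Persist to disk and load again; a temp file keeps the demo self-cleaning.
my (undef, $file) = tempfile(SUFFIX => '.storable', UNLINK => 1);
nstore($db, $file);
my $loaded = retrieve($file);

# Prints /etc/extra and /etc/hosts, one per line.
print "$_\n" for diff_hosts($loaded, 'hostA', 'hostB');
```

The whole structure has to fit in RAM and is rewritten in full on every save, which is the price for skipping the DB.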
