Thank you very much for your help, GrandFather. I apologize for missing the point of your question. indeed building a database seems a plausible thing to do given the large quantity of data i have.
in reply to Re^5: how to merge many files of sorted hashes?
in thread how to merge many files of sorted hashes?
i have about 10,000 such input files, each consisting of a point cloud. i am constructing a hash table for each input file, so in the end i have about 10,000 hashes. (not all hash tables are huge, as most files only have about 20 points as opposed to the 100 points that cause the problem i mentioned here)
eventually what i'd like to do with these hashes is that i will do pairwise comparison and look for common keys between each pair. that information will be used to compute a distance/dissimilarity measure between the pair of point clouds from which the pair of hash tables being compared come from. in the very end i hope to perform clustering on the 10,000 sets of point clouds.