Regarding the concatenations, my implementation has several differences with respect to your original proposal:
- The real coordinates are normalized by dividing by the cutoff (which is relatively small) and then floored, i.e., rounded to the next lower integer. The "concatenation" is numerical: $n = 1000000*$x + 1000*$y + $z + 500500500, where $x, $y, and $z are the floored values. Problem here: each floored value has to fit in three decimal digits after the +500 shift (-500 to 499), so I'm assuming that the entire system is within +/- 500*$cutoff. I'll have to fix that somehow.
- I use a hash to store the atoms. Each "bucket" ($hash{$n}) may contain more than one atom (usually 1-4), so it's actually an array reference. Atoms are Chemistry::Atom objects, which know their own symbol, name, real coordinates, etc.
- The buckets don't need to be sorted. I just loop over each of them and check for bonds with the N^2 algorithm within the bucket (remember N is small). Then I check for bonds with the atoms in the neighboring buckets, which takes N*M checks if there are M atoms in the neighbor. "Neighbor" is defined in a way that avoids counting the same pair twice.
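To make the key arithmetic concrete, here is a minimal sketch of the bucketing step. It is not my actual code: the cutoff value, the cell_key helper, and the plain hashes standing in for Chemistry::Atom objects are all made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(floor);

my $cutoff = 2.0;    # assumed value, same units as the coordinates

# Map real coordinates to the integer key of the cell that contains them.
# cell_key is a hypothetical helper name, not part of any module.
sub cell_key {
    my ($x, $y, $z) = @_;
    my ($i, $j, $k) = map { floor($_ / $cutoff) } $x, $y, $z;
    for my $c ($i, $j, $k) {
        # Each index must fit in three decimal digits after the +500 shift.
        die "coordinate outside +/- 500*cutoff\n" if $c < -500 || $c > 499;
    }
    return 1_000_000 * $i + 1_000 * $j + $k + 500_500_500;
}

# Plain hashes stand in for Chemistry::Atom objects in this sketch.
my @atoms = (
    { symbol => 'C', coords => [  0.1, 0.2, 0.3 ] },
    { symbol => 'H', coords => [  1.0, 0.5, 0.9 ] },
    { symbol => 'O', coords => [ -0.5, 3.1, 4.2 ] },
);

# Each bucket is an array reference holding every atom in that cell.
my %hash;
for my $atom (@atoms) {
    my $n = cell_key(@{ $atom->{coords} });
    push @{ $hash{$n} }, $atom;
}

for my $n (sort { $a <=> $b } keys %hash) {
    printf "%d: %s\n", $n, join ' ', map { $_->{symbol} } @{ $hash{$n} };
}
```

With these made-up coordinates, C and H share the bucket 500500500, and O lands in 499501502 (cell -1, 1, 2).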
Note: the neighbors of $n are $n+1, $n+1000, $n+1000000, etc., for a total of 13 neighbors (a cube is surrounded by 26 cubes, but you only need half of them to avoid counting each pair twice).
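One way to get the 13 offsets is to keep only the positive key differences among the 26 surrounding cells. The following is a rough sketch under that assumption, not the real implementation: the atoms are plain hashes, and a simple distance test against $cutoff stands in for the real bond criterion.

```perl
use strict;
use warnings;
use POSIX qw(floor);

my $cutoff = 2.0;    # assumed value; also used as the bond distance here

# The 13 "forward" neighbors: every (dx, dy, dz) in -1..1 whose key
# difference is positive. The other 13 are reached from the other side.
my @offsets;
for my $dx (-1 .. 1) {
    for my $dy (-1 .. 1) {
        for my $dz (-1 .. 1) {
            my $d = 1_000_000 * $dx + 1_000 * $dy + $dz;
            push @offsets, $d if $d > 0;
        }
    }
}

# Hypothetical helper: integer key of the cell containing a point.
sub cell_key {
    my ($i, $j, $k) = map { floor($_ / $cutoff) } @_;
    return 1_000_000 * $i + 1_000 * $j + $k + 500_500_500;
}

# Simplified stand-in for the real bond test.
sub bonded {
    my ($p, $q) = @_;
    my $d2 = 0;
    $d2 += ($p->{coords}[$_] - $q->{coords}[$_]) ** 2 for 0 .. 2;
    return $d2 <= $cutoff ** 2;
}

my @atoms = (
    { id => 1, coords => [ 0.0, 0.0, 0.0 ] },
    { id => 2, coords => [ 1.0, 0.0, 0.0 ] },
    { id => 3, coords => [ 2.5, 0.0, 0.0 ] },
    { id => 4, coords => [ 9.0, 9.0, 9.0 ] },
);

my %hash;
for my $atom (@atoms) {
    push @{ $hash{ cell_key(@{ $atom->{coords} }) } }, $atom;
}

my @pairs;
for my $n (keys %hash) {
    my @here = @{ $hash{$n} };

    # N^2 check inside the bucket (N is small).
    for my $i (0 .. $#here) {
        for my $j ($i + 1 .. $#here) {
            push @pairs, [ $here[$i], $here[$j] ]
                if bonded($here[$i], $here[$j]);
        }
    }

    # N*M check against each occupied forward neighbor.
    for my $d (@offsets) {
        my $there = $hash{ $n + $d } or next;
        for my $p (@here) {
            for my $q (@$there) {
                push @pairs, [ $p, $q ] if bonded($p, $q);
            }
        }
    }
}

my @found = sort map { join '-', sort { $a <=> $b } $_->[0]{id}, $_->[1]{id} } @pairs;
print scalar(@offsets), " offsets; pairs: @found\n";
```

Because every offset is positive, a pair in adjacent buckets is only ever visited from the bucket with the smaller key, so nothing is counted twice. For the four made-up atoms above, only the pairs 1-2 and 2-3 pass the distance test.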