For most problems like this, I switch to a language with more specialized tools, such as R. It offers at least a dozen clustering methods, along with facilities for estimating the "true" number of clusters. Using it from Perl is fairly simple: write a small R script that reads a file of numbers you supply, and call it from the command line. The output can be graphical or a numeric summary, whatever suits your needs. You can also use the Perl module Statistics::R to interact with R directly, without the intermediate step of running an R script from the command line.
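
As a rough illustration of the Statistics::R route, here is a minimal sketch. It assumes R and Statistics::R are installed, that the input file holds one number per line, and that k-means with three clusters is a sensible choice; the helper name `cluster_values`, the file layout, and `k = 3` are all my own assumptions for the example, not anything specific to your data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Statistics::R;

# Hypothetical helper: cluster a list of numbers with R's kmeans()
# and return an array ref holding one cluster label per input value.
sub cluster_values {
    my ($values, $k) = @_;

    my $R = Statistics::R->new();    # starts an R process behind the scenes
    $R->set('x', $values);           # copy the Perl array into the R vector x
    $R->set('k', $k);

    # kmeans() lives in R's standard stats package. nstart = 10 tries
    # several random starts; set.seed() makes the run repeatable.
    $R->run(q{set.seed(1)});
    $R->run(q{fit <- kmeans(as.numeric(x), centers = k, nstart = 10)});
    $R->run(q{cl <- as.integer(fit$cluster)});

    my $clusters = $R->get('cl');    # array ref when there are several values
    $R->stop();
    return $clusters;
}

# Run as a script: perl cluster.pl numbers.txt
unless (caller) {
    my $file = shift @ARGV or die "usage: $0 numbers.txt\n";
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    my @values = grep { /\S/ } map { chomp; $_ } <$fh>;
    close $fh;

    my $clusters = cluster_values(\@values, 3);   # 3 is an arbitrary guess
    printf "%s\t%s\n", $values[$_], $clusters->[$_] for 0 .. $#values;
}
```

This prints each value next to its cluster label. The labels themselves are arbitrary (which group is called 1, 2, or 3 can differ), so only the grouping is meaningful. If you would rather not guess the number of clusters, that is where R's facilities for estimating it come in; I have left that out to keep the sketch short.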