Re: Best guess for data type

Develop a set of heuristics (e.g. \d+(?:\.\d+)? for numbers, \S+ for non-whitespace strings, etc.)
Apply these to a random sampling of the data
Establish a confidence level that the given data are X
Proceed under that presumption unless proven wrong, in which case change the definition from X to Y
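
Here is a minimal sketch of how those steps might look in Perl. The type names, the patterns, the guess_type function and its sample-size and threshold parameters are all my own assumptions for illustration, not a standard API. The sample is drawn with replacement to keep it cheap on very large columns.

```perl
use strict;
use warnings;

# Hypothetical heuristics, tried from most to least specific.
# Each entry is [ type name, anchored pattern ].
my @heuristics = (
    [ integer => qr/\A[+-]?\d+\z/ ],
    [ number  => qr/\A[+-]?\d+(?:\.\d+)?\z/ ],
    [ string  => qr/\A\S+\z/ ],
);

# Guess a column's type from a random sample of its values.
# Returns (type, confidence), where confidence is the fraction of
# sampled values that matched the winning pattern.
sub guess_type {
    my ($values, $sample_size, $threshold) = @_;
    return ('unknown', 0) unless @$values && $sample_size > 0;

    # Random sample, with replacement, of the column's values.
    my @sample = map { $values->[ rand @$values ] } 1 .. $sample_size;

    for my $h (@heuristics) {
        my ($type, $re) = @$h;
        my $hits = grep { defined($_) && $_ =~ $re } @sample;
        my $confidence = $hits / @sample;
        return ($type, $confidence) if $confidence >= $threshold;
    }
    return ('unknown', 0);
}

my ($type, $confidence) = guess_type([ 12, 7, 1998, 42 ], 100, 0.95);
printf "%s (%.2f)\n", $type, $confidence;
```

The threshold is the "confidence level" from the list above: the first heuristic whose hit rate reaches it wins, and anything below it falls through to the next, looser definition.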

I suppose there are hundreds of other ways to go about this. I chose the above because you could have millions of pieces of data to look at, and exhaustively examining every value in every column would be a bit absurd. Besides, you would probably only need to 'catch' an error when performing an operation on a subset, such as computing a standard deviation, and in that case you would be checking each value anyway.
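
To illustrate the 'catch' idea, here is a rough sketch of a standard deviation routine that validates every value as it goes and dies on the first one that is not numeric, so the caller can revise its guess about the column. The std_dev name and the error message format are assumptions of mine. looks_like_number comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Sample standard deviation (divides by n - 1) with per-value checking.
# Dies if any value is not numeric, which is the signal to go back
# and change the presumed type of the column.
sub std_dev {
    my ($values) = @_;
    my $n = @$values;
    die "need at least two values\n" if $n < 2;

    # First pass: validate each value and accumulate the sum.
    my $sum = 0;
    for my $i (0 .. $#$values) {
        my $v = $values->[$i];
        die sprintf("non-numeric value '%s' at index %d\n", $v // 'undef', $i)
            unless defined $v && looks_like_number($v);
        $sum += $v;
    }
    my $mean = $sum / $n;

    # Second pass: sum of squared deviations from the mean.
    my $ss = 0;
    $ss += ($_ - $mean) ** 2 for @$values;

    return sqrt($ss / ($n - 1));
}

# The column was presumed numeric; fall back if that proves wrong.
my $sd = eval { std_dev([ 2, 4, 4, 'n/a', 5 ]) };
print "presumption failed: $@" unless defined $sd;
```

Wrapping the call in eval keeps the failure local: the presumption of X only gets revisited when an operation that actually depends on it breaks.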

Celebrate Intellectual Diversity
