PerlMonks  

Re: Best guess for data type

by InfiniteSilence (Curate)
on Apr 22, 2013 at 16:51 UTC (#1029924)


in reply to Best guess for data type

  • Develop a set of heuristics, e.g. ^\d+(?:\.\d+)?$ for numbers, ^\S+$ for single-token strings, etc.
  • Apply these to a random sample of the data
  • Establish a confidence level that the given data are of type X
  • Proceed under that presumption unless proven wrong, in which case revise the guess from X to Y
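
The steps above could be sketched roughly as follows. This is only an illustration, written in Python for brevity; the regular expressions work the same way in Perl. The names HEURISTICS and guess_type, the sample size of 100, and the 0.95 threshold are all assumptions made for the example, not anything prescribed here.

```python
import random
import re

# Hypothetical heuristics, ordered from most to least specific.
HEURISTICS = [
    ("integer", re.compile(r"[+-]?\d+")),
    ("number", re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)")),
    ("string", re.compile(r"\S.*", re.DOTALL)),
]


def guess_type(values, sample_size=100, threshold=0.95, seed=None):
    """Guess a column's type from a random sample of its values.

    Returns (type_name, confidence), where confidence is the share of
    sampled values that fully matched the chosen heuristic.
    """
    rng = random.Random(seed)
    if len(values) <= sample_size:
        sample = list(values)
    else:
        sample = rng.sample(values, sample_size)
    if not sample:
        return "unknown", 0.0

    # Take the most specific heuristic that clears the threshold.
    for name, pattern in HEURISTICS:
        hits = sum(1 for v in sample if pattern.fullmatch(v.strip()))
        confidence = hits / len(sample)
        if confidence >= threshold:
            return name, confidence
    return "unknown", 0.0
```

For example, guess_type(["1.5", "2", "3.25"]) yields ("number", 1.0), because only a third of the values pass the integer heuristic but all of them pass the number one. Lowering the threshold makes the guess more tolerant of dirty data, at the cost of more surprises later.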

I suppose there are hundreds of other ways to go about this. I chose the above because you could have millions of values to look at, and exhaustively checking every value in every column would be a bit absurd. Besides, you would probably only need to 'catch' an error when performing an operation on a subset of the data, such as computing a standard deviation. In that case you would be checking each value anyway.
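
A minimal sketch of that "catch it when it matters" idea, again in Python and again with made-up names (stddev, stddev_or_reclassify): the computation itself converts every value, so a wrong presumption shows up as an error at exactly the point where it would have done harm.

```python
import math


def stddev(values):
    """Population standard deviation of a column presumed to be numeric.

    float() validates every value as a side effect; the first value that
    is not numeric raises ValueError.
    """
    numbers = [float(v) for v in values]
    if not numbers:
        raise ValueError("no values to summarise")
    mean = sum(numbers) / len(numbers)
    return math.sqrt(sum((x - mean) ** 2 for x in numbers) / len(numbers))


def stddev_or_reclassify(values, presumed="number"):
    """Return (type_name, result); fall back to 'string' if the guess fails."""
    try:
        return presumed, stddev(values)
    except ValueError:
        # The presumption was proven wrong: revise X to Y.
        return "string", None
```

Calling stddev_or_reclassify(["1", "x"]) returns ("string", None), which is the cue to revise the column's type.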

Celebrate Intellectual Diversity
