Thanks for the suggestion. I agree that fixing the data at the source would be optimal, but it doesn't apply to my current situation.
The entire purpose of my application is to pull data from various outside sources and save it in our database.
"Cleaning" the data is a necessary part of the process.
- '1969-12-31 23:59:59' is a dummy value and not an actual date (think -1), so I'd prefer to transform it to NULL rather than fill my own database with garbage. (But this transformation only applies to one external source.)
- 20101015 is an integer, not a date. I'd prefer '2010-10-15'. (This example is obviously from a different source.)
- 'D129', ' D129', 'D129 ' all mean the same thing. I'd prefer the trimmed version.
- The color 'MAROO' probably means 'Maroon', but the truncated value looks a little silly when stored as-is.
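
To make the four cases above concrete, here is a rough sketch of what I mean by per-field cleaning, written in Python purely for illustration. All function names and the COLOR_FIXES table are made up for this example, and the 'MAROO' mapping is only my guess at what that source intended.

```python
from datetime import date, datetime
from typing import Optional

# Dummy value used by one external source to mean "no date" (think -1).
SENTINEL_TIMESTAMP = "1969-12-31 23:59:59"

# Hypothetical lookup of known truncated colors; 'MAROO' -> 'Maroon' is a guess.
COLOR_FIXES = {"MAROO": "Maroon"}


def clean_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Return None (NULL) for the sentinel, otherwise a parsed datetime."""
    if value is None or value.strip() == SENTINEL_TIMESTAMP:
        return None
    return datetime.strptime(value.strip(), "%Y-%m-%d %H:%M:%S")


def int_to_date(value: int) -> date:
    """Turn an integer such as 20101015 into a real date (2010-10-15)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


def clean_code(value: str) -> str:
    """Strip leading and trailing whitespace so ' D129' and 'D129 ' match."""
    return value.strip()


def clean_color(value: str) -> str:
    """Replace known truncated colors; leave anything else untouched."""
    trimmed = value.strip()
    return COLOR_FIXES.get(trimmed, trimmed)
```

Each source would only get the functions that apply to it, e.g. clean_timestamp for the source with the dummy date and int_to_date for the source that sends integers.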