http://www.perlmonks.org?node_id=871868


in reply to pluggable/dynamic data processing/munging/transforming module?

It sounds like not just the program but also the table design is a hack. In my opinion, if a transformation is to be applied to a column, it should be applicable to every value (row) of that column. So you should end up with (if any) just functions that apply to particular columns. I say "if any" because there's quite a lot you can achieve directly in SQL, which would remove the need to apply functions procedurally, one row at a time. If you could achieve that, it would be much cleaner all round.
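
As a rough illustration of the "do it in SQL" idea, here is a minimal sketch using Python's built-in sqlite3 module. The `staging` table and its `part_code` and `note` columns are hypothetical names made up for this example, not anything from the original question. Each statement cleans a whole column in one pass, with no per-row function calls.

```python
import sqlite3


def clean_columns(conn):
    """Apply set-based cleanup to every row of the hypothetical `staging` table."""
    # Trim stray whitespace from every value in the column at once.
    conn.execute("UPDATE staging SET part_code = TRIM(part_code)")
    # Turn empty strings into NULL for the whole column.
    conn.execute("UPDATE staging SET note = NULLIF(note, '')")
    conn.commit()


def demo():
    """Build a throwaway in-memory table, clean it, and return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (part_code TEXT, note TEXT)")
    conn.executemany(
        "INSERT INTO staging VALUES (?, ?)",
        [("  A1 ", ""), ("B2", "ok")],
    )
    clean_columns(conn)
    rows = conn.execute(
        "SELECT part_code, note FROM staging ORDER BY rowid"
    ).fetchall()
    conn.close()
    return rows
```

Calling `demo()` runs both updates against two sample rows. Whether this approach fits depends on whether the data can be loaded into a staging table before cleaning.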

Re^2: pluggable/dynamic data processing/munging/transforming module?
by rwstauner (Acolyte) on Nov 17, 2010 at 03:37 UTC

    Thanks for the suggestion. I agree that fixing the data at the source would be optimal, but it doesn't apply to my current situation.

    The entire purpose of my application is to pull data from (various) outside sources and bring it inside to save it in our database.

    "Cleaning" the data is a necessary part of the process.

    Some examples:

    • '1969-12-31 23:59:59' is a dummy value and not an actual date (think -1), so I'd prefer to transform it to NULL before filling my own database with garbage. (But this transformation only applies to one external source.)
    • 20101015 is an integer, not a date. I'd prefer '2010-10-15'. (This example is obviously from a different source.)
    • 'D129', '  D129', 'D129  ' all mean the same thing. I'd prefer the trimmed version.
    • The color 'MAROO' probably means 'Maroon', but looks a little silly.
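
The examples above suggest small, reusable cleaning functions attached to columns, with some rules scoped to a single source. The following is only a sketch of that idea in Python, not an existing module. The source names (`source_a`, `source_b`), the column names, and the `RULES` layout are all hypothetical. The same table-of-functions shape would map onto a Perl hash of coderefs.

```python
from datetime import datetime

DUMMY_DATE = "1969-12-31 23:59:59"   # placeholder value used by one source
COLOR_FIXES = {"MAROO": "Maroon"}    # known truncated values and their fixes


def null_dummy_date(value):
    """Replace the dummy timestamp with None (NULL in the database)."""
    return None if value == DUMMY_DATE else value


def int_to_iso_date(value):
    """Convert an integer such as 20101015 to the string '2010-10-15'."""
    return datetime.strptime(str(value), "%Y%m%d").date().isoformat()


def trim(value):
    """Strip surrounding whitespace from strings; leave other types alone."""
    return value.strip() if isinstance(value, str) else value


def fix_color(value):
    """Map known truncated color names to their full spelling."""
    return COLOR_FIXES.get(value, value)


# Hypothetical rule table: source -> column -> ordered list of functions.
# The "*" entry holds rules that apply to every source.
RULES = {
    "*": {"part_code": [trim], "color": [trim, fix_color]},
    "source_a": {"shipped_at": [null_dummy_date]},
    "source_b": {"shipped_at": [int_to_iso_date]},
}


def clean_row(source, row):
    """Return a cleaned copy of `row`, applying global rules, then source rules."""
    cleaned = dict(row)
    for scope in ("*", source):
        for column, funcs in RULES.get(scope, {}).items():
            if column in cleaned:
                for func in funcs:
                    cleaned[column] = func(cleaned[column])
    return cleaned
```

Keeping the rules in a plain data structure means adding a new source is a matter of adding an entry rather than writing new control flow.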