http://www.perlmonks.org?node_id=482758

DrAxeman has asked for the wisdom of the Perl Monks concerning the following question:

I'm having trouble getting proper averages calculated in my tables. All of the data in my table (a CSV file) is wrapped in quotes. For DBI/SQL to calculate the average on the column, I need to remove the quotes. No biggie, except that some data is in scientific notation. When I do my averages, I'm getting outrageous numbers.

Starting data
"5.3271433710011766e-006","5.3271433710011766e-006"

After I strip the "'s and -'s I've got

5.3271433710011766e006,5.3271433710011766e006

The true value of these numbers is

0.00000532714337100118,0.00000532714337100118

How can I get these properly calculated, or should I just grep for these types of numbers and change the value to 0 since they are so small?

Also, if I were to modify the value to 0, what is the regex that would say "if there is an e in the word, change the whole word to something else"?

Replies are listed 'Best First'.
Re: SQL Calculations issue
by AReed (Pilgrim) on Aug 10, 2005 at 21:03 UTC
    If you strip the '-' you're changing the exponent from a negative number to a positive number. The result is that what was a very small number is now very large. You should be able to just strip the quotes and perform your calculation.
      Excellent! It worked. Early in the process I was having problems with come non-alphanumeric characters, so I just stripped them all. Thanks!
Re: SQL Calculations issue
by sgifford (Prior) on Aug 10, 2005 at 21:05 UTC
    Perl seems to understand scientific notation when converting strings to numbers. I think your problem is stripping the minus signs. That means you turn, for example, 3.4e-2 (.034) into 3.4e2 (340). If you just leave the minus sign in place, I think you'll get correct results.

    Also, consider using a CSV module from CPAN for this; it can take care of most of the parsing for you.