http://www.perlmonks.org?node_id=1204982

DazedConfuzed has asked for the wisdom of the Perl Monks concerning the following question:

I've been using Perl scripts as an interface to my MS Access (Office 2007) databases for quite some time but recently ran into what seems to be an obscure issue. I say 'obscure' because internet searches with appropriate keyword terms usually land me on a PerlMonks page where usually quickly find my answer since others have experienced similar issues before me. I can't seem to dig up a pre-existing explanation/solution and so I present a distilled synopsis of the original more complicated problem.

The Setup:

Basically, I've distilled this down to a simple example using a simplified MS Access DB. The database (Play.mdb) is sitting in a folder and a symbolic data source (System DSN) has been created called "PLAY". I have to use the 32 bit version of the ODBC Data Source Administrator as I'm connecting to an older Office 2007 database which is only available in 32 bit architecture (though my Win 7 is 64-bit). Yup. I'm a dinosaur using an old OS and older (purchased) copy of MS Office and I don't feel like upgrading to new software if that is the solution.

The simplified database contains a single table called Test which has but a single row and contains 4 columns generically labeled (A-D). These columns all have the Data Type=Number and columns A-C are defined as Single while I've changed column D to Double.

The Code:

No laughing at inefficient or naïve Perl code (I learn just enough Perl to get by).

use DBI; my ($dbh, $sth, $a, $b, $c, $d); $dbh = DBI->connect('DBI:ODBC:PLAY', '', '', {RaiseError => 1, AutoCom +mit => 1}); $sth = $dbh->prepare("SELECT A, B, C, D FROM Test"); $sth->execute(); ($a, $b, $c, $d) = $sth->fetchrow; print "A=$a\n"; print "B=$b\n"; print "C=$c\n"; print "D=$d\n"; print "\n"; $sth = $dbh->prepare("SELECT A+0, B+0, C+0, D+0 FROM Test"); $sth->execute(); ($a, $b, $c, $d) = $sth->fetchrow; print "A=$a\n"; print "B=$b\n"; print "C=$c\n"; print "D=$d\n";
The Output:
A=6.02518 B= C= D=8.99280607700348E-2 A=6.02517986297607 B=3.50719429552555E-2 C=8.99280607700348E-2 D=8.99280607700348E-2

The Problem:

Basically, in a nutshell, I noticed this when certain calculated values in a database table were being returned as empty (but not 'undef') values from my query. You'll see in the first query that the values for B and C are just plain missing. The second query only varies by adding zero to the values for the columns (which I assume may be holding the computed amount as a Double value rather than Single). This allows you to get an idea of what the missing numbers are in the first query. To test this precision theory I changed column D in the table to a Double instead of a Single and then it started showing up as expected in the first query. It should be noted that this problem only seems to occur when the (single precision) value ends up being expressed as a number in scientific notation (e.g. "8.992806E-02").

As I mentioned earlier, this problem is really as simple of a distillation as I can seem to make from what was a much larger program that generates many MS Access tables and fetches the entire contents of the table with a ($sth->fetchall_arrayref) before pasting it into a programmatically constructed spreadsheet representation of the database table. I've tried to strip-away all that clutter to show a simpler example of scientific notation numbers not being returned as expected (or, at least, as I expected) from a simple query.

As my actual query involves a "SELECT * FROM $table_name" type of a query, the "+0" hack will not work as a solution (would be too ugly to live with anyway). My other option is to go back through all of my code that generates these tables and only use DOUBLE rather than SINGLE in my CREATE TABLE statements. This may provide an appropriate workaround (though I do not need that much precision on these values). The other (more tedious) change might be to find the parts of the code that are calculating these values and specifically round the results to just a handful of digits which should keep some of them from having to be expressed by scientific notation (which seems to be the root of this problem).

As I need to progress on with this, I'll likely try a SINGLE => DOUBLE type solution to see if that will get me past this roadblock but I'd really like to know why it seems that single precision numbers expressed in scientific notation cannot be retrieved through the DBI interface. I'm open to all the enlightenment I can get on this quirky little issue.

Thanks.

Been dazed and confused for so long...

Replies are listed 'Best First'.
Re: Problems with values expressed in scientific notation not being returned through DBI
by huck (Prior) on Dec 07, 2017 at 03:07 UTC

    As a shot in the dark, try adding this

    use Devel::Peek; print "a:\n"; Dump($a); print "b:\n";Dump($b); print "c:\n";Dump($c); print "d:\n";Dump($d);
    After each of the print series

      Interesting, Devel::Peek is a look into the internals of Perl that I never thought I'd need to see. Not as disturbing as watching sausage being made but some gory details about variable storage exposed with this method.

      To see how nulls in the DB ('undef' as far as Perl is concerned) look when dumped, I added a fifth column (E) to the database table with a null value and added that to the query and dumped output. It seems that Dump sends its output to stderr and so I had to comingle the output streams when running this from a DOS command prompt (and then edit the result to collate the two streams). This is the result of the first case where the Single database columns containing values expressed in scientific notation (columns B & C) are returned with a "missing" value (a null byte--not quite the same as an empty string):

      a=6.02518 SV = PV(0x277ae34) at 0x47d2b4 REFCNT = 1 FLAGS = (POK,IsCOW,pPOK) PV = 0x275664c "6.02518"\0 CUR = 7 LEN = 10 COW_REFCNT = 1 b= SV = PV(0x277ae5c) at 0x47da94 REFCNT = 1 FLAGS = (POK,pPOK) PV = 0x275672c "\0"\0 CUR = 1 LEN = 10 c= SV = PV(0x277ae54) at 0x47da34 REFCNT = 1 FLAGS = (POK,pPOK) PV = 0x27567ac "\0"\0 CUR = 1 LEN = 10 d=8.99280607700348E-2 SV = PV(0x277ae4c) at 0x47d134 REFCNT = 1 FLAGS = (POK,IsCOW,pPOK) PV = 0x28ac1bc "8.99280607700348E-2"\0 CUR = 19 LEN = 21 COW_REFCNT = 1 e= SV = PV(0x277aeac) at 0x47d1f4 REFCNT = 1 FLAGS = () PV = 0

      As expected, in the second query where I added zeros to the columns (A+0, B+0, C+0, D+0, E+0) in the query, the problematic B and C columns are recast as double precision values and show up properly and look normal using Dump(). I'm not much on Perl internals and haven't needed to immerse myself in perlguts just yet so I'm hoping that something in the output above sheds some light on why Single values expressed in scientific notation cannot seem to be reliably returned with a fetch using DBI.

        While unsure about what is going on, i must admit that was a totally unexpected result

        i was wondering if you had run into something called a dualvar.

        use strict; use warnings; $|=1; use Scalar::Util qw(dualvar); use Devel::Peek; print "nv1\n"; my $nv1 = 1e-2 ;Dump($nv1); print "nv1: +$nv1 ".($nv1+0)."\n"; Dump($nv1); print "dv2\n"; my $dv2 = dualvar( 1e-2, "hi" );Dump($dv2); print "dv2: +$dv2 ".($dv2+0)."\n"; Dump($dv2);
        Result
        nv1 SV = NV(0xa7c024) at 0xa3e614 REFCNT = 1 FLAGS = (PADMY,NOK,pNOK) NV = 0.01 nv1:0.01 0.01 SV = PVNV(0x3f7364) at 0xa3e614 REFCNT = 1 FLAGS = (PADMY,NOK,pIOK,pNOK) IV = 0 NV = 0.01 PV = 0xa959d4 "0.01"\0 CUR = 4 LEN = 36 dv2 SV = PVNV(0x3f73c4) at 0xa8facc REFCNT = 1 FLAGS = (PADMY,NOK,POK,pNOK,pPOK) IV = 0 NV = 0.01 PV = 0xa45c5c "hi"\0 CUR = 2 LEN = 10 dv2:hi 0.01 SV = PVNV(0x3f73c4) at 0xa8facc REFCNT = 1 FLAGS = (PADMY,NOK,POK,pIOK,pNOK,pPOK) IV = 0 NV = 0.01 PV = 0xa45c5c "hi"\0 CUR = 2 LEN = 10
        Of particular interest here is the line "dv2:hi 0.01" where in string context you get hi, but in numeric you get 0.1. But it doesnt seem that dualvars are your problem.

Re: Problems with values expressed in scientific notation not being returned through DBI
by Anonymous Monk on Dec 06, 2017 at 15:26 UTC
    A floating-point number might be expressed in several ways upon output without changing its value. What does Data::Dumper show for any of these variables? Do any appear as NULL? If you look at the table in MS-Access, what do you see?

      In my more complicated code (before I distilled it down to this simple example), I used Data::Dumper to take a peek at the row returned from the fetch. The actual row contained a few null columns that were correctly returned and displayed as 'undef' but the Single values that were rendered in exponential notation in the DB were not shown as 'undef' but simply as '' (as if the number was a string--possibly, just guessing here, triggered by the "E" in the exponential format).

      When I view the Single values that are giving me problems they appear simply as floating point numbers expressed in scientific notation:
      3.507194E-02
      8.992806E-02

      As I had to develop a workaround for this issue I decided upon limiting the mantissa of these values by rounding these numbers to 5 decimal digits when creating the table by using "Round( ... ,5)" in the code. By limiting the resolution of the mantissa these numbers are kept from switching into scientific notation which is all that is needed to get around this bug so that I can complete my work. My other option was to create my table using Double instead of Single for any floating point numbers but I had a distaste for the excessive precision that resulted. I'm still wondering what is going on deep in the bowels of DBI and if there is some rare subtle bug that has remained hidden lo these many years or if I'm just being boneheaded somehow and don't yet see it.

        It could well be that the data is being returned in character form – and that these characters are somehow not being parsed correctly, e.g. by some driver layer – but that appears quite nonsensical since exponential notation is well-understood to be part of any floating-point number's possible character expression. Could there be some Microsoft DAO or dot-Net option afoot here?