
Re: [Win32, C, and way OT] C floats, doubles, and their equivalence

by BrowserUk (Patriarch)
on Jul 18, 2009 at 18:45 UTC ( [id://781371]=note )


in reply to [Win32, C, and way OT] C floats, doubles, and their equivalence

The problem is rooted in how v6 (and maybe everything pre-v8, but I only have v6 and v8) generates the code. The following 'fixes' the problem, though I realise that it may not be a workable solution for you:

    #include <stdio.h>

    int cmpFsFd( float s, double d ) {
        float tmp = (float)d;
        return s == tmp ? 1 : 0;
    }

    int main(void) {
        double nv = 2.0 / 3;
        float foo = 2.0 / 3;

        if( foo == nv ) printf("True ");
        else printf("False ");

        if( cmpFsFd( foo, nv )) printf("True\n");
        else printf("False\n");

        return 0;
    }
    [19:48:23.40} C:\test>cl float.c /Fefloatv8.exe
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86
    Copyright (C) Microsoft Corporation. All rights reserved.

    float.c
    Microsoft (R) Incremental Linker Version 9.00.30729.01
    Copyright (C) Microsoft Corporation. All rights reserved.

    /out:floatv8.exe
    float.obj

    [19:49:59.60} C:\test>floatv8
    False True
    ----------------------------------------------------
    c:\test>cl float.c /Fefloatv6.exe
    Microsoft (R) 32-bit C/C++ Standard Compiler Version 13.00.9466 for 80x86
    Copyright (C) Microsoft Corporation 1984-2001. All rights reserved.

    float.c
    Microsoft (R) Incremental Linker Version 7.00.9466
    Copyright (C) Microsoft Corporation. All rights reserved.

    /out:floatv6.exe
    float.obj

    c:\test>floatv6
    False True

Basically, when you coerce a double to a float before comparing it to a float, you need to force the compiler to store the coerced value as a float before doing the comparison. That's what my cmpFsFd() is doing. (Insert underscores to taste :)

The reasoning is that it is only when the values are stored to memory (moved out of the FP registers), that the actual rounding/truncation occurs. Whilst values remain within the FP registers they are maintained as 80-bit FP values, regardless of whether they originate as 32-bit or 64-bit FPs.
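To see the effect of the store without reading assembler, here is a minimal sketch. The helper names (store_as_float, float_rounding_error, report_rounding) are made up for illustration, and the use of volatile assumes the compiler then really performs the store and the reload; check the generated code on an old compiler before trusting it.

```c
#include <stdio.h>

/* Hypothetical helper (not from the thread): push a double through a
 * 32-bit memory slot. "volatile" is used on the assumption that the
 * compiler then really emits the store and the reload. */
static float store_as_float(double d)
{
    volatile float slot = (float)d;  /* rounding to float precision happens on this store */
    return slot;
}

/* Difference between the stored float and the original double.
 * Zero only when d is exactly representable as a float. */
static double float_rounding_error(double d)
{
    return (double)store_as_float(d) - d;
}

/* Print the error for one value, e.g. report_rounding(2.0 / 3). */
static void report_rounding(double d)
{
    printf("%.17g stored as float is off by %.3g\n",
           d, float_rounding_error(d));
}
```

Calling report_rounding(0.5) should report an error of zero, since 0.5 fits in a float exactly, whereas report_rounding(2.0 / 3) should not, which is why a comparison that skips the store sees two different values.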

The v8 compiler (and presumably others) does the coercion, (float)nv, by storing to and reloading from a temporary 32-bit memory location:

    ; 15   :     if( foo == (float)nv ) printf("True\n");

        fld     QWORD PTR _nv$[ebp]     ## Load nv onto FPU stack
        fstp    DWORD PTR tv79[ebp]     ## store (and pop) it into a 32-bit (float) temporary
        fld     DWORD PTR tv79[ebp]     ## load it back onto the FPU stack
        fld     DWORD PTR _foo$[ebp]    ## load foo onto the FPU stack
        fucompp                         ## do the comparison
        fnstsw  ax                      ## get the FPU status word into AX
        test    ah, 68                  ## 00000044H (Check for equality?)
        jp      SHORT $LN2@main         ## Jump
        push    OFFSET $SG2485          ## or not ...
        call    _printf

The equivalent code generated by the v6 compiler omits the reload: it stores to the temporary, but then compares using the value still held in the FPU register:

    ; 15   :     if( foo == (float)nv ) printf("True\n");

        fld     QWORD PTR _nv$[ebp]     ## Load nv to FPU stack
        fst     DWORD PTR tv78[ebp]     ## Store it to a temporary but...
                                        ## *** NEVER LOADS IT BACK ***
                                        ## *** And does the comparison between the FPU register
                                        ##     and the memory image of foo ***
        fcomp   DWORD PTR _foo$[ebp]
        fnstsw  ax
        test    ah, 68                  ; 00000044H
        jp      SHORT $L800
        push    OFFSET FLAT:$SG801
        call    _printf

On the v7 compiler, you might get away with using /fp:strict or /fp:precise, but the v6 compiler lacks these options. (For the same reason, I haven't been able to check that theory!)

Maybe someone can come up with a preprocessor macro to map (float)x to something like ( float tmp = (float)d )? (Some of the macros in the Perl sources seem to do equally obscure things, but they fairly make my skin crawl :)

Personally, I'd prefer using cmpFsFd(), and perhaps an editor macro (with manual yea/nay) to change the sources. If the function were marked inline, it might not impose too much of a performance penalty, but you might have to be careful that the compiler doesn't optimise the tmp var away.
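A rough sketch of what such a helper might look like follows. The names cmp_float_double and FLOAT_EQ_DOUBLE are invented here, and marking tmp as volatile is only one conventional way of asking the compiler to keep the store; whether a particular old compiler honours it needs to be verified in the generated assembler. The function is declared static rather than inline because the spelling of inline differs between compilers and C standards.

```c
/* Sketch only: cmp_float_double and FLOAT_EQ_DOUBLE are illustrative
 * names, not part of any existing source tree. */

/* Compare a float with a double after rounding the double to float
 * precision. The volatile temporary is an assumption: it is meant to
 * force the rounded value out to a 32-bit memory location before the
 * comparison reads it back. */
static int cmp_float_double(float s, double d)
{
    volatile float tmp = (float)d;
    return s == tmp ? 1 : 0;
}

/* Wrapper macro so that call sites read as a single expression and the
 * implementation can be swapped per compiler if needed. */
#define FLOAT_EQ_DOUBLE(s, d) cmp_float_double((s), (d))
```

A call site would then change from if( foo == (float)nv ) to if( FLOAT_EQ_DOUBLE( foo, nv ) ), which is the kind of mechanical edit an editor macro could handle.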

Anyway, I hope that is of some use to you.

Reference: http://webster.cs.ucr.edu/AoA/Windows/HTML/RealArithmetica2.html

    11.2.5 Conversions

    The FPU performs all arithmetic operations on 80 bit real quantities. In a sense, the FLD and FST/FSTP instructions are conversion instructions as well as data movement instructions because they automatically convert between the internal 80 bit real format and the 32 and 64 bit memory formats.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: [Win32, C, and way OT] C floats, doubles, and their equivalence
by ikegami (Patriarch) on Jul 19, 2009 at 16:31 UTC

    but you might have to be careful that the compiler doesn't optimise the tmp var away.

    Use volatile to force the compiler to avoid optimising the memory lookup away.

      Use volatile to force the compiler to avoid optimising the memory lookup away

      I think the above suggestion is made in relation to the use of a separate function - which is not the method I've adopted. (I've made use of a temp variable, but it's in the body of the function itself, rather than in a separate function.)

      I find that declaring my temp variable as "volatile" doesn't help me. If I get rid of the #pragma optimize() calls, and declare the temp variable as volatile float temp, the problem remains. Obviously, "volatile" doesn't turn off every kind of optimization, and certainly doesn't turn off the kind of optimization that it needs to (in my case).

      Cheers,
      Rob

        If your code counters what I said, show it.

        Are you saying that the following works:

            #include <stdio.h>

            int cmpFsFd( float s, double d ) {
                float tmp = (float)d;
                return s == tmp ? 1 : 0;
            }

            int main(void) {
                double nv = 2.0 / 3;
                float foo = 2.0 / 3;

                if( foo == nv ) printf("True ");
                else printf("False ");

                if( cmpFsFd( foo, nv )) printf("True\n");
                else printf("False\n");

                return 0;
            }

        And that the following doesn't?

            #include <stdio.h>

            int main(void) {
                double nv = 2.0 / 3;
                float foo = 2.0 / 3;
                volatile float nv_as_float = (float)nv;

                if( foo == nv ) printf("True ");
                else printf("False ");

                if( foo == nv_as_float ) printf("True\n");
                else printf("False\n");

                return 0;
            }

        That makes no sense to me.

Re^2: [Win32, C, and way OT] C floats, doubles, and their equivalence
by syphilis (Archbishop) on Jul 19, 2009 at 06:10 UTC
    I think your detailed investigation demonstrates why the code I posted in response to creamygoodness's post behaves the way it does.

    After much poking and scratching and re-reading of suggestions that have been kindly and thoughtfully presented here, I've eventually come up with using this approach in the PDL code:
        #include <stdio.h>

        #if defined _MSC_VER && _MSC_VER < 1400
        #pragma optimize("", off)
        #endif

        int main(void) {
        #if defined _MSC_VER && _MSC_VER < 1400
            double nv = 2.0 / 3;
            float foo = 2.0 / 3;
            float dummy = (float)nv;
            if(foo == dummy) printf("True ");
            else printf("False ");
        #else
            double nv = 2.0 / 3;
            float foo = 2.0 / 3;
            if(foo == (float)nv) printf("True ");
            else printf("False ");
        #endif

            return 0;
        }
    It seems to be doing the right thing with all of my compilers.
    I don't think that C script needs to have the optimization turned off - the creation of the dummy variable is alone sufficient to get the behaviour I'm after. But for some reason, in the PDL code, creation of the dummy variable is *not*, by itself, sufficient - optimization also needs to be disabled. (I turn it off for the setvaltobad functions, then turn it back on again.)

    Thanks to *all* who replied.

    Cheers,
    Rob
      But for some reason, in the PDL code, creation of the dummy variable is *not*, by itself, sufficient - optimization also needs to be disabled. (I turn it off for the setvaltobad functions, then turn it back on again.)

      That's why I moved the temp var and comparison into a separate function; it forces the compiler to use the temp value from memory for the comparison:

          ; 5    :     return s == tmp ? 1 : 0;

              fld     DWORD PTR _s$[ebp]
              fcomp   DWORD PTR _tmp$[ebp]
              fnstsw  ax
              test    ah, 68          ; 00000044H
              jp      SHORT $L809

      The problem with the test script is that with optimisations enabled, the newer compiler is able to reduce the whole script to a simple printf( "False" ); printf( "True" ); return 0; (even with the use of the sub) as everything is known at compile time:

          ; Listing generated by Microsoft (R) Optimizing Compiler Version 15.00.30729.01

              TITLE   C:\test\float.c
              .686P
              .XMM
              include listing.inc
              .model  flat

          INCLUDELIB LIBCMT
          INCLUDELIB OLDNAMES

          _DATA   SEGMENT
          $SG2527 DB  'True ', 00H
              ORG $+2
          $SG2529 DB  'False ', 00H
              ORG $+1
          $SG2531 DB  'True', 0aH, 00H
              ORG $+2
          $SG2533 DB  'False', 0aH, 00H
          _DATA   ENDS
          PUBLIC  _cmpFsFd
          EXTRN   __fltused:DWORD
          ; Function compile flags: /Ogtpy
          ; File c:\test\float.c
          ;   COMDAT _cmpFsFd
          _TEXT   SEGMENT
          tv135 = 8       ; size = 4
          _s$ = 8         ; size = 4
          _d$ = 12        ; size = 8
          _cmpFsFd PROC   ; COMDAT

          ; 4    :     float tmp = (float)d;
          ; 5    :     return s == tmp ? 1 : 0;

              fld     DWORD PTR _s$[esp-4]
              fld     QWORD PTR _d$[esp-4]
              fstp    DWORD PTR tv135[esp-4]
              fld     DWORD PTR tv135[esp-4]
              fucompp
              fnstsw  ax
              test    ah, 68          ; 00000044H
              jp      SHORT $LN3@cmpFsFd
              mov     eax, 1

          ; 6    : }

              ret     0
          $LN3@cmpFsFd:

          ; 4    :     float tmp = (float)d;
          ; 5    :     return s == tmp ? 1 : 0;

              xor     eax, eax

          ; 6    : }

              ret     0
          _cmpFsFd ENDP
          _TEXT   ENDS
          PUBLIC  _main
          EXTRN   _printf:PROC
          ; Function compile flags: /Ogtpy
          _TEXT   SEGMENT
          _main   PROC

          ; 11   :     double nv = 2.0 / 3;                         ##
          ; 12   :     float foo = 2.0 / 3;                         ## All this and ...
          ; 13   :
          ; 14   :     if( foo == nv ) printf("True ");             ##
          ; 15   :     else printf("False ");

              push    OFFSET $SG2529
              call    _printf

          ; 16   :
          ; 17   :     if( cmpFsFd( foo, nv ) ) printf("True\n");   ## this are optimised away!

              push    OFFSET $SG2531
              call    _printf
              add     esp, 8

          ; 18   :     else printf("False\n");
          ; 19   :
          ; 20   :     return 0;

              xor     eax, eax

          ; 21   : }

              ret     0
          _main   ENDP
          _TEXT   ENDS
          END

        That's why I moved the temp var and comparison into a separate function

        Having a separate function is appealing, but not so straightforward to implement. It would be fine if we had an xs file to fiddle with, but the fact that the source file to be amended is a pd file (not an xs file) adds some complexity to the problem. I think it is possible to use the "separate function" approach - though it's actually "separate functions" (plural), as the templating dictates that we'll need separate functions to handle each of the different data types (ie byte, short, long, long long, double - not just float). There's also the issue of the second arg that gets supplied to the "separate function" - it could be a UV or an IV, not necessarily an NV, so we need to accommodate that as well (probably not difficult).
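For what it's worth, the "separate functions (plural)" idea could be sketched with a generator macro along these lines. Everything here is hypothetical: DEFINE_CMP_STORED and the cmp_stored_* names are invented, plain C types stand in for the PDL and Perl typedefs, the second argument is taken as a double only for simplicity, and the volatile temporary is an assumption about how to keep the store from being optimised away. It says nothing about how this would be wired into a pd file.

```c
/* Hypothetical generator: one comparison function per element type.
 * Each generated cmp_stored_<name>() converts the candidate to the
 * element type, forces it through memory (volatile is an assumption),
 * and only then compares. The candidate must be in range for ctype. */
#define DEFINE_CMP_STORED(name, ctype)                            \
    static int cmp_stored_##name(ctype value, double candidate)   \
    {                                                             \
        volatile ctype tmp = (ctype)candidate;                    \
        return value == tmp ? 1 : 0;                              \
    }

/* Plain C types are used as stand-ins for the real typedefs. */
DEFINE_CMP_STORED(float, float)
DEFINE_CMP_STORED(double, double)
DEFINE_CMP_STORED(short, short)
DEFINE_CMP_STORED(long, long)
```

With that in place, cmp_stored_float( foo, nv ) plays the role of cmpFsFd( foo, nv ), and a UV or IV second argument would need either a cast at the call site or a second generated family taking an integer candidate.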

        It's a much simpler solution if one instead just adds a dummy variable and turns off optimization - which seems to work quite well.

        Cheers,
        Rob
