PerlMonks  

Re^3: porting C code to Perl

by Monk::Thomas (Friar)
on Oct 24, 2017 at 11:43 UTC ( [id://1201960]=note )


in reply to Re^2: porting C code to Perl
in thread porting C code to Perl

The predigit value may set to 0 magically with some C compilers

Since this value is defined as a global variable (not inside a function) and is an integral data type (not part of an object/struct) it is safe to assume 'predigit' is initialized as zero by the compiler. I guess the author of the C code wasn't sure and tried to initialize both variables (just in case) ... and got it wrong. However it doesn't matter, because the '= 0' doesn't actually add anything.

test #1 - define at global scope

#include <stdio.h>

int value1, value2 = 9;

void main(void) {
    printf("%d %d\n", value1, value2);
}
prints '0 9', not '9 9'

test #2 - define in function

#include <stdio.h>

void func(void) {
    int value1, value2 = 9;
    printf("%d %d\n", value1, value2);
}

void main(void) {
    func();
}
may print '996873600 9' or '-1944996480 9' or '-51174016 9'

Re^4: porting C code to Perl
by marioroy (Prior) on Oct 24, 2017 at 14:33 UTC

    However it doesn't matter, because the '= 0' doesn't actually add anything.

    From testing, it does seem to matter, both on the Mac platform and on CentOS 7.3. I compared the output with this site. No warnings are emitted on CentOS 7.3.

    Without '= 0'

    predigit, nines = 0;

    -- Mac OS X, Apple LLVM version 7.3.0 (clang-703.0.31)
    demo.c:41:13: warning: expression result unused [-Wunused-value]
                predigit, nines = 0;
                ^~~~~~~~
    1 warning generated.

    03141592653589793238462643383279542884197169399375105820974944592307816406286208998628734825342117067
                                     |                                                    |
                                     4                                                    7

    -- Linux, gcc version 4.8.5 20150623
    03141592653589793238462643383279542884197169399375105820974944592307816406286208998628734825342117067
                                     |                                                    |
                                     4                                                    7

    With '= 0'

    predigit = 0, nines = 0;

    03141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067
                                     |                                                    |
                                     0                                                    0
      Sorry, for some reason I believed
      int nines = 0; int predigit = 0;
      to read
      int nines, predigit = 0;

      instead and this was what I was talking about (and showcasing in the tests). In this situation '= 0' adds nothing.

      For 'predigit, nines = 0;': You are totally correct, it does make a difference. I assume this is a bug in the original code that was not spotted when the result was verified using the widely deployed Eyeball Mark 1 test suite.

        As you discovered,

        int nines, predigit = 0;

        is not the same as

        int nines = 0; int predigit = 0;

        I'm surprised int nines, predigit = 0; is even valid C syntax. [1]

        However, nines, predigit = 0; is valid C syntax, but treats nines and predigit = 0 as 2 separate expressions. nines is evaluated in void context (so, the warning you got), while 0 is evaluated in an integer context and then assigned to predigit (effectively, predigit is in lvalue context).

        An alternate syntax that would have zero'd both is nines = predigit = 0; which may be what the author intended to write (but he should have gotten the warning).

        ---

        [1] Note that in Perl, my ($nines, $predigit) = 0; will init $nines to 0 and leave $predigit undefined. Also, it doesn't give a warning.
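The difference between the comma form and the chained assignment discussed above can be checked with a small C sketch. The struct and the two function names are hypothetical, made up for this illustration; the `(void)` cast is only there to silence the unused-value warning that clang reported earlier in the thread.

```c
#include <stdio.h>

/* Hypothetical container so both variables can be returned together. */
struct pair {
    int nines;
    int predigit;
};

/* Mirrors "nines, predigit = 0;": the comma operator evaluates nines,
 * throws the value away, and then assigns 0 to predigit only. */
struct pair reset_with_comma(struct pair p)
{
    (void)p.nines, p.predigit = 0;
    return p;
}

/* Mirrors "nines = predigit = 0;": assignment associates right to left,
 * so both members end up as 0. */
struct pair reset_chained(struct pair p)
{
    p.nines = p.predigit = 0;
    return p;
}
```

Starting from nines = 3 and predigit = 7, the comma version leaves nines at 3 and zeroes predigit, while the chained version zeroes both.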

        Hi Monk::Thomas. I apologize for not initially providing the compiler warning with its line number.

        By the way, the explicit initialization to zero can make a difference in some cases. From gcc docs:

        -fno-zero-initialized-in-bss
        If the target supports a BSS section, GCC by default puts variables that are initialized to zero into BSS. This can save space in the resulting code.

        This option turns off this behavior because some programs explicitly rely on variables going to the data section—e.g., so that the resulting executable can find the beginning of that section and/or make assumptions based on that.

        This could be useful, for example if you want your "hot" global variables to go together in .data section for improved data locality. But there are better ways to achieve the same.

      Edit: Two different things are being discussed here, and I was replying to something different from what Mario was discussing. Apologies; there are too many threads and too many versions of the code for me to keep track. The message below discusses the initialization of globals, as some versions of the Rabinowitz/Wagon code just use globals for the variables and rely on the C spec guaranteeing that they get initialized to 0. This is what I interpreted "The predigit value may set to 0 magically with some C compilers" to be referencing.

      The stackoverflow code does initialize them, but Mario points out that the line "predigit, nines = 0" in the middle of the loop is basically the same in this case as "predigit; nines = 0", which sets nines to zero and that's it. On this line, indeed, no proper compiler would modify predigit. It is a confusing statement that leaves the intent unclear. See the original 1995 article for the Pascal code that looks suspiciously similar; there it was indeed supposed to set predigit to 0. Arndt and Haenel's book points out on page 82 that this implementation has some bugs, and they give correct C code.

      But all of this is tangential to initialization in this particular code, since it clearly initializes predigit and nines at the top.

      That's strange. It doesn't do that on my Mac or RH Linux system. Any compiler that does would seem to be broken, as Monk::Thomas points out.

      If this was done inside a function such as main, then it is uninitialized and may or may not be zero depending on the whim of the compiler, but it rightly warns you that you're doing it wrong. The fact that you're getting that warning means your example probably doesn't have that line at file scope (e.g. a global).

      That said, one thing I've learned is that maintenance and readability are important. Principle of least surprise, etc. I would really want to see the variables (1) local to the function, and (2) initialized. If there is some reason they have to be globals, either initialize them explicitly so every reader can clearly see it, or write a comment explaining it. I suppose if you're working deep in Bell Labs in the 70s or Berkeley in the 80s then maybe you could be excused for having a much better opinion of your co-workers and future self. But not now. Witness this whole subthread :). Besides, a nice self-contained function that takes an argument for number of digits seems much more professional. Otherwise you're just making one-off maintenance-hell scripts even though it's C.
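As a rough illustration of that last point, here is a minimal sketch of the Rabinowitz/Wagon spigot as a self-contained function with explicitly initialized locals. The names `spigot_pi` and `put_digit`, the output-buffer interface, and the array length `10*n/3 + 1` are assumptions made for this example, not taken from any of the versions linked in the thread, and the sketch does not include the overflow or accuracy corrections mentioned elsewhere.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append one decimal digit if there is room left. */
static void put_digit(char *out, size_t cap, size_t *pos, int d)
{
    if (*pos + 1 < cap) {
        out[(*pos)++] = (char)('0' + d);
        out[*pos] = '\0';
    }
}

/* Runs n spigot iterations and writes the emitted digits to out.
 * The output starts with the initial predigit 0, e.g. "0314159...".
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
int spigot_pi(int n, char *out, size_t cap)
{
    int len = 0, nines = 0, predigit = 0;   /* every local initialized */
    size_t pos = 0;
    long *a = NULL;

    if (n <= 0 || out == NULL || cap == 0)
        return -1;
    out[0] = '\0';

    len = 10 * n / 3 + 1;
    a = malloc((size_t)len * sizeof *a);
    if (a == NULL)
        return -1;
    for (int i = 0; i < len; i++)
        a[i] = 2;

    for (int j = 0; j < n; j++) {
        long q = 0;
        for (int i = len; i > 0; i--) {
            long x = 10 * a[i - 1] + q * i;
            a[i - 1] = x % (2 * i - 1);
            q = x / (2 * i - 1);
        }
        a[0] = q % 10;
        q /= 10;

        if (q == 9) {
            nines++;                        /* hold the 9 until we know about a carry */
        } else if (q == 10) {
            put_digit(out, cap, &pos, predigit + 1);
            for (; nines > 0; nines--)
                put_digit(out, cap, &pos, 0);
            predigit = 0;                   /* the reset that "predigit, nines = 0;" skips */
        } else {
            put_digit(out, cap, &pos, predigit);
            for (; nines > 0; nines--)
                put_digit(out, cap, &pos, 9);
            predigit = (int)q;
        }
    }

    free(a);
    return 0;
}
```

A caller would pass a buffer, for example `char buf[64]; spigot_pi(20, buf, sizeof buf);`, and print `buf` afterwards. The last held predigit is not flushed, which matches the behavior of the code being discussed.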

      In the original posting's link, the next answer has a version of the Winter/Flammenkamp implementation (without overflow correction). It still uses globals (boo) but at least initializes some, and even includes comments. Hilariously even with all of that, it still relies on d being set to 0 because it is a global.

      Re the multiple variables on a line with / without initialization: I've started doing more of that. Trying to save precious newlines, I suppose :/ I still find I dislike the "b=c-=14" sort of construct used here, but it bothers me less now than it did 5 years ago.

Re^4: porting C code to Perl
by haukex (Archbishop) on Oct 24, 2017 at 18:00 UTC
    it is safe to assume 'predigit' is initialized as zero by the compiler

    One should be careful with this assumption: I've seen several C compilers for microcontrollers that don't do this by default, in order to save on instruction memory (which can matter on tiny uC's).

      So you're saying that this code:

      #include <stdio.h>

      int value;

      void main(void) {
          printf("%d\n", value);
      }

      may output something other than 0 on these platforms/compilers? Looking at ANSI C (as the smallest common base) I would say these compilers are in violation of the C Standard and cannot truly be called C. (But I may be wrong - I'm not a greybearded C guru.)

      From the C89 standard, section 3.5.7

      If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant.

      (I am deliberately quoting C89 and not a more recent standard to make sure this isn't a later addition / clarification.)
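The quoted rule can be checked on a conforming hosted compiler with a short sketch like the following. All names here are hypothetical and chosen for this example only; none of the objects has an explicit initializer.

```c
#include <stddef.h>

/* Hypothetical aggregate with arithmetic and pointer members. */
struct sample {
    int count;
    double ratio;
    char *name;
};

/* Objects with static storage duration and no explicit initializer. */
static int file_scope_int;
static struct sample file_scope_struct;
static unsigned char file_scope_buf[256];

/* Returns 1 if every object above (and a function-level static) reads as
 * zero / null, which is what the standard requires; returns 0 otherwise. */
int all_statics_are_zero(void)
{
    static long function_static;   /* static storage duration as well */

    if (file_scope_int != 0 || function_static != 0)
        return 0;
    if (file_scope_struct.count != 0 ||
        file_scope_struct.ratio != 0.0 ||
        file_scope_struct.name != NULL)
        return 0;
    for (size_t i = 0; i < sizeof file_scope_buf; i++) {
        if (file_scope_buf[i] != 0)
            return 0;
    }
    return 1;
}
```

On a desktop gcc or clang this is expected to return 1. A toolchain whose startup code skips the zeroing step could make it return 0.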

        So you're saying that this code ... may output something other than 0 on these platforms/compilers?

        Yes, exactly. Of course, this was something like over five years ago and the compiler may very well be obsolete by now. I don't have much time right now to dig out references, but based on my memory, this was either on a Microchip PIC or TI MSP430, and the gist of it was this: Say you defined a 256-byte char array as a global variable, then what this compiler would do to initialize it is create 256 null bytes in instruction memory, and copy those over to RAM byte by byte. Of course on a tiny uC that was a huge waste of instruction memory, and it was much better to drop that initialization routine (or the compiler did this by default, I don't remember exactly), leave the variables uninitialized, and just write a for loop to initialize them yourself, or use memset, which amounts to the same thing.

        Since these are uCs without an OS, one might suspect that when the uC starts it jumps right into main (since that's basically what it does when you code in assembly), but that's not the case - C needs a bit of initialization (like setting up interrupt vectors), so the compiler had some per-uC assembler files that took over the initialization tasks and that were set up to run on startup; these assembly routines would then do their thing and jump into main. And IIRC there were different versions of these initialization routines, some that would initialize global memory and some that wouldn't, and you could even hack them yourself if you needed custom initialization. AFAIK more modern compilers have much more efficient code to initialize RAM, but basically the thing I took away from this was to always be suspicious of uninitialized variables in C :-)
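The workaround described above, initializing globals yourself instead of relying on the startup code, might look roughly like this. The buffer name, its size, and the `init_globals` function are hypothetical; on a real microcontroller project the call would go at the top of `main` before anything reads the variables.

```c
#include <string.h>

#define RX_BUF_SIZE 256            /* hypothetical buffer size */

/* Globals that the startup code may or may not have zeroed. */
static unsigned char rx_buf[RX_BUF_SIZE];
static int rx_count;

/* Explicit initialization, so the program does not depend on the
 * toolchain's startup routine clearing static storage. */
void init_globals(void)
{
    memset(rx_buf, 0, sizeof rx_buf);
    rx_count = 0;
}
```

A plain for loop over `rx_buf` does the same job where `memset` is not available or is too large for the target.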
