PerlMonks  

Is the documentation for Perl 5.20 'pack' correct?

by flexvault (Monsignor)
on Jul 06, 2015 at 20:19 UTC ( [id://1133435]=perlquestion )

flexvault has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I saw that 5.20.0 was released, so I looked to see whether 'pack' had a 64-bit 'network' integer. Instead I found the following description in the documentation for 'pack':

    Byteorders "1234" and "12345678" are little-endian; "4321" and "87654321" are big-endian.

To me that is wrong. I verified it by 'pack'ing 16909060 (hex '01020304') with the 'N' network parameter and writing the result to disk, and the bytes were in the order I expected.

Am I losing it, or should the documentation say the opposite?
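
On the 64-bit 'network' integer mentioned above, here is a minimal sketch rather than a definitive answer. It assumes Perl 5.10 or later (where the '>' modifier exists) and, for the 'Q' format, a perl built with 64-bit integers. Both helper names are hypothetical; the 'NN' variant is a fallback that packs two 32-bit halves and needs no 64-bit support.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: pack two 32-bit halves as one 64-bit big-endian value.
# 'N' is always 32-bit big-endian, so this works on any perl.
sub pack_u64_network_from_halves {
    my ( $high, $low ) = @_;
    return pack( 'NN', $high, $low );
}

# Hypothetical helper: pack a 64-bit integer with 'Q>' (big-endian forced).
# Assumes 64-bit integer support; returns nothing when it is missing.
sub pack_u64_network {
    my ($value) = @_;
    return unless $Config{ivsize} >= 8;
    return pack( 'Q>', $value );
}

print unpack( 'H*', pack_u64_network_from_halves( 0x01020304, 0x05060708 ) ), "\n";

my $packed = pack_u64_network(72623859790382856);    # 0x0102030405060708
print defined $packed ? unpack( 'H*', $packed ) : "no 64-bit 'Q' on this perl", "\n";
```

On a 64-bit perl both lines should read 0102030405060708, whatever the native byte order of the machine.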

Regards...Ed

"Well done is better than well said." - Benjamin Franklin

Replies are listed 'Best First'.
Re: Is the documentation for Perl 5.20 'pack' correct?
by Corion (Patriarch) on Jul 06, 2015 at 20:27 UTC

    The Wikipedia article on Endianness even has examples that try to illustrate the issue. Depending on how you display "increasing address space", the values/ordering may appear the other way around.

    On the subject itself, you'll have to tell us what the byte order on disk was :-)

      Depending on how you display "increasing address space", the values/ordering may appear the other way around.

      The OP was using strings and Perl strings (displayed in English) increase addresses to the right.

      So, yes, the quoted documentation disagrees with wikipedia.

      - tye        

        Wikipedia neither agrees nor disagrees with the OP's quote, since the part he quoted makes no sense out of context. 12345678 has no inherent endianness.

        Wikipedia neither agrees nor disagrees with the docs, as it is silent on the value output by perl -V:byteorder.

        'So, yes, the quoted documentation disagrees with wikipedia.'

        And what does that mean?

      Hello Corion,

        ...you'll have to tell us what the byte order on disk was :-)

      Displaying the disk file in hex, the packed number 16909060 is:

      my $packed_number = pack( "N", 16909060 );
      displayed as hex '01020304' :-)

      Further, you can just do the following:

      my $packed_number = pack( "N", 16909060 ); print unpack("H8",$packed_number),"\n";
      which yields hex '01020304'.
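
The on-disk check described here can be sketched as follows. This is only an illustration: the helper name is hypothetical, and it uses the core File::Temp module for a scratch file. It writes the packed value in binary mode, reads the raw bytes back, and prints them in file order.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: pack $value with $template, write it to a scratch
# file, read it back, and return the bytes in file order as hex pairs.
sub disk_bytes_hex {
    my ( $template, $value ) = @_;

    my ( $out, $filename ) = tempfile( UNLINK => 1 );
    binmode $out;
    print {$out} pack( $template, $value );
    close $out or die "close failed: $!";

    open my $in, '<:raw', $filename or die "open failed: $!";
    local $/;    # slurp the whole file
    my $raw = <$in>;
    close $in;

    return join ' ', map { sprintf '%02x', $_ } unpack 'C*', $raw;
}

print disk_bytes_hex( 'N', 16909060 ), "\n";    # 01 02 03 04
print disk_bytes_hex( 'V', 16909060 ), "\n";    # 04 03 02 01
```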

      Regards...Ed

      "Well done is better than well said." - Benjamin Franklin

Re: Is the documentation for Perl 5.20 'pack' correct?
by BrowserUk (Patriarch) on Jul 06, 2015 at 21:20 UTC

    I'd concur that the docs are misleading, if not outright wrong:

        #! perl -slw
        use strict;
        use Inline C => Config => BUILD_NOISY => 1;
        use Inline C => <<'END_C', NAME => 'Endian', CLEAN_AFTER_BUILD =>0;

        SV *hexDump( UV in ) {
            int i;
            char *p = (char*)&in;
            SV *out = newSVpvn( NULL, 0 );
            for( i=0; i<8; ++i ) {
                sv_catpvf( out, "%02x", p[i] );
            }
            return out;
        }

        END_C

        print hexDump( 72623859790382856 );

        __END__
        C:\test>endian
        Use of uninitialized value in subroutine entry at C:\test\endian.pl line 19.
        0807060504030201

    (Aside: If anyone groks why I get the uninitialized value in subroutine entry warning; please enlighten me?)

    But you have to factor in the context and tools you use to display the results.

    For example, the first two below clearly demonstrate big and little endian respectively; the rest show how easy it is to get misleading results:

        C:\test>p1
        [0] Perl> print unpack 'C*', pack 'Q>', 72623859790382856;;
        1 2 3 4 5 6 7 8

        [0] Perl> print unpack 'C*', pack 'Q<', 72623859790382856;;
        8 7 6 5 4 3 2 1

        [0] Perl> print unpack 'H*', pack 'Q<', 72623859790382856;;
        0807060504030201

        [0] Perl> print unpack 'h*', pack 'Q<', 72623859790382856;;
        8070605040302010

        [0] Perl> print unpack 'H*', pack 'Q>', 72623859790382856;;
        0102030405060708

        [0] Perl> print unpack 'h*', pack 'Q>', 72623859790382856;;
        1020304050607080

        [0] Perl> print unpack 'b*', pack 'Q>', 72623859790382856;;
        10000000 01000000 11000000 00100000 10100000 01100000 11100000 00010000

        [0] Perl> print unpack 'B*', pack 'Q>', 72623859790382856;;
        00000001 00000010 00000011 00000100 00000101 00000110 00000111 00001000

        [0] Perl> print unpack 'b*', pack 'Q<', 72623859790382856;;
        00010000 11100000 01100000 10100000 00100000 11000000 01000000 10000000

        [0] Perl> print unpack 'B*', pack 'Q<', 72623859790382856;;
        00001000 00000111 00000110 00000101 00000100 00000011 00000010 00000001
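
The misleading cases above come from the display format, not from the byte order. 'H' prints the high nybble of each byte first and 'h' the low nybble first, just as 'B' and 'b' differ in bit order within a byte. A small sketch, using a hypothetical byte_dump helper and 32-bit values so it runs on any perl:

```perl
use strict;
use warnings;

# Hypothetical helper: show each byte of a packed string as two hex digits,
# in string (memory) order, separated by spaces.
sub byte_dump {
    my ($packed) = @_;
    return join ' ', map { sprintf '%02x', $_ } unpack 'C*', $packed;
}

my $big    = pack( 'N', 0x01020304 );    # big-endian
my $little = pack( 'V', 0x01020304 );    # little-endian

print byte_dump($big),    "\n";          # 01 02 03 04
print byte_dump($little), "\n";          # 04 03 02 01

# Same bytes, same order; only the digits inside each byte are swapped.
print unpack( 'H*', $big ), "\n";        # 01020304
print unpack( 'h*', $big ), "\n";        # 10203040
```

Dumping with 'C*' (or 'H*') keeps the digits of each byte in the usual reading order, which is why those are the safer choices when checking endianness by eye.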

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
    I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!
      (Aside: If anyone groks why I get the uninitialized value in subroutine entry warning; please enlighten me?)

      Can't say that I actually grok it, but if you remove the sv_catpvf() call it goes away.

      Cheers,
      Rob

        Okay. I found a way to make it go away. Initialising out silences the warning:

            #! perl -slw
            use strict;
            use Inline C => Config => BUILD_NOISY => 1;
            use Inline C => <<'END_C', NAME => 'Endian', CLEAN_AFTER_BUILD =>0;

            SV *hexDump( UV in ) {
                int i;
                char *p = (char*)&in;
                SV *out = newSVpvn( "?", 1 );
                for( i=0; i<8; ++i ) {
                    sv_catpvf( out, "%02x", p[i] );
                }
                return out;
            }

            END_C

            print hexDump( 72623859790382856 );

            __END__
            C:\test>endian
            ?0807060504030201

        Which I guess means that the doc for newSVpvn() that reads: "If the s argument is NULL the new SV will be undefined." means something different from what I took it to mean.

        This also works:

            SV *hexDump( UV in ) {
                int i;
                char *p = (char*)&in;
                SV *out = newSVpvn( "", 0 );
                for( i=0; i<8; ++i ) {
                    sv_catpvf( out, "%02x", p[i] );
                }
                return out;
            }

      Did you actually read the docs? We're talking about perl -V:byteorder.

        ikegami,

        With all due respect, the topic is not about Perl's definition of byteorder, but the definition of and examples for little-endian and big-endian in the on-line documentation for Perl's 'pack' function. Perl's byteorder is academic! Please see the following from an RS/6000 running the AIX 5.2 operating system:

          pyrperl -v

          This is perl 5, version 12, subversion 2 (v5.12.2) built for aix

          Copyright 1987-2010, Larry Wall

          Perl may be copied only under the terms of either the Artistic License or the
          GNU General Public License, which may be found in the Perl 5 source kit.

          Complete documentation for Perl, including FAQ lists, should be found on
          this system using "man perl" or "perldoc perl". If you have access to the
          Internet, point your browser at http://www.perl.org/, the Perl Home Page.

        and

          pyrperl -V:byteorder
          byteorder='4321';

        and

          pyrperl -e '$p=pack("N",16909060); print unpack("H8",$p),"\n";'
          01020304
        Until now, I did not know that the RS/6000 has a software switch on the motherboard to indicate running in little-endian or big-endian. When I read the Camel book, and it said that the "N" parameter of 'pack' put the result in Network or big-endian format, I knew what that meant and it didn't have an example. My original problem was with the on-line documentation.

        In googling this, many authors seem to be guessing!

        Regards...Ed

        "Well done is better than well said." - Benjamin Franklin

Re: Is the documentation for Perl 5.20 'pack' correct?
by ikegami (Patriarch) on Jul 06, 2015 at 23:58 UTC

    The docs are correct.

    For example, x86 is a LE platform, and byteorder is 1234 (32-bit IV) or 12345678 (64-bit IV) on that platform.

        >perl -V:archname
        archname='MSWin32-x86-multi-thread';
        >perl -V:byteorder
        byteorder='1234';

        >perl -V:archname
        archname='MSWin32-x86-multi-thread-64int';
        >perl -V:byteorder
        byteorder='12345678';

    x64 is also LE.

        >perl -V:archname
        archname='MSWin32-x64-multi-thread';
        >perl -V:byteorder
        byteorder='12345678';

    I don't have access to a BE platform to confirm that byteorder is 4321 or 87654321 there.
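
For anyone who wants to check their own platform, the mapping quoted from the docs can be written down directly. This is a sketch with a hypothetical helper name; it only encodes the four strings the documentation mentions and reports anything else as unknown.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: map a byteorder string, as printed by
# perl -V:byteorder, to the name the pack documentation gives it.
sub endianness_from_byteorder {
    my ($byteorder) = @_;
    return 'little-endian' if $byteorder eq '1234' || $byteorder eq '12345678';
    return 'big-endian'    if $byteorder eq '4321' || $byteorder eq '87654321';
    return 'unknown';
}

printf "byteorder=%s => %s\n",
    $Config{byteorder}, endianness_from_byteorder( $Config{byteorder} );
```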

      byteorder is 12345678 on that platform.

      That is a completely meaningless statement:

          print $Config{ archname };;
          MSWin32-x64-multi-thread

          $n = 0x12345678;;

          print unpack 'C*', pack 'V', $n;;
          120 86 52 18

          print unpack 'C*', pack 'NV', $n;;
          18 52 86 120 0 0 0 0

          print unpack 'C*', pack 'N', $n;;
          18 52 86 120

          print unpack 'C*', pack 'Q<', $n;;
          120 86 52 18 0 0 0 0

          print unpack 'C*', pack 'Q>', $n;;
          0 0 0 0 18 52 86 120

      Until you define what the phrase "byteorder is 12345678" means; it means nothing useful.

      It could mean:

      • The decimal value 12345678, is stored in memory as ...
      • Or, the eight bytes of a 64-bit value are stored in memory as ...
      • Or, the hex value 12345678 is stored in memory as ...
      • Or, the sequence of ascii characters 12345678 is stored in memory as ...

      Except that the ... in any of those statements is not specified; which renders the statement -- whatever the intended interpretation -- completely meaningless.


        Stop making a fool of yourself and go read the docs as I previously suggested. We're talking about perl -V:byteorder.

Re: Is the documentation for Perl 5.20 'pack' correct?
by flexvault (Monsignor) on Jul 07, 2015 at 23:12 UTC

    Dear Monks,

    I have the following script which I ran on 2 different machines:

        #!/usr/local/bin/perl
        #
        use strict;
        use warnings;
        use Config;

        print "Byte order: $Config{byteorder}\n";       # expect byteorder='4321' or '1234';

        my $hexstring = "01020304";                     # Hex number
        my $decnumber = hex $hexstring;
        print "Hex: $hexstring is Dec: $decnumber\n";   # expect 16909060

        my $netnumber = pack( "N", $decnumber );
        print "Dec: $decnumber is 'N': '",unpack("H8",$netnumber),"' as Network or Big-endian\n";

        my $Nonnetnumber = pack( "V", $decnumber );     # Vax or little-endian
        print "Dec: $decnumber is 'V': '",unpack("H8",$Nonnetnumber),"' as Vax or Little-endian\n";

        $Nonnetnumber = pack( "L", $decnumber );        # as native format
        print "Dec: $decnumber is 'L': '",unpack("H8",$Nonnetnumber),"' as native format\n";

    The results were:

        On Aix 5.2 on RS/6000 ( powerpc-ibm-aix5.2.0.0 )
        # pyrperl ./pack.pl
        Byte order: 4321
        Hex: 01020304 is Dec: 16909060
        Dec: 16909060 is 'N': '01020304' as Network or Big-endian
        Dec: 16909060 is 'V': '04030201' as Vax or Little-endian
        Dec: 16909060 is 'L': '01020304' as native format

        On Debian Linux on 64bit AMD ( Debian 4.4.5-8 )
        # pyrperl ./pack.pl
        Byte order: 1234
        Hex: 01020304 is Dec: 16909060
        Dec: 16909060 is 'N': '01020304' as Network or Big-endian
        Dec: 16909060 is 'V': '04030201' as Vax or Little-endian
        Dec: 16909060 is 'L': '04030201' as native format

    The results agree with Writing endian-independent code in C. My knowledge of big-endian came from writing machine to machine transfer software for IBM in the '70s.

    The background was that IBM 32-bit mainframes were big-endian and expensive, and the new inexpensive (???) 8-bit boxes were little-endian.

    As you can see the 'N' and 'V' 'pack' formats are the same on both architectures but the native formats are different. Perl's 'pack' is correct, but the documentation examples are incorrect.
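
The observation that 'N' and 'V' are fixed while 'L' follows the machine can be turned into a small self-check. This is a sketch with a hypothetical helper name, not part of the script above; it compares the native 32-bit layout against the two fixed layouts.

```perl
use strict;
use warnings;

# Hypothetical helper: name the platform's byte order by comparing the
# native 32-bit layout ('L') with the two fixed-order layouts.
sub native_order {
    my $value  = 16909060;    # 0x01020304
    my $native = pack( 'L', $value );
    return 'big-endian'    if $native eq pack( 'N', $value );
    return 'little-endian' if $native eq pack( 'V', $value );
    return 'other';
}

printf "N: %s\n", unpack( 'H8', pack( 'N', 16909060 ) );    # 01020304 everywhere
printf "V: %s\n", unpack( 'H8', pack( 'V', 16909060 ) );    # 04030201 everywhere
printf "L: %s (%s)\n", unpack( 'H8', pack( 'L', 16909060 ) ), native_order();
```

Only the last line should change between the two machines shown above.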

    Regards...Ed

    "Well done is better than well said." - Benjamin Franklin

      What examples? Can you produce a patch for perlfunc pointing out what is wrong?

Re: Is the documentation for Perl 5.20 'pack' correct?
by Anonymous Monk on Jul 07, 2015 at 00:31 UTC

    But that is a different number :)

    https://metacpan.org/pod/perlfunc#pack says

        The integer formats s, S, i, I, l, L, j, and J ...

        For example, a 4-byte integer 0x12345678 (305419896 decimal) would be
        ordered natively (arranged in and handled by the CPU registers) into bytes as

            0x12 0x34 0x56 0x78  # big-endian
            0x78 0x56 0x34 0x12  # little-endian

    So I type

        $ perl -V:byteorder
        byteorder='1234';
        $ perl -e"print 0x12345678 "
        305419896
        $ perl -e"print unpack q{H*}, pack q{N*}, 0x12345678 "
        12345678
        $ perl -e"print unpack q{H*}, pack q{N*}, 305419896 "
        12345678
        $ perl -e"print sprintf q{%x}, 305419896 "
        12345678

    Which seems to agree with the above, and with

        N  An unsigned long (32-bit) in "network" (big-endian) order.

    N packs 305419896 correctly as big-endian, and my machine is big endian (as sprintf shows)

      The docs don't say the output of perl -V:byteorder has any relationship to 0x12345678. It simply (and correctly) says it's 1234 or 12345678 on LE systems.

        The docs don't say the output of perl -V:byteorder has any relationship to 0x12345678. It simply (and correctly) says it's 1234 or 12345678 on LE systems.

        You're a great communicator ikegami

Re: Is the documentation for Perl 5.20 'pack' correct?
by Anonymous Monk on Jul 07, 2015 at 00:53 UTC
Re: Is the documentation for Perl 5.20 'pack' correct?
by RichardK (Parson) on Jul 07, 2015 at 16:48 UTC

    It looks perfectly clear to me.

    If you're storing a 32 bit integer, on a little endian machine you get

    [byte 1][byte 2][byte 3][byte 4] increasing memory address ->

    while on a big endian machine you get

    [byte 4][byte 3][byte 2][byte 1] increasing memory address ->

    Obviously, big endian is superior and the only true way to store data in memory ;)
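
The two diagrams can be reproduced with pack, reading "byte 1" as the least significant byte. A sketch, with hypothetical names: it packs 0x04030201, whose bytes carry their own rank, in little-endian ('V') and big-endian ('N') order and labels each byte as it sits in memory.

```perl
use strict;
use warnings;

# 0x04030201: the least significant byte holds 1 and the most significant
# holds 4, so each byte's value doubles as its significance rank.
use constant RANKED => 0x04030201;

# Hypothetical helper: label each byte of the packed value in memory order.
sub layout {
    my ($template) = @_;    # 'V' = little-endian, 'N' = big-endian
    return join '', map { sprintf '[byte %d]', $_ } unpack 'C*', pack( $template, RANKED );
}

print layout('V'), " increasing memory address ->\n";
print layout('N'), " increasing memory address ->\n";
```

Dropping the labels and keeping only the digits gives 1234 for the first line and 4321 for the second, which is the same notation the byteorder strings use.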

Re: Is the documentation ... correct (the name of the rose)
by Anonymous Monk on Jul 07, 2015 at 20:57 UTC

    Adding to RichardK above: the byteorder string is not an integer value (hex or otherwise), it just gives the memory layout.

    Interesting that human thinking seems to be inherently little-endian itself: the "1" for "first" corresponds with the least significant byte. Or more likely, it acknowledges the ranking of integer types: small values are a subset of bigger integers' values, so each larger type extends the range. In this sense, the extra bytes are additional, hence enumerated in that order. But enough with Semiotics.

    Glibc headers for example come with the following definitions in /usr/include/endian.h:

    #define __LITTLE_ENDIAN 1234
    #define __BIG_ENDIAN    4321
    #define __PDP_ENDIAN    3412
    
    ... which allows for convenient testing a la #if __BYTE_ORDER == __LITTLE_ENDIAN; clearly this convention is not a perl quirk.
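
The same reading of the string as a memory layout can be checked from Perl itself. A sketch under stated assumptions: the helper name is hypothetical, and it assumes the string has one digit per byte of a native Perl integer ($Config{ivsize}). It builds an integer whose bytes, from least to most significant, hold 1, 2, 3, ... and reads them back in native memory order.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: rebuild the byteorder string from the native layout.
sub derived_byteorder {
    my $size  = $Config{ivsize};    # bytes in a native Perl integer: 4 or 8
    my $value = 0;

    # Least significant byte gets 1, the next gets 2, and so on.
    $value = ( $value << 8 ) | $_ for reverse 1 .. $size;

    # 'J' packs a native unsigned integer (UV), so the bytes come out in
    # the CPU's own order.
    return join '', unpack 'C*', pack( 'J', $value );
}

print "derived: ", derived_byteorder(), "\n";
print "Config:  $Config{byteorder}\n";
```

On the platforms discussed in this thread the two lines would be expected to match, but treat that as something to verify on your own build rather than a guarantee.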

Node Type: perlquestion [id://1133435]
Approved by GrandFather