
Re^7: Understanding pack and unpack changes for binary data between 5.8 and 5.10

by squentin (Sexton)
on Mar 13, 2009 at 21:26 UTC (#750525)


in reply to Re^6: Understanding pack and unpack changes for binary data between 5.8 and 5.10
in thread Understanding pack and unpack changes for binary data between 5.8 and 5.10

Ok, I'll try to be clear this time :)
What I wanted was to write the string encoded in utf8, along with the length, in bytes, of the binary string resulting from pack. So I was using:

my $p=pack "V/a*", $s;
my $l=length $p;

When I should have been using:

use Encode qw/encode/;
use bytes (); # loads bytes::length without turning on byte semantics
my $p=pack "V/a*", encode('utf8',$s);
# using bytes::length just to be sure: $p shouldn't have its utf8 flag on, but in case it does...
my $l=bytes::length $p;
Thinking about it a little more, I think what is disturbing me is that the 'a' in the pack format can be a multi-byte character. And, more generally, the idea that utf8 strings are strings of multi-byte characters rather than strings of bytes in utf8 encoding.
Perl 5.10's pack behavior does seem to make more sense now.
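
To make the difference concrete, here is a small sketch comparing the two approaches on Perl 5.10 or later. The sample string and the variable names are made up for illustration; they are not from the code discussed in the thread.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample: 6 characters, 9 bytes once encoded as UTF-8.
my $s = "caf\x{e9} \x{263A}";

# 5.10+ semantics: 'a*' copies characters, so the V prefix counts characters.
my $p_chars = pack 'V/a*', $s;
my ($n_chars) = unpack 'V', $p_chars;

# Encode first: 'a*' now sees one character per byte, so V counts bytes.
my $p_bytes = pack 'V/a*', encode('UTF-8', $s);
my ($n_bytes) = unpack 'V', $p_bytes;

printf "unencoded: prefix=%d length=%d\n", $n_chars, length $p_chars;
printf "encoded:   prefix=%d length=%d\n", $n_bytes, length $p_bytes;
```

On 5.10+ I would expect the unencoded version to report a prefix of 6 and a length of 10 (both in characters), and the encoded version a prefix of 9 and a length of 13 (both in bytes). Only the second matches what actually ends up in a binary file.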


Re^8: Understanding pack and unpack changes for binary data between 5.8 and 5.10
by ikegami (Pope) on Mar 13, 2009 at 21:44 UTC

    I think what is disturbing me is that the 'a' in the pack format can be a multi-bytes character.

    Me too. You've gotta wonder what's going to happen more often: someone wanting to pack non-encoded characters, or someone accidentally packing non-encoded characters. I would say the latter, so I find it weird that it doesn't croak ("Wide char in ...") when passed non-encoded characters.

    It could be a side effect of allowing pack and unpack to work with fixed-width fields, where the width is in characters rather than bytes.

    my $rec_format = 'a4a5a1';
    my $rec_size = 10;

    binmode $fh_out, ':encoding(UTF-8)';
    print $fh_out pack($rec_format, @fields);
    ...
    binmode $fh_in, ':encoding(UTF-8)';
    read($fh_in, my $rec = '', $rec_size);
    @fields = unpack($rec_format, $rec);
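
A self-contained version of the same idea, as a rough sketch: it uses an in-memory filehandle in place of a real file, and the field values are invented for the example. It assumes Perl 5.10 or later, where pack and unpack count 'a' widths in characters.

```perl
use strict;
use warnings;

my $rec_format = 'a4a5a1';
my $rec_size   = 10;    # 4 + 5 + 1 characters, not bytes

# Hypothetical field values containing non-ASCII characters.
my @fields = ("\x{e9}t\x{e9}s", "na\x{ef}ve", "\x{263A}");

# Write one record; the :encoding layer turns characters into UTF-8 bytes.
my $buf = '';
open my $fh_out, '>:encoding(UTF-8)', \$buf or die "open for write: $!";
print {$fh_out} pack($rec_format, @fields);
close $fh_out or die "close: $!";

# Read it back; with the layer in place, read() counts characters.
open my $fh_in, '<:encoding(UTF-8)', \$buf or die "open for read: $!";
my $rec = '';
my $got = read($fh_in, $rec, $rec_size);
close $fh_in;

my @back = unpack($rec_format, $rec);

printf "characters read: %d, bytes stored: %d\n", $got, length $buf;
```

The record is always 10 characters wide, but the number of bytes stored depends on the data (15 for these sample fields), so this only works when both ends go through the same encoding layer.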
