PerlMonks  

Re^3: Alternative to bytes::length()

by assemble (Friar)
on Dec 23, 2009 at 14:15 UTC


in reply to Re^2: Alternative to bytes::length()
in thread Alternative to bytes::length()

Why would they eliminate the bytes pragma? What about those of us who aren't always manipulating character data and actually do care about the bytes themselves?


Re^4: Alternative to bytes::length()
by ikegami (Pope) on Dec 23, 2009 at 15:02 UTC

    Strings can contain bytes. You don't have to do anything special to work with bytes. use bytes; has nothing to do with manipulating bytes.

    If you need to manipulate the internal string format to optimize or to work with some buggy XS,
    you want utf8::upgrade or utf8::downgrade.
    If you need to encode to UTF-8 or decode from UTF-8,
    you want utf8::encode, Encode::encode, utf8::decode or Encode::decode.
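
To make the distinction concrete, here is a minimal sketch (the variable names and sample string are made up for illustration). It suggests that utf8::upgrade and utf8::downgrade only switch the internal storage format and leave the string's value alone, while utf8::encode and utf8::decode actually change what the string contains.

```perl
use strict;
use warnings;

my $s = "caf\xE9";                 # 4 characters, all below 256

# Changing the internal format: the value stays the same.
my $up = $s;
utf8::upgrade($up);                # now stored as UTF-8 internally
my $same_value  = ($up eq $s);     # true: still the same 4 characters
my $same_length = (length($up) == length($s));
my $flag_on     = utf8::is_utf8($up);

utf8::downgrade($up);              # back to one byte per character
my $flag_off    = !utf8::is_utf8($up);

# Encoding: the value itself changes.
my $enc = $s;
utf8::encode($enc);                # "\xE9" becomes the two bytes "\xC3\xA9"
my $enc_len = length($enc);        # one longer than length($s)

utf8::decode($enc);                # back to the original 4 characters
my $round_trip = ($enc eq $s);

print "upgrade kept the value\n"   if $same_value && $same_length;
print "encode changed the value\n" if $enc_len != length($s);
```

None of this needs use bytes; since the utf8:: functions are always available without loading anything.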

    The person probably wants to eliminate it because of that very misconception you expressed. But don't worry, if anything is ever done, it would still be available on CPAN.

      I'm talking more about situations where I'm manipulating binary data, and I don't want Perl even looking at the data and trying to guess what it is.

      Instead of going through every single record in the file and unpacking the whole thing, it is often more efficient to use substr to get the few bytes I actually care about, and work with those. It would be similar to working with an actual character array in C.

      Reading through bytes gives me the impression that Perl will try to figure out what kind of string I've got based on what's in it & where it came from unless I tell it otherwise.

        I'm talking more about situations where I'm manipulating binary data, and I don't want Perl even looking at the data and trying to guess what it is.

        Me too. Perl never guesses "what it is". It has no way of doing that.

        Instead of going through every single record in the file and unpacking the whole thing, it is often more efficient to use substr to get the few bytes I actually care about, and work with those.

        You don't need use bytes; to do that. It does nothing.
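
As a rough sketch of that substr approach on binary data, assuming a hypothetical 10-byte record layout (a 4-byte tag, a 32-bit big-endian id, a 16-bit big-endian count) and an in-memory file standing in for a real one: the records are read through the :raw layer and a single field is picked out with substr, with no use bytes; anywhere.

```perl
use strict;
use warnings;

# Hypothetical record layout, for illustration only:
#   4-byte tag, 32-bit big-endian id, 16-bit big-endian count.
my $file = pack('a4 N n', 'REC1', 0xDEADBEEF, 513) x 3;

# Read from an in-memory "file"; :raw means no layers touch the bytes.
open my $fh, '<:raw', \$file or die "open: $!";

my @ids;
while (read($fh, my $rec, 10) == 10) {
    # Take only the 4 bytes of interest instead of unpacking the whole record.
    push @ids, unpack 'N', substr($rec, 4, 4);
}
close $fh;

printf "%08X\n", $_ for @ids;
```

The id bytes here are all above 0x7F, so this also exercises values outside the ASCII range.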

        use Test::More tests => 4;

        my $bin = join '', map chr, 0..255;

        utf8::downgrade $bin;  # One internal format
        no bytes;  is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'no bytes, UTF8=0');
        use bytes; is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'use bytes, UTF8=0');

        utf8::upgrade $bin;  # Other internal format
        no bytes;  is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'no bytes, UTF8=1');
        use bytes; is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'use bytes, UTF8=1');

        1..4
        ok 1 - no bytes, UTF8=0
        ok 2 - use bytes, UTF8=0
        ok 3 - no bytes, UTF8=1
        ok 4 - use bytes, UTF8=1

        Hmm, I expected the last test to fail. I have some details about bytes wrong. It might be less harmful than I thought, just more useless.
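
One likely reason the last test passes is that offset 100 falls in the ASCII range, where the upgraded (UTF-8) internal format still uses one byte per character. The sketch below is not part of the original test. It repeats the experiment at offset 200, where each character takes two bytes internally, and suggests that use bytes; then returns pieces of the internal buffer rather than the original byte values.

```perl
use strict;
use warnings;
use bytes ();                           # load bytes::length without enabling the pragma

my $bin = join '', map chr, 0 .. 255;   # 256 characters, one per byte value
utf8::upgrade($bin);                    # force the UTF8=1 internal format

my ($chars, $raw);
{
    no bytes;
    $chars = substr($bin, 200, 2);      # character semantics
}
{
    use bytes;
    $raw = substr($bin, 200, 2);        # byte semantics on the internal buffer
}

printf "no bytes:  %s\n", join ' ', map { sprintf '%02X', ord } split //, $chars;
printf "use bytes: %s\n", join ' ', map { sprintf '%02X', ord } split //, $raw;
printf "length=%d bytes::length=%d\n", length($bin), bytes::length($bin);
```

Without the pragma the result is the expected "\xC8\xC9". Under use bytes; the same call yields "\xC2\xA4", the UTF-8 encoding of character 164, so on an upgraded string the pragma can silently return the wrong data.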
