PerlMonks  

Re^4: Alternative to bytes::length()

by ikegami (Patriarch)
on Dec 23, 2009 at 15:02 UTC


in reply to Re^3: Alternative to bytes::length()
in thread Alternative to bytes::length()

Strings can contain bytes. You don't have to do anything special to work with bytes. use bytes; has nothing to do with manipulating bytes.

If you need to manipulate the internal string format to optimize or to work with some buggy XS,
you want utf8::upgrade or utf8::downgrade.
If you need to encode to UTF-8 or decode from UTF-8,
you want utf8::encode, Encode::encode, utf8::decode or Encode::decode.
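
To make the distinction concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration. It suggests that utf8::upgrade only switches the internal storage format and leaves the string's value alone, while utf8::encode produces a different string holding the UTF-8 bytes.

```perl
use strict;
use warnings;

# Illustrative sample: 4 characters, the last one is U+00E9.
my $str = "caf\xE9";

# Changing the internal format: same characters, same length.
my $upgraded = $str;
utf8::upgrade($upgraded);
printf "upgraded: length=%d, same value=%s, UTF8 flag=%s\n",
    length($upgraded),
    ($upgraded eq $str       ? 'yes' : 'no'),
    (utf8::is_utf8($upgraded) ? 'on'  : 'off');

# Encoding: a different string, one element per UTF-8 byte.
my $encoded = $str;
utf8::encode($encoded);
printf "encoded:  length=%d, hex=%s\n",
    length($encoded), unpack('H*', $encoded);

# Decoding reverses the encoding and gives back the original text.
my $decoded = $encoded;
utf8::decode($decoded);
printf "decoded:  length=%d, same value=%s\n",
    length($decoded), ($decoded eq $str ? 'yes' : 'no');
```

Neither call needs use bytes;, and the utf8:: functions are available without loading the utf8 pragma.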

The person probably wants to eliminate it because of that very misconception you expressed. But don't worry, if anything is ever done, it would still be available on CPAN.

Replies are listed 'Best First'.
Re^5: Alternative to bytes::length()
by assemble (Friar) on Dec 23, 2009 at 15:47 UTC
    I'm talking more about situations where I'm manipulating binary data, and I don't want Perl even looking at the data and trying to guess what it is.

    Instead of going through every single record in the file and unpacking the whole thing, it is often more efficient to use substr to get the few bytes I actually care about, and work with those. It would be similar to working with an actual character array in C.

    Reading through bytes gives me the impression that Perl will try to figure out what kind of string I've got based on what's in it & where it came from unless I tell it otherwise.

      I'm talking more about situations where I'm manipulating binary data, and I don't want Perl even looking at the data and trying to guess what it is.

      Me too. Perl never guesses "what it is". It has no way of doing that.

      Instead of going through every single record in the file and unpacking the whole thing, it is often more efficient to use substr to get the few bytes I actually care about, and work with those.

      You don't need use bytes; to do that. It does nothing.

      use Test::More tests => 4;

      my $bin = join '', map chr, 0..255;

      utf8::downgrade $bin;  # One internal format
      no bytes;  is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'no bytes, UTF8=0');
      use bytes; is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'use bytes, UTF8=0');

      utf8::upgrade $bin;    # Other internal format
      no bytes;  is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'no bytes, UTF8=1');
      use bytes; is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'use bytes, UTF8=1');

      1..4
      ok 1 - no bytes, UTF8=0
      ok 2 - use bytes, UTF8=0
      ok 3 - no bytes, UTF8=1
      ok 4 - use bytes, UTF8=1

      hum, I expected the last to fail. I have some details about bytes wrong. It might be less harmful than I thought, just more useless.
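
A possible explanation for the fourth test passing, offered as a sketch and not as the poster's own analysis: offsets 100..104 lie in the ASCII range, where both internal formats store one byte per character, so that slice cannot tell the two apart. Past offset 127 an upgraded string stores two bytes per character, and a slice taken under use bytes; would be expected to diverge. The helpers slice_chars and slice_bytes are hypothetical names used only here, and the byte values assume an ASCII platform.

```perl
use strict;
use warnings;

# Same 256-character string as in the test above, in the upgraded format.
my $bin = join '', map chr, 0..255;
utf8::upgrade($bin);

# Hypothetical helper: ordinary substr, which counts characters.
sub slice_chars {
    my ($s, $off, $len) = @_;
    return substr($s, $off, $len);
}

# Hypothetical helper: substr under the bytes pragma, which counts
# bytes of the internal buffer. The pragma is lexically scoped to this sub.
sub slice_bytes {
    my ($s, $off, $len) = @_;
    use bytes;
    return substr($s, $off, $len);
}

# In the ASCII range both views agree.
printf "offset 100: chars=%s bytes=%s\n",
    unpack('H*', slice_chars($bin, 100, 5)),
    unpack('H*', slice_bytes($bin, 100, 5));

# Past 127 the byte view lands inside the UTF-8 encoded buffer.
printf "offset 200: chars=%s bytes=%s\n",
    join('', map { sprintf '%02x', ord } split //, slice_chars($bin, 200, 2)),
    unpack('H*', slice_bytes($bin, 200, 2));
```

If that reading is right, the original test passes by choice of offset, and binary data is still best handled without use bytes;.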

        It might be less harmful than I thought

        I have tried to show you this here. Notice the different scope of the bytes pragma in examples 5 and 6.
