PerlMonks  

Re^5: Alternative to bytes::length()

by assemble (Friar)
on Dec 23, 2009 at 15:47 UTC


in reply to Re^4: Alternative to bytes::length()
in thread Alternative to bytes::length()

I'm talking more about situations where I'm manipulating binary data, and I don't want Perl even looking at the data and trying to guess what it is.

Instead of going through every single record in the file and unpacking the whole thing, it is often more efficient to use substr to get the few bytes I actually care about, and work with those. It would be similar to working with an actual character array in C.

Reading through the bytes documentation gives me the impression that Perl will try to figure out what kind of string I've got, based on what's in it and where it came from, unless I tell it otherwise.

Re^6: Alternative to bytes::length()
by ikegami (Patriarch) on Dec 23, 2009 at 16:04 UTC

    I'm talking more about situations where I'm manipulating binary data, and I don't want Perl even looking at the data and trying to guess what it is.

    Me too. Perl never guesses "what it is". It has no way of doing that.

    Instead of going through every single record in the file and unpacking the whole thing, it is often more efficient to use substr to get the few bytes I actually care about, and work with those.

    You don't need use bytes; to do that. It does nothing.

    use Test::More tests => 4;

    my $bin = join '', map chr, 0..255;

    utf8::downgrade $bin;  # One internal format
    no bytes;  is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'no bytes, UTF8=0');
    use bytes; is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'use bytes, UTF8=0');

    utf8::upgrade $bin;  # Other internal format
    no bytes;  is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'no bytes, UTF8=1');
    use bytes; is(substr($bin, 100, 5), "\x64\x65\x66\x67\x68", 'use bytes, UTF8=1');

    1..4
    ok 1 - no bytes, UTF8=0
    ok 2 - use bytes, UTF8=0
    ok 3 - no bytes, UTF8=1
    ok 4 - use bytes, UTF8=1

    Hmm, I expected the last to fail. I have some details about bytes wrong. It might be less harmful than I thought, just more useless.
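
One possible reason the fourth test passes is that characters 0x64 through 0x68 are ASCII, and ASCII characters occupy a single byte in either internal format, so the byte offset and the character offset coincide at position 100. The sketch below is an illustration rather than part of the original test: `slices_at` is a hypothetical helper, and offset 200 is chosen only because it falls among the characters at or above 0x80, which take two bytes each once the string is upgraded.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): take a 5-element slice of an
# upgraded 0..255 string, once with character semantics and once under
# the bytes pragma.
sub slices_at {
    my ($offset) = @_;

    my $bin = join '', map chr, 0..255;
    utf8::upgrade($bin);    # force the UTF8=1 internal format

    # Normal semantics: offsets count characters.
    my $chars = substr($bin, $offset, 5);

    # Under "use bytes", offsets count bytes of the internal buffer.
    # The pragma is lexically scoped to this do block.
    my $bytes = do { use bytes; substr($bin, $offset, 5) };

    return ($chars, $bytes);
}

for my $offset (100, 200) {
    my ($chars, $bytes) = slices_at($offset);
    printf "offset %3d: %s\n", $offset,
        $chars eq $bytes ? 'same' : 'different';
}
# prints:
# offset 100: same
# offset 200: different
```

At offset 200 the slice taken under `use bytes` is "\xC2\xA4\xC2\xA5\xC2", a fragment of the UTF-8 encoding of characters 0xA4 and 0xA5, not the characters 0xC8 through 0xCC that the plain substr returns.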

      It might be less harmful than I thought

      I have tried to show you this here. Notice the different scope of the bytes pragma in examples 5 and 6.

        I knew that use bytes; wasn't harmful in all circumstances. I never said it was, so I don't see why you're saying you were trying to show me it wasn't harmful in that circumstance.

        I backed up everything I said with tests, so it must have been irrelevant to the discussion. I'll review the discussion to make sure, and I'll make a post with my findings.
