Re: Byte counts and Seek function

by chromatic (Archbishop)
on Aug 27, 2013 at 21:59 UTC

in reply to Byte counts and Seek function

You're in for a world of pain if you try to mix byte counts with UTF-8, because a single character (code point) may be encoded as more than one byte. seek doesn't take variable-width encodings into account. It only counts bytes.
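
To illustrate the mismatch, here is a minimal sketch (not the OP's code; the sample string is made up) that compares the character count of a decoded string with the byte count of its UTF-8 encoding, using Encode::encode:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample: "cafe" with an acute accent on the e (U+00E9).
my $chars = "caf\x{e9}";

# Encode the character string into UTF-8 bytes.
my $bytes = encode('UTF-8', $chars);

my $char_count = length $chars;   # length of the decoded string, in characters
my $byte_count = length $bytes;   # length of the encoded string, in bytes

print "characters: $char_count, bytes: $byte_count\n";
```

The two counts differ because U+00E9 takes two bytes in UTF-8. An offset computed from character lengths will therefore drift away from the byte offset that seek expects as soon as the data contains any non-ASCII character.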

(I don't know what your utf8 function does, so I can't comment on what your call to encode does.)

Seems to me that it would be easier to call tell (I originally wrote pos) when you read in a sentence and keep that position around, rather than try to reconstruct it from the data you've read (and decoded, possibly normalized, et cetera).
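
A rough sketch of that approach, assuming one sentence per line and a scratch file created just for the demonstration (the sample data and variable names are hypothetical, not taken from the OP's code). It records the value of tell before each read and later hands the same value back to seek:

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical sample data: one "sentence" per line, some with non-ASCII characters.
my @sentences = ("na\x{ef}ve caf\x{e9}", "plain ascii", "\x{263a} smile");

# Write the sample to a temporary file as UTF-8.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':encoding(UTF-8)';
print {$tmp} "$_\n" for @sentences;
close $tmp or die "close: $!";

open my $in, '<:encoding(UTF-8)', $path or die "open $path: $!";

# Remember the byte offset at which each sentence starts.
my @offsets;
while (1) {
    my $offset = tell $in;          # byte position before reading the line
    my $line   = <$in>;
    last unless defined $line;
    push @offsets, $offset;
}

# Jump straight back to the third sentence using the saved offset.
seek $in, $offsets[2], SEEK_SET or die "seek: $!";
my $again = <$in>;
chomp $again;

printf "sentence %d starts at byte %d\n", $_ + 1, $offsets[$_] for 0 .. $#offsets;
close $in;
```

Because tell and seek both work in bytes even when the handle has an :encoding layer, the saved offsets stay valid no matter how the text was decoded or normalized after reading.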

Re^2: Byte counts and Seek function
by choroba (Bishop) on Aug 27, 2013 at 22:48 UTC
    Are you sure you would use pos? I always thought seek should be used with tell.
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Yes, you're right. I was thinking of fgetpos in C for some reason (and even there I'd use ftell, so I don't know what I was thinking at all).

Re^2: Byte counts and Seek function
by AnomalousMonk (Chancellor) on Aug 27, 2013 at 22:40 UTC

    utf8 (emphases added):

    utf8 - Perl pragma to enable/disable UTF-8 (or UTF-EBCDIC) in source
    The "use utf8" pragma tells the Perl parser to allow UTF-8 in the
    program text in the current lexical scope ...

      That's the utf8 pragma. I know what it does in the posted code: nothing, because there are no non-ASCII characters appearing literally in the source code.

      What does the utf8 function in the OP's code do?

        Oops. Visually scanned for it, but didn't see the utf8 function call the first time through. Should have used a highlighting finder!   (Damned human eyes...)
