PerlMonks  

Re: How to print the actual bytes of UTF-8 characters ?

by choroba (Cardinal)
on Feb 06, 2014 at 15:01 UTC ( [id://1073715]=note )


in reply to How to print the actual bytes of UTF-8 characters ?

Using a variable as a file to handle the encodings:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use utf8;

    for my $char (qw(Ð Ñ Ò Ó)) {
        my $n = ord $char;
        open my $BYTE, '>:utf8', \ my $bytes;
        print {$BYTE} $char;
        printf "%s\t%s\t%x\t%b\t%x %x\t %b %b\n",
            $char, $n, $n, $n, (unpack('CC', $bytes)) x 2;
    }

Pivoting the table is left as an exercise for the reader.

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Re^2: How to print the actual bytes of UTF-8 characters ?
by RCH (Sexton) on Feb 06, 2014 at 16:18 UTC

    Magic!
    Could you explain how it works?
    I had tried a simple-minded unpack('C', $char), but it gave me the wrong answer.
    There are two things that I don't understand in your unpack solution:
    (1) what are the contents of $bytes, and
    (2) what is the function of the backslash "\" in

    open my $BYTE, '>:utf8', \ my $bytes;

    ?

      \ is the reference operator. Instead of using a file, I open the variable for output (see FILEHANDLE, MODE, REFERENCE in open). I set its encoding to UTF-8 and print the character to it. $bytes now contains the two bytes of the character as encoded in UTF-8.
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
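
A minimal sketch of the explanation above, not taken from the original posts: it opens a scalar as an in-memory file with the :utf8 layer, prints one character to it, and then inspects what ended up in the scalar. The names $fh and @hex are illustrative, and the character is written as the escape "\x{D0}" (assumed to stand for the Ð from the example) so the script does not depend on how the source file is saved. It also shows why a plain unpack('C', $char) looks wrong: it sees the character itself rather than its UTF-8 encoding.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# U+00D0 (Ð), written as an escape so the source file encoding is irrelevant.
my $char = "\x{D0}";

# "\ my $bytes" is a reference to a new scalar; open() treats it as a file
# held in memory, and the :utf8 layer encodes everything printed to it.
open my $fh, '>:utf8', \ my $bytes or die "Cannot open in-memory file: $!";
print {$fh} $char;
close $fh;

# One character went in, two bytes came out.
printf "characters in: %d, bytes out: %d\n", length($char), length($bytes);

# Hex dump of the encoded bytes.
my @hex = map { sprintf '%02x', $_ } unpack 'C*', $bytes;
print "bytes: @hex\n";    # prints "bytes: c3 90"

# unpack on the character string yields the code point, not the encoding.
printf "unpack 'C' on the character: %x\n", unpack('C', $char);
```

Closing the handle before reading $bytes is a cautious choice in this sketch; the original code reads the variable while the handle is still open.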
        \ is the reference operator

        Ah - of course
        Many thanks

Re^2: How to print the actual bytes of UTF-8 characters ?
by ikegami (Patriarch) on Feb 07, 2014 at 21:14 UTC
    open my $BYTE, '>:utf8', \my $bytes; print {$BYTE} $char;
    ?
    utf8::encode(my $bytes = $char);
    !
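
To illustrate the shorter alternative suggested here, the following sketch (an illustration under stated assumptions, not code from the thread) encodes the same character both ways and compares the results. utf8::encode modifies its argument in place, which is why the character is copied first. The names $encoded and $via_handle are illustrative, and "\x{D0}" is assumed to stand for the Ð used earlier.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# U+00D0 (Ð), written as an escape so the source file encoding is irrelevant.
my $char = "\x{D0}";

# utf8::encode converts the string to its UTF-8 bytes in place,
# so work on a copy and leave $char untouched.
utf8::encode(my $encoded = $char);

# The same encoding via an in-memory filehandle, for comparison.
open my $fh, '>:utf8', \ my $via_handle or die "Cannot open in-memory file: $!";
print {$fh} $char;
close $fh;

printf "utf8::encode: %s\n", join ' ', map { sprintf '%02x', $_ } unpack 'C*', $encoded;
printf "filehandle:   %s\n", join ' ', map { sprintf '%02x', $_ } unpack 'C*', $via_handle;
print "identical\n" if $encoded eq $via_handle;
```

For a single string already in memory, the utf8::encode form avoids creating a filehandle at all; the filehandle form may still be convenient when the data is being printed in pieces.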
