Re^3: Capture a non-printable char and test what it is

by cavac (Parson)
on May 24, 2022 at 14:56 UTC ( [id://11144160]=note )


in reply to Re^2: Capture a non-printable char and test what it is
in thread Capture a non-printable char and test what it is

If I look at the ascii table

then you only see a part of the non-printable characters available on modern computer systems. Unicode has more control characters, emoji skin tone modifiers, the right-to-left mark and a host of other stuff that is unprintable on its own.
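A minimal sketch of how a captured character could be tested against Unicode properties rather than the ASCII table. The helper name `classify_char` and its category labels are made up for illustration; the property names (`\p{Cc}`, `\p{Cf}`, `\p{Cn}`, `\p{Print}`) are the ones Perl documents in perluniprops. Note that format characters such as the right-to-left mark still match `\p{Print}`, so they are checked first.

```perl
use strict;
use warnings;
use charnames ();    # only for charnames::viacode()

# Hypothetical helper: classify one character by Unicode general category.
sub classify_char {
    my ($ch) = @_;
    return 'control'    if $ch =~ /\A\p{Cc}\z/;     # C0/C1 controls, e.g. ESC
    return 'format'     if $ch =~ /\A\p{Cf}\z/;     # invisible format chars, e.g. U+200F
    return 'unassigned' if $ch =~ /\A\p{Cn}\z/;     # no character assigned (yet)
    return 'printable'  if $ch =~ /\A\p{Print}\z/;  # everything else that is visible or blank
    return 'other';
}

# Print only ASCII (code point, label, name), so no output layer is needed.
for my $ch ("A", "\x{1B}", "\x{200F}") {
    my $name = charnames::viacode(ord($ch)) // '(no name)';
    printf "U+%04X  %-10s  %s\n", ord($ch), classify_char($ch), $name;
}
```

This only looks at single code points; anything read from a terminal or file still has to be decoded first (for example with `-C` or an `:encoding(UTF-8)` layer) before the properties mean anything.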

As an additional bonus, the same character on screen can sometimes be encoded in Unicode in multiple ways; see Unicode equivalence.
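To make the equivalence point concrete, here is a small sketch using the core module Unicode::Normalize. The helper `canonically_equal` is a name invented for this example; it simply compares the NFC forms of both strings.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

my $precomposed = "\x{E9}";      # LATIN SMALL LETTER E WITH ACUTE, one code point
my $decomposed  = "e\x{301}";    # "e" followed by COMBINING ACUTE ACCENT

# Hypothetical helper: true if both strings are canonically equivalent.
sub canonically_equal {
    my ($x, $y) = @_;
    return NFC($x) eq NFC($y);
}

printf "code points: %d vs %d\n", length($precomposed), length($decomposed);
printf "raw eq:      %s\n", ($precomposed eq $decomposed ? 'yes' : 'no');
printf "NFC eq:      %s\n", (canonically_equal($precomposed, $decomposed) ? 'yes' : 'no');
```

Both strings render as the same "é", yet a plain `eq` sees them as different; normalizing input once, right after decoding, avoids surprises in later comparisons.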

Unfortunately, input processing has gotten a tad more complex since the world gave up on ye olde ASCII table. On the bright side, these days more than the 20% of world population of the old ASCII days can now type their name into a computer with a reasonable expectation that it will be processed correctly.

perl -e 'use Crypt::Digest::SHA256 qw[sha256_hex]; print substr(sha256_hex("the Answer To Life, The Universe And Everything"), 6, 2), "\n";'

Re^4: Capture a non-printable char and test what it is
by kcott (Archbishop) on May 25, 2022 at 06:34 UTC

    I generally agree with everything you've written there; however, as a minor nitpick, those skin tone modifiers can be printed in isolation. I'm not sure how this will render on different browsers, but on my terminal:

    $ perl -C -E '
        say for
            "\N{U+1F3FB}",
            "\N{U+1F3FC}",
            "\N{U+1F3FD}",
            "\N{U+1F3FE}",
            "\N{U+1F3FF}"
    '
    🏻
    🏼
    🏽
    🏾
    🏿
    

    And, in a preview, that looks fine on my Firefox v100.0.2 — YMMV.

    — Ken

      You are right, those are printable when used standalone. Which makes them sometimes-printable-characters. Great, another exception that has to be handled when working with text.
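      One way to see the "sometimes-printable" behaviour is to count user-perceived characters with `\X` (an extended grapheme cluster). This is only a sketch: `grapheme_count` is a made-up helper, and how a base emoji plus modifier is clustered depends on the Unicode version shipped with the Perl in use, so the combined case may differ on older Perls.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: number of extended grapheme clusters in a string.
      sub grapheme_count {
          my ($str) = @_;
          my $count = () = $str =~ /\X/g;
          return $count;
      }

      my $modifier_alone = "\x{1F3FD}";            # EMOJI MODIFIER FITZPATRICK TYPE-4
      my $thumbs_up_tone = "\x{1F44D}\x{1F3FD}";   # THUMBS UP SIGN followed by the modifier

      # Only numbers are printed, so no UTF-8 output layer is required.
      printf "alone:    %d code point(s), %d cluster(s)\n",
          length($modifier_alone), grapheme_count($modifier_alone);
      printf "combined: %d code point(s), %d cluster(s)\n",
          length($thumbs_up_tone), grapheme_count($thumbs_up_tone);
      ```

      Standalone, the modifier is its own cluster and shows up as a colour swatch; after a suitable base emoji it is meant to merge into that emoji instead of being drawn separately.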

      You are in the hallways of the text processing convention. To the south, you see someone selling T-Shirts, to the north is the building exit. The entrance to the lecture hall is to the west.
      > complain about unicode
      cavac raises his fist to the gods and shouts "UNICODE!!!". Höðr shoots cavac in the buttocks with a mistletoe arrow.

