PerlMonks  

Re^3: Match full utf-8 characters

by Allasso (Monk)
on Apr 29, 2019 at 13:36 UTC ( [id://1233116]=note )


in reply to Re^2: Match full utf-8 characters
in thread Match full utf-8 characters

...Although, I find this works when I collect user input with the construct:
my $user_input = <STDIN>;
but does not work when I collect it with:
my $stdin = new IO::Handle;
$stdin->fdopen( fileno( STDIN ), "r" ) || die "Cannot open STDIN";
while ( my $char = $stdin->getc() ) {
}
and I need to use hippo's suggestion below. In each iteration, $char is a byte. Is there a way to coerce $char to be utf8?

Replies are listed 'Best First'.
Re^4: Match full utf-8 characters
by Anonymous Monk on Apr 29, 2019 at 13:55 UTC

    That's a very convoluted way of achieving something that's already done for you:

    $ perl -E'say STDIN->can("getc")'

    $ perl -MIO::Handle -E'say STDIN->can("getc")'
    CODE(0x55dddd02b150)

    I.e. if IO::Handle is loaded, STDIN is already an object resembling IO::Handle and you can call the getc method on it.

    But for the sake of the exercise, you should be able to pass "<:utf8" instead of "r" to fdopen and have getc return Unicode code points again. (untested)

      You're probably way above my head here, but when using STDIN->can("getc") it does not wait for user input, which is what I am doing with my construct.

      Also, attempting to use "<:utf8" instead of "r" gives an "IO::Handle: bad open mode..." error.

      I also tried moving the binmode(STDIN, ':utf8') call to after opening the filehandle, as recommended here: https://perldoc.perl.org/perlfunc.html#binmode-FILEHANDLE%2c-LAYER -- but that did not help.

        STDIN->can("getc") being defined means that you can call the method: STDIN->getc().
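The difference between looking a method up and calling it can be sketched roughly as follows. This is only an illustration: the in-memory handle $in and the variable names are assumptions standing in for a real terminal, so that nothing blocks waiting for input.

```perl
use strict;
use warnings;
use IO::Handle;

# Assumption for illustration: an in-memory handle stands in for STDIN.
my $data = "ab";
open( my $in, '<', \$data ) or die "Cannot open in-memory handle: $!";

# can() only looks the method up; it does not read from any handle.
my $method = STDIN->can('getc');
print "getc is available\n" if ref($method) eq 'CODE';

# Calling the method is what actually consumes input.
my $first  = $in->getc;        # ordinary method call
my $second = $method->($in);   # same method, via the code reference
print "read '$first' then '$second'\n";
```

With a real terminal, STDIN->getc would be the call that waits for a keypress; STDIN->can("getc") never does.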

        Sorry about giving you misleading information on the "mode" parameter of IO::Handle::fdopen. From my understanding of the documentation (For the documentation of the "open" method, see IO::File. ... If "IO::File::open" is given a mode that includes the ":" character, it passes all the three arguments to the three-argument "open" operator.), it should have worked.
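If a separate IO::Handle really is wanted, one possible workaround (a sketch, not something tested in this thread) is to keep "r" as the fdopen mode and add the decoding layer afterwards with the handle's binmode method. The temporary file below is an assumption that stands in for STDIN so the snippet runs unattended.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Assumption for illustration: a temp file of UTF-8 bytes replaces STDIN.
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
binmode($tmp);
print {$tmp} "\xC3\xA9\xE2\x82\xAC";    # bytes for U+00E9 and U+20AC
close($tmp) or die "Cannot close $path: $!";
open( my $src, '<', $path ) or die "Cannot reopen $path: $!";

# fdopen() gets a plain mode; the layer is pushed in a second step.
my $in = IO::Handle->new;
$in->fdopen( fileno($src), 'r' ) or die "Cannot fdopen: $!";
$in->binmode(':encoding(UTF-8)') or die "binmode failed: $!";

my @chars;
# defined() keeps a literal "0" from ending the loop early.
while ( defined( my $char = $in->getc ) ) {
    push @chars, $char;
}
printf "U+%04X\n", ord($_) for @chars;
```

Each getc call should then return one decoded character rather than one byte.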

        I also tried moving binmode(STDIN, ':utf8') ... but that did not help.

        That is because you don't use the STDIN file handle, but create another one with the UTF-8 decoding layer stripped. Try setting ":utf8" binmode on STDIN, then calling STDIN->getc (or just getc, since STDIN is chosen by default) in a loop.
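A minimal sketch of that loop might look like the following. The helper name read_chars and the in-memory handle are assumptions made for illustration; with a real terminal the same two steps would be binmode(STDIN, ...) followed by getc(STDIN) in the loop. The stricter ":encoding(UTF-8)" layer is used here in place of ":utf8".

```perl
use strict;
use warnings;

# Hypothetical helper: read any handle one character at a time.
sub read_chars {
    my ($fh) = @_;
    my @chars;
    # defined() keeps a literal "0" from ending the loop early.
    while ( defined( my $char = getc($fh) ) ) {
        push @chars, $char;
    }
    return @chars;
}

# Assumption for illustration: UTF-8 encoded bytes in memory replace STDIN.
my $bytes = "h\xC3\xA9llo 0 \xE2\x82\xAC\n";
open( my $in, '<', \$bytes ) or die "Cannot open in-memory handle: $!";
binmode( $in, ':encoding(UTF-8)' ) or die "binmode failed: $!";

my @chars = read_chars($in);
printf "%d bytes in, %d characters out\n", length($bytes), scalar @chars;
```

Because the layer sits on the handle that getc reads from, multi-byte sequences come back as single characters.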
