http://www.perlmonks.org?node_id=1195588

mje has asked for the wisdom of the Perl Monks concerning the following question:

I have just tested some code on Perl 5.24.2 which was running happily on 5.16.0 and got a warning of "sysread() is deprecated on :utf8 handles". Now I'm wondering how to fix it and I'm a little confused by the discussion at RFC remove strange behaviour of sysread()/syswrite() on UTF-8 streams.

The file handle in question is a unidirectional pipe created before a fork. The parent writes XML down the pipe, which has an :encoding(UTF-8) layer, and the child uses sysread on the read end of the pipe, which also has an :encoding(UTF-8) layer. The child then sends the data down a socket that also has an :encoding(UTF-8) layer on it.

I'm slightly confused by the :utf8 and encoding("UTF-8"). I'm doing the latter but the RFC seems to indicate there is a difference between :utf8 and encoding("UTF-8") and that sysread works with :utf8 but not encoding(something else)? So why do I get the deprecated warning?

Is the solution simply to leave the encoding on the write end of the pipe which uses print to send messages down the pipe, but leave it off the read end of the pipe and decode the read message as UTF8 before sending it down the onwards socket?

CORRECTION: the connection between the parent and child is a socket although I don't think it makes any difference

UPDATE1: It appears I can't just use :raw on the socket read end and decode to UTF-8 as I might have read part of a UTF-8 sequence. Still searching for an answer.

UPDATE2: The code was originally using readline but that was switched to sysread because the code already uses IO::Select and you cannot mix buffered reading with IO::Select

UPDATE3: This link explains why you cannot use buffered IO and IO::Select https://stackoverflow.com/questions/7349124/is-it-ever-safe-to-combine-select2-and-buffered-io-for-file-handles#7352605

Replies are listed 'Best First'.
Re: sysread() is deprecated on :utf8 handles
by ikegami (Patriarch) on Jul 21, 2017 at 03:19 UTC

    It makes no sense to use sysread on a file handle with the :utf8 layer because the read might have ended mid-character.

    sysread-using code should look something like this:

        binmode($fh);   # Or open with :raw

        my $buf = '';
        while (1) {   # Or this could be a select loop.
            # Append to whatever is already buffered.
            my $rv = sysread($fh, $buf, BLOCK_SIZE, length($buf));
            die($!) if !defined($rv);
            last if !$rv;

            # Identify and extract message.
            # Using LF-terminated messages in this example.
            while ($buf =~ s/^([^\n]*)\n//) {
                my $msg = $1;
                process_message($msg);
            }
        }

        die("Premature EOF") if length($buf);
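If the messages need to be decoded rather than just forwarded, one option is to decode each message only after it has been extracted whole. The following is a minimal sketch, not code from the thread: `extract_messages` is a hypothetical helper, the pipe stands in for the real socket, and `BLOCK_SIZE` is set to a tiny value purely so that reads split multi-byte characters.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

use constant BLOCK_SIZE => 4;   # deliberately tiny so reads split characters

# Hypothetical helper: remove every complete LF-terminated message from the
# byte buffer, decode each one, and leave any trailing partial message behind.
sub extract_messages {
    my ($buf_ref) = @_;
    my @msgs;
    while ($$buf_ref =~ s/^([^\n]*)\n//) {
        # LF (0x0A) never occurs inside a multi-byte UTF-8 sequence, so a
        # complete line always ends on a character boundary.
        my $line = $1;   # copy: decode() with a CHECK value may modify its argument
        push @msgs, decode('UTF-8', $line, Encode::FB_CROAK);
    }
    return @msgs;
}

# Demo: a pipe stands in for the socket between parent and child.
pipe(my $rd, my $wr) or die "pipe: $!";
binmode($rd);
binmode($wr);
defined(syswrite($wr, encode('UTF-8', "caf\x{e9}\nna\x{ef}ve\n")))
    or die "syswrite: $!";
close($wr);

my $buf = '';
my @received;
while (1) {
    my $rv = sysread($rd, $buf, BLOCK_SIZE, length($buf));
    die "sysread: $!" if !defined($rv);
    last if !$rv;
    push @received, extract_messages(\$buf);
}
die "Premature EOF" if length($buf);
```

Because decoding happens per message, a read that ends mid-character only delays the message until the rest of its bytes arrive.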

    In your case, you just want to blindly forward everything you receive, so simply avoid decoding or encoding!

        binmode($fh_in);    # Or open with :raw
        binmode($fh_out);   # Or open with :raw

        while (1) {   # Or this could be a select loop.
            # No offset needed: each block is forwarded as soon as it is read.
            my $rv = sysread($fh_in, my $buf, BLOCK_SIZE);
            die($!) if !defined($rv);
            last if !$rv;

            print($fh_out $buf);
        }
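Since the original code uses IO::Select, the same forwarding idea can be sketched as a select loop. This is an illustration under assumptions: `forward_raw` and `write_all` are hypothetical helpers, the `BLOCK_SIZE` value is arbitrary, and a socketpair plus a pipe stand in for the real parent/child socket and the onward socket. It uses syswrite for output so that no buffered I/O is involved on either handle.

```perl
use strict;
use warnings;
use IO::Select;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

use constant BLOCK_SIZE => 64 * 1024;   # arbitrary size chosen for this sketch

# Hypothetical helper: write all of $data with syswrite, coping with short writes.
sub write_all {
    my ($fh, $data) = @_;
    my $off = 0;
    while ($off < length($data)) {
        my $n = syswrite($fh, $data, length($data) - $off, $off);
        die "syswrite: $!" if !defined($n);
        $off += $n;
    }
    return;
}

# Hypothetical helper: copy raw bytes from $in to $out until EOF on $in.
sub forward_raw {
    my ($in, $out) = @_;
    binmode($in);    # no decoding layer on the read end
    binmode($out);   # no encoding layer on the write end
    my $sel = IO::Select->new($in);
    while (1) {
        for my $fh ($sel->can_read) {
            my $rv = sysread($fh, my $buf, BLOCK_SIZE);
            die "sysread: $!" if !defined($rv);
            return if !$rv;             # EOF
            write_all($out, $buf);      # bytes pass through untouched
        }
    }
}

# Demo: socketpair = parent/child link, pipe = stand-in for the onward socket.
socketpair(my $parent, my $child, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
pipe(my $out_rd, my $out_wr) or die "pipe: $!";
binmode($out_rd);

my $xml = "<msg>caf\xc3\xa9</msg>\n";   # already-encoded UTF-8 bytes
write_all($parent, $xml);
shutdown($parent, 1);                    # no more writes: child will see EOF

forward_raw($child, $out_wr);
close($out_wr);
my $copied = do { local $/; <$out_rd> };
```

The bytes arriving on the output handle are identical to the bytes sent, so it does not matter where in a character a read happens to end.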

      Thanks ikegami. Some of the other replies ignore the fact that you cannot mix any buffered IO with select, and as I noted in my updates, I was aware that a buffer returned by sysread might end with part of a UTF-8 sequence. However, I got a bit lost in the issue and missed the obvious point, which as you say is that the code doesn't really care what it reads from the socket because it is simply passing it on.

Re: sysread() is deprecated on :utf8 handles
by zentara (Archbishop) on Jul 20, 2017 at 11:46 UTC
    There seem to be similar bug reports containing the phrase "sysread() is deprecated on :utf8 handles", for example the File::Slurp bug. If you read it, there are some things to try.

    I'm not really a human, but I play one on earth. ..... an animated JAPH
Re: sysread() is deprecated on :utf8 handles
by Laurent_R (Canon) on Jul 20, 2017 at 14:06 UTC

      I appreciate that Laurent_R but the code cannot use buffered IO as it is using IO::Select which is why the code was changed to use sysread many years ago (apparently). sysread did/does support :utf8 although I realise it is unvalidated UTF-8.

      The problem I am seeking an answer to is how to continue using sysread without getting the deprecation warnings, OR how to change the code to use some other unbuffered read that supports a UTF-8 layer.

      So far I have tried changing the layer on the read end of the socket to :unix (instead of UTF-8), reading the bytes and decoding them. This appears to work, but has the caveat that if only part of a message was read, the octets might end in the middle of a UTF-8 character.
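One way to handle that caveat is to keep undecoded bytes in a buffer and decode incrementally. The sketch below assumes Encode's documented `FB_QUIET` behaviour (decoding stops at the first sequence it cannot process and the source argument is overwritten with the unprocessed remainder); `decode_available` is a hypothetical helper and the two chunks simulate reads that split a three-byte character.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode as many complete characters as possible from the
# byte buffer; any incomplete trailing sequence stays in the buffer.
sub decode_available {
    my ($pending_ref) = @_;
    # With FB_QUIET, decode() returns what it managed to process and leaves
    # the unprocessed bytes in the variable that was passed in.
    return decode('UTF-8', $$pending_ref, Encode::FB_QUIET);
}

# Simulated reads: the euro sign (E2 82 AC) is split across two chunks.
my @chunks = ("A\xe2\x82", "\xacB");

my $pending = '';   # undecoded bytes carried over between reads
my $text    = '';   # decoded characters so far
for my $chunk (@chunks) {
    $pending .= $chunk;
    $text    .= decode_available(\$pending);
}

# At EOF, anything still pending is a truncated or malformed sequence.
die "Truncated or malformed UTF-8 at EOF" if length($pending);
```

Note that `FB_QUIET` also stops at genuinely malformed input, so a caller would probably want to treat leftover bytes that never shrink (or that remain at EOF) as an error rather than wait indefinitely.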

        Hello again mje,

        Did you try the module that I recommended earlier?

        Regarding your question, "The problem I am seeking an answer to is how to continue using sysread without getting the deprecated warnings": I do not recommend that, but you can read more here: Supressing warnings.

        Let us know what you have tried so far, and provide us with a sample of code that replicates the problem. We are just guessing for the moment.

        Hope this helps, BR

        Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: sysread() is deprecated on :utf8 handles
by thanos1983 (Parson) on Jul 20, 2017 at 11:44 UTC

    Hello mje

    You could use File::Slurper, which does not have this problem. Source: Bug 1425077 - Deprecated use of Slurp.

    I quote from the reference above:

    The best thing to do would be to backport the upstream patch that makes Bugzilla use File::Slurper (which doesn't have the issue) instead of File::Slurp.

    I have not tested it, so I cannot say for certain that it will work, but you can update us after applying it. :)

    Hope this helps, BR

    Seeking for Perl wisdom...on the process of learning...not there...yet!