
Re: incremental reading of utf8 input handles

by Khen1950fx (Canon)
on Jul 07, 2012 at 00:33 UTC

in reply to incremental reading of utf8 input handles

Here's a test that I tried. The first two tests fail, as they should; the last two succeed.
#!/usr/bin/perl -l
BEGIN {
    $| = 1;
    $^W = 1;
    $ENV{'TEST_VERBOSE'} = 1;
}
use strict;
use warnings;
use Test::utf8;
use Test::More tests => 4;
use Encode qw/:all/;

my $invalid = "\x{e9}";
Encode::_utf8_on($invalid);
ok(is_valid_string($invalid));

my ($buffer, $string) = ('', '');
while (read $invalid, $buffer, 256, length $string) {
    $invalid .= decode( 'utf-8-strict', $buffer, Encode::FB_QUIET );
}
Encode::_utf8_on($string);
ok(is_valid_string($string));
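With Encode::FB_QUIET, decode removes the bytes it managed to decode from $buffer and leaves the rest in place, so $buffer should hold any partial character until the next read completes it.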
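
A minimal sketch of that FB_QUIET behaviour, independent of any filehandle. The sample word and the cut point are illustrative assumptions, not part of the test above: the buffer is deliberately truncated in the middle of a two-byte character.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "e-acute" is the two bytes C3 A9 in UTF-8; cut the input after the first of them.
my $bytes  = encode('UTF-8', "caf\x{e9}");   # 63 61 66 C3 A9
my $buffer = substr($bytes, 0, 4);           # ends with a lone C3

# FB_QUIET: decode whatever is valid and leave the remainder in $buffer.
my $string = decode('UTF-8', $buffer, Encode::FB_QUIET);

printf "decoded: %s, bytes left in buffer: %d\n", $string, length $buffer;
```

Note that FB_QUIET stops the same way at a genuinely malformed byte, so leftover bytes at end of input mean the data was either truncated or invalid.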

Re^2: incremental reading of utf8 input handles
by Tanktalus (Canon) on Jul 09, 2012 at 17:06 UTC

    This looks very interesting. Thanks. Unfortunately, your test doesn't seem to work here. I added a diag "[", explain($string), "]"; to the end, and I get no output (i.e., an empty []). I also tried adding a "diag '.';" inside the while loop to see how many times it loops, and nothing came out. You're also not reading from a filehandle: $invalid is a plain string, so you need to open my $fh, '<', \$invalid; and then read from $fh. But though it now reads one time, the length of the output still seems to be zero. I'll see if I can adapt this test to actually have valid utf8 after multiple reads and see what comes of it. Somewhere to start from anyway :-)
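
One way the adapted test might look, as a sketch rather than either poster's actual code: valid UTF-8 is read through an in-memory filehandle in deliberately tiny chunks so that characters get split across reads. The sample text, the 3-byte read size, and the variable $octets are assumptions made for illustration. The two changes from the posted loop are that the read appends at length $buffer (not length $string) and that the decoded text is accumulated in $string.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample data: octets containing 2- and 3-byte characters.
my $octets = encode('UTF-8', "na\x{ef}ve caf\x{e9} \x{263a}");

# Read from an in-memory handle, as suggested in the reply.
open my $fh, '<:raw', \$octets or die "open: $!";

my ($buffer, $string) = ('', '');

# 3-byte reads force some characters to straddle two reads.
# New bytes are appended after whatever partial character is still in $buffer.
while (read($fh, $buffer, 3, length $buffer)) {
    $string .= decode('UTF-8', $buffer, Encode::FB_QUIET);
    # $buffer now holds only the undecoded tail, if any.
}
close $fh;

die "incomplete or invalid UTF-8 at end of input\n" if length $buffer;
print length($string), " characters decoded\n";
```

Because decode returns a character string, no call to Encode::_utf8_on is needed on the result.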
