http://www.perlmonks.org?node_id=830158


in reply to "wide character in print" error in DBM::Deep

That's odd. There should be a UTF-8 test, but I've never used UTF-8 with DBM::Deep. In theory, you should be able to pass the filehandle in and there are tests for that. Per our email, please send me a failing test. The repo is at http://github.com/robkinyon/dbm-deep

My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Re^2: "wide character in print" error in DBM::Deep
by ikegami (Patriarch) on Mar 22, 2010 at 20:49 UTC

    In theory, you should be able to pass the filehandle in and there are tests for that.

    That fails too. Tested by changing

        my ($fh, $fn) = tempfile();
        my $db = DBM::Deep->new( $fn );

    to

        my $fh = tempfile();
        my $db = DBM::Deep->new( { fh => $fh } );

    in the test I provided earlier.
Re^2: "wide character in print" error in DBM::Deep
by ikegami (Patriarch) on Mar 22, 2010 at 21:32 UTC

    I did a bit of studying (DBM-Deep-1.0016).

    • write_value uses class DBM::Deep::Engine::Sector::Scalar for everything but references and undef.
    • ::Scalar::_init receives the value and passes it to print_at.
    • print_at expects a string of bytes. It's getting a string that contains non-bytes.
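The symptom can be reproduced outside DBM::Deep. The following is a minimal sketch, not DBM::Deep code: a temporary file with no encoding layer stands in for the database file, and the variable names are made up for illustration. Printing a string that holds a character above 0xFF triggers the warning, while printing the same value after utf8::encode does not.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Collect warnings so the diagnostic can be inspected afterwards.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# A plain filehandle with no encoding layer (stand-in for the DB file).
my ($fh, $fn) = tempfile( UNLINK => 1 );

my $value = "\x{263A}";    # a single character above 0xFF

print {$fh} $value;        # triggers "Wide character in print"

# Encoding first turns the one character into three UTF-8 bytes.
my $bytes = $value;
utf8::encode($bytes);
print {$fh} $bytes;        # plain bytes, no warning

close $fh;

printf "warnings: %d, characters: %d, bytes: %d\n",
    scalar(@warnings), length($value), length($bytes);
```

Only the first print should warn, since the second one hands the filehandle a string of bytes.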

    No encoding is done anywhere, as far as I've seen. Definitely a major bug. Two possible fixes:

    • Have DBM::Deep::Engine::Sector::Scalar's _init encode values.
    • Add another Sector type for strings with UTF8=1.

    The latter should be simpler and more efficient, and it allows the UTF8 flag to be preserved. Basically, adjust write_value and add

    package DBM::Deep::Engine::Sector::Unicode;

    use 5.006_000;

    use strict;
    use warnings FATAL => 'all';
    no warnings 'recursion';

    use base qw( DBM::Deep::Engine::Sector::Scalar );

    sub type { $_[0]{engine}->SIG_UNICODE }

    sub _init {
        my $self = shift;
        utf8::encode( $self->{data} ) if $] >= 5.008 && defined($self->{data});
        $self->SUPER::_init();
    }

    sub data {
        my $self = shift;
        my $data = $self->SUPER::data();
        utf8::decode( $data ) if $] >= 5.008;
        return $data;
    }

    1;
    __END__
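The encode-on-write, decode-on-read idea can be checked in isolation. This is only a sketch of the round trip: to_stored and from_stored are hypothetical helpers that play the roles of _init and data above, and nothing here touches DBM::Deep itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for _init: character string to UTF-8 bytes.
sub to_stored {
    my ($value) = @_;
    utf8::encode($value) if defined $value;
    return $value;
}

# Hypothetical stand-in for data: UTF-8 bytes back to a character string.
sub from_stored {
    my ($bytes) = @_;
    utf8::decode($bytes) if defined $bytes;
    return $bytes;
}

my $original = "caf\x{E9} \x{263A}";    # 6 characters, one above 0xFF
my $stored   = to_stored($original);    # what would be handed to print_at
my $restored = from_stored($stored);    # what the caller would get back

printf "characters: %d, stored bytes: %d, round trip ok: %s\n",
    length($original), length($stored),
    ( $restored eq $original ? 'yes' : 'no' );
```

The stored form is longer than the original because each non-ASCII character takes more than one byte in UTF-8, which is something any code that sizes a sector from the data length would have to account for.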

    And that's just for values. A separate fix is needed for the keys, I believe.