PerlMonks  

Re: "wide character in print" error in DBM::Deep

by dragonchild (Archbishop)
on Mar 22, 2010 at 20:39 UTC (#830158)


in reply to "wide character in print" error in DBM::Deep

That's odd. There should be a UTF-8 test, but I've never used UTF-8 with DBM::Deep. In theory, you should be able to pass the filehandle in and there are tests for that. Per our email, please send me a failing test. The repo is at http://github.com/robkinyon/dbm-deep

My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Re^2: "wide character in print" error in DBM::Deep
by ikegami (Pope) on Mar 22, 2010 at 20:49 UTC

    In theory, you should be able to pass the filehandle in and there are tests for that.

    That fails too. Tested by changing

    my ($fh, $fn) = tempfile();
    my $db = DBM::Deep->new( $fn );

    to

    my $fh = tempfile();
    my $db = DBM::Deep->new( { fh => $fh } );

    in the test I provided earlier.
Re^2: "wide character in print" error in DBM::Deep
by ikegami (Pope) on Mar 22, 2010 at 21:32 UTC

    I did a bit of studying (DBM-Deep-1.0016).

    • write_value uses class DBM::Deep::Engine::Sector::Scalar for everything but references and undef.
    • ::Scalar::_init receives the value and passes it to print_at.
    • print_at expects a string of bytes. It's getting a string that contains characters with code points above 255, which are not bytes.
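
    The byte/character mismatch described above can be reproduced without DBM::Deep at all. The following is only a minimal sketch: it prints to an in-memory filehandle with no encoding layer, which stands in for the raw data file, and the variable names are made up for illustration. It is not how print_at itself is implemented.

```perl
use strict;
use warnings;

# A character string containing a code point above 255 (U+263A).
my $value = "smile \x{263A}";

# Collect warnings so they can be inspected afterwards.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# In-memory handle with no encoding layer (an assumption: it stands in
# for the raw file that a storage engine would write to).
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
binmode $fh;

# Printing the character string as-is produces "Wide character in print".
print {$fh} $value;

# Encoding a copy first yields a plain byte string that prints cleanly.
my $bytes = $value;
utf8::encode($bytes);
print {$fh} $bytes;
close $fh;

printf "warnings captured: %d\n", scalar @warnings;
printf "characters: %d, bytes: %d\n", length $value, length $bytes;
```

    Only the first print should warn, since the second one receives bytes rather than characters.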

    No encoding is done anywhere, as far as I've seen. Definitely a major bug. Two possible fixes:

    • Have DBM::Deep::Engine::Sector::Scalar's _init encode values.
    • Add another Sector type for strings with UTF8=1.

    The latter should be simpler and more efficient, and it allows the UTF8 flag to be preserved. Basically, adjust write_value and add

    package DBM::Deep::Engine::Sector::Unicode;

    use 5.006_000;

    use strict;
    use warnings FATAL => 'all';
    no warnings 'recursion';

    use base qw( DBM::Deep::Engine::Sector::Scalar );

    sub type { $_[0]{engine}->SIG_UNICODE }

    sub _init {
        my $self = shift;
        # Encode the character string to UTF-8 bytes before it is written.
        utf8::encode( $self->{data} ) if $] >= 5.008 && defined($self->{data});
        $self->SUPER::_init();
    }

    sub data {
        my $self = shift;
        my $data = $self->SUPER::data();
        # Decode the stored UTF-8 bytes back into a character string.
        utf8::decode( $data ) if $] >= 5.008;
        return $data;
    }

    1;
    __END__

    And that's just for values. A separate fix is needed for the keys, I believe.
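
    For readers who want to see the encode-on-write, decode-on-read idea in isolation, here is a small standalone sketch. The helper names to_stored_bytes and from_stored_bytes are hypothetical stand-ins for the _init and data methods above; they are not part of DBM::Deep, and the sketch says nothing about how keys would be handled.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the write side: characters -> UTF-8 bytes.
sub to_stored_bytes {
    my ($value) = @_;
    my $bytes = $value;      # work on a copy so the caller's string is untouched
    utf8::encode($bytes);
    return $bytes;
}

# Hypothetical stand-in for the read side: UTF-8 bytes -> characters.
sub from_stored_bytes {
    my ($bytes) = @_;
    my $value = $bytes;
    utf8::decode($value);
    return $value;
}

my $original = "caf\x{E9} \x{263A}";
my $stored   = to_stored_bytes($original);
my $restored = from_stored_bytes($stored);

print "round trip ok\n"            if $restored eq $original;
print "stored form is plain bytes\n" unless utf8::is_utf8($stored);
printf "characters: %d, stored bytes: %d\n", length $original, length $stored;
```

    The stored form is longer than the original because each non-ASCII character takes more than one byte in UTF-8.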
