bjelli has asked for the wisdom of the Perl Monks concerning the following question:
Dear fellow monks,
I'm trying to use Perl (5.8) + DBI (1.37) + DBD::mysql (2.1026) + MySQL (4.1.0-alpha) with Unicode.
As far as I can tell, I can write a UTF-8 string into the database and get back the same sequence of bits, only now it's a 'classical' Perl string, not flagged as UTF-8.
The string I write into the DB is 6 characters long: "ABc\N{greek:alpha}\x{00df}\N{cyrillic:e}"
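| character | unicode (hex) | utf8 (binary) |
|---|---|---|
| A | 0041 | 01000001 |
| B | 0042 | 01000010 |
| c | 0063 | 01100011 |
| Greek alpha | 03B1 | 1100111010110001 |
| German scharfes s | 00DF | 1100001110011111 |
| Cyrillic e | 044D | 1101000110001101 |

What I get back from the DB is:

| character | binary |
|---|---|
| A | 01000001 |
| B | 01000010 |
| c | 01100011 |
| ? | 11001110 |
| ? | 10110001 |
| ? | 11000011 |
| ? | 00111111 |
| ? | 11010001 |
| ? | 00111111 |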
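The two dumps can also be compared byte by byte rather than by eye. The following is only a sketch: the received bytes are transcribed by hand from the dump above, and `byte_diff` is a made-up helper name, not part of any module.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# The six characters from the post, written with explicit code points.
my $sent = encode_utf8("ABc\x{03B1}\x{00DF}\x{044D}");

# The nine bytes listed in the dump of what came back (transcribed by hand).
my $received = "ABc\xCE\xB1\xC3\x3F\xD1\x3F";

# Return a list of [offset, sent byte, received byte] for every mismatch.
# A byte is undef where one string is shorter than the other.
sub byte_diff {
    my ($x, $y) = @_;
    my @x = unpack 'C*', $x;
    my @y = unpack 'C*', $y;
    my $n = @x > @y ? @x : @y;
    my @diff;
    for my $i (0 .. $n - 1) {
        my ($bx, $by) = ($x[$i], $y[$i]);
        next if defined $bx && defined $by && $bx == $by;
        push @diff, [ $i, $bx, $by ];
    }
    return @diff;
}

for my $d (byte_diff($sent, $received)) {
    printf "offset %d: sent %s, received %s\n", $d->[0],
        map { defined $_ ? sprintf('0x%02X', $_) : 'nothing' } @$d[1, 2];
}
```

With the bytes as transcribed here, the second byte of the scharfes s (10011111) and the second byte of the Cyrillic e (10001101) both differ from what came back (00111111, an ASCII question mark), so the returned string may not be quite the same sequence of bits that was written.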
I have tried to convert this using $new = decode_utf8( $fromdb ); but all I get is an empty string. Is there some way to find out why this won't decode?
Or is my debugging code, which shows me the bits in the string, simply wrong?

    use Encode qw(is_utf8 encode_utf8);

    # Return one line per character: the character, its code in hex, and its bits.
    sub showbits {
        my $utf      = is_utf8( $_[0] );
        my $template = $utf ? "U*" : "C*";    # characters if flagged, else bytes
        my $result   = '';
        my $i        = 0;
        foreach my $code ( unpack( $template, $_[0] ) ) {
            my $char = substr( $_[0], $i, 1 );
            my $bits;
            if ( $utf and $code > 127 ) {
                # show the bits of the character's UTF-8 encoding
                $bits = unpack( "B*", encode_utf8($char) );
            }
            else {
                $bits = unpack( "B*", pack( "N", $code ) );
            }
            $bits =~ s/^0{32}//;    # strip leading zero bytes
            $bits =~ s/^0{16}//;
            $bits =~ s/^0{8}//;
            $result .= "\n$char=" . sprintf( "%04X", $code ) . "=$bits";
            $i++;
        }
        return $result;
    }
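On the question of finding out why a string won't decode: one possible approach, sketched here and not tested against the exact Perl 5.8 / Encode versions named above, is to ask Encode for a strict decode. Passing `Encode::FB_CROAK` as the third argument makes `decode` die on malformed input instead of returning quietly, and the error text can then be inspected. The helper name `try_decode` is hypothetical, and the sample bytes are transcribed by hand from the dump of what came back from the DB.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes as listed in the dump of the returned string (transcribed by hand).
my $fromdb = "ABc\xCE\xB1\xC3\x3F\xD1\x3F";

# Attempt a strict UTF-8 decode.
# Returns (1, decoded string) on success or (0, error message) on failure.
sub try_decode {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its input when a CHECK value is given
    my $chars = eval { decode( 'UTF-8', $copy, Encode::FB_CROAK ) };
    return defined $chars ? ( 1, $chars ) : ( 0, $@ );
}

my ( $ok, $detail ) = try_decode($fromdb);
print $ok ? "decoded OK\n" : "decode failed: $detail";
```

Running the same helper on the bytes that were originally written gives a baseline for comparison: if those decode cleanly and the returned ones do not, the data was changed somewhere between Perl and the database.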
-- Brigitte 'I never met a chocolate I didnt like' Jellinek http://www.horus.com/~bjelli/ http://perlwelt.horus.at
Replies are listed 'Best First'.
Re: unicode (and mysql)
by zby (Vicar) on Jun 16, 2003 at 15:05 UTC
Re: unicode (and mysql)
by PodMaster (Abbot) on Jun 16, 2003 at 14:12 UTC
Re: unicode (and mysql)
by yosefm (Friar) on Jun 16, 2003 at 18:48 UTC
Re: unicode (and mysql)
by bjelli (Pilgrim) on Jun 17, 2003 at 09:16 UTC
by zby (Vicar) on Jun 17, 2003 at 09:33 UTC
by yosefm (Friar) on Jun 17, 2003 at 10:01 UTC
Switching strings' UTF-8 bits under older Perls
by andrewc (Acolyte) on Jun 17, 2003 at 12:00 UTC