Net::LDAP and utf-8 issues

by Skeeve (Vicar)
on Jul 12, 2013 at 06:49 UTC (#1043887=perlquestion)
Skeeve has asked for the wisdom of the Perl Monks concerning the following question:

I wrote a small script based on Net::LDAP and it seemed to work quite well, until I found an entry that gave me this warning when I printed it: "Wide character in print ...".

I checked the entry using Eclipse and LDAP Browser, and it seems to be messed up, as even that couldn't display it properly.

So I created a new entry containing some German umlauts and some accented characters.

My script printed them perfectly.

Then I added binmode STDOUT, ':utf8'; and now the warning is gone, but my strings are displayed completely wrong.

What's your suggestion for solving this problem?

I think I need to fix (delete) the defective entry, but then what about my UTF-8 strings? Why are they displayed correctly when I do not use binmode STDOUT, ':utf8'; but get all scrambled when I do use it?


s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
+.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

Re: Net::LDAP and utf-8 issues
by kcott (Abbot) on Jul 12, 2013 at 07:12 UTC

    G'day Skeeve,

    If you have Unicode characters in your source code, you'll need to use the utf8 pragma. Also look at the raw option for Net::LDAP's new() constructor and search() method. If that's of no help, you'll need to show your code: I'm completely guessing here.
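
    A minimal sketch of what the raw option changes, as far as I understand the Net::LDAP documentation: values of attributes that do not match the regex are decoded from UTF-8 into Perl character strings, while matching ones stay byte strings. The snippet only imitates that decoding step with the core Encode module. The sample bytes and the helper name describe_value are made up for illustration, and no LDAP server is involved.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: report the length of a string and the numeric
# value of each element (bytes or characters, depending on the string).
sub describe_value {
    my ($value) = @_;
    my @values = map { sprintf '0x%02X', ord } split //, $value;
    return sprintf '%d units: %s', length($value), join(' ', @values);
}

# Assumed sample: "Joerg" with o-umlaut, as the UTF-8 bytes sent on the wire.
my $wire_bytes = "J\xc3\xb6rg";

# Byte string: what a value looks like when it is left undecoded.
print 'bytes: ', describe_value($wire_bytes), "\n";

# Character string: what a value looks like after UTF-8 decoding.
print 'chars: ', describe_value(decode('UTF-8', $wire_bytes)), "\n";
```

    The decoded string has four characters instead of five bytes, so length, regexes and the like behave as expected, but it then needs an encoding layer on STDOUT to be turned back into bytes for the terminal.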

    -- Ken

      Thanks for guessing ;)

      I didn't show the code as there is really nothing special about it - at least as far as I can see. But I will try to shrink it down to the bare minimum and post it later.

      What I forgot to mention is: I already tried the "raw" option, but it did not make any difference.


Re: Net::LDAP and utf-8 issues
by Skeeve (Vicar) on Jul 15, 2013 at 08:44 UTC

    Just for completeness. I think I found the reason for the problem. See below.

    So here is a small example of my script having problems with utf-8.

        use strict;
        use warnings;
        use Net::LDAP;

        # binmode STDOUT, ':utf8'; # <- If this is active, output is corrupted

        my $ldap_host= '...myhost...';
        my $ldap_port= ...myport...;
        my $ldap_user= '...myuser...';
        my $ldap_pass= '...mypass...';

        my $ldap = Net::LDAP->new(
            $ldap_host,
            port => $ldap_port,
            raw => qr/(?i:^jpegPhoto|;binary)/,
        ) or die "$@";

        my $mesg = $ldap->bind(
            $ldap_user,
            password => $ldap_pass,
        );
        if ($mesg->is_error) {
            die $mesg->error_text;
        }

        $mesg= $ldap->search(
            base => '...mybase...',
            filter => 'uid=...myuser...',
            attrs => [ qw/ cn / ]
        );
        if ($mesg->is_error) {
            die $mesg->error_text;
        }

        my $result= $mesg->as_struct;
        while (my($id, $data)= each %$result) {
            print $data->{'cn'}->[0], "\n";
        }

        $ldap->unbind;

    Output is okay without the binmode STDOUT, ':utf8';:

        $ ./ldap-test.pl
        Jörg
        $ ./ldap-test.pl | od -xc
        0000000    f64a    6772    000a
                  J 366   r   g  \n
        0000005

    But with that line I get:

        $ ./ldap-test.pl
        JÃ¶rg
        $ ./ldap-test.pl | od -xc
        0000000    c34a    72b6    0a67
                  J 303 266   r   g  \n
        0000006

    Assumption: The UTF-8 conversion is okay, but my Cygwin rxvt terminal is too dumb. It's not expecting UTF-8 but some WinDOS encoding.

    When I switch to an xterm and enable UTF-8 encoding there, output seems okay.
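
    The three observations in this thread can be reproduced without an LDAP server. A minimal sketch, assuming the cn value arrives as a Perl character string: it prints to an in-memory filehandle and reports the bytes that leave the script, plus the number of warnings raised. printed_bytes is a made-up helper, and the smiley in the last case just stands in for any character above U+00FF, like the one presumably hiding in the broken entry.

```perl
use strict;
use warnings;

# Hypothetical helper: print $chars through an optional I/O layer into an
# in-memory filehandle; return the resulting bytes as hex and the number
# of warnings (such as "Wide character in print") that were raised.
sub printed_bytes {
    my ($layer, $chars) = @_;
    my $buffer = '';
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
    binmode $fh, $layer if $layer;
    print {$fh} $chars;
    close $fh;

    my $hex = join ' ', map { sprintf '%02x', ord } split //, $buffer;
    return ($hex, scalar @warnings);
}

my @cases = (
    # No layer: characters below U+0100 go out as single Latin-1 bytes.
    [ 'no layer, o-umlaut', '',                 "J\x{f6}rg" ],   # 4a f6 72 67
    # UTF-8 layer: the same character becomes the two bytes c3 b6.
    [ 'UTF-8,    o-umlaut', ':encoding(UTF-8)', "J\x{f6}rg" ],   # 4a c3 b6 72 67
    # No layer, character above U+00FF: Perl warns and emits UTF-8 anyway.
    [ 'no layer, smiley',   '',                 "\x{263a}"  ],   # e2 98 ba, 1 warning
);

for my $case (@cases) {
    my ($label, $layer, $chars) = @$case;
    my ($hex, $warned) = printed_bytes($layer, $chars);
    printf "%-20s %-16s warnings: %d\n", $label, $hex, $warned;
}
```

    The first two byte sequences match the od dumps above. Neither is wrong on its own; each only looks right on a terminal that expects that encoding, which fits the difference between rxvt and the UTF-8 xterm.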


