http://www.perlmonks.org?node_id=742394


in reply to Re^2: Character encoding of microns
in thread Character encoding of microns

Hi,

Am I correct in assuming that the Oracle encoding WE8ISO8859P1 is actually ISO-8859-1? In that case, am I also correct in assuming that Perl automatically writes data as ISO-8859-1?
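To make the second question concrete, here is a minimal sketch of how one could check which bytes Perl writes for a micro sign, with and without an encoding layer on the handle. It assumes a Perl built with PerlIO (the default) so that in-memory filehandles and `:encoding(...)` layers are available. The helper name `bytes_written` is made up for this illustration.

```perl
use strict;
use warnings;

# Write one micro sign (U+00B5) through a handle opened with the given
# mode and return the raw bytes that ended up in the in-memory "file".
sub bytes_written {
    my ($mode) = @_;
    my $buf = '';
    open my $fh, $mode, \$buf or die "open failed: $!";
    print {$fh} "\x{B5}";
    close $fh or die "close failed: $!";
    return join ':', map { sprintf '%02X', ord($_) } split //, $buf;
}

# No layer: the character goes out as the single byte B5,
# which is what an ISO-8859-1 file would contain.
print "no layer:         ", bytes_written('>'), "\n";

# With a UTF-8 layer the same character goes out as the pair C2:B5.
print ":encoding(UTF-8): ", bytes_written('>:encoding(UTF-8)'), "\n";
```

This only covers characters below 256. It says nothing about what the Oracle client hands back, which is a separate question.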

Even if I call decode('ISO-8859-1', $clob), I still get question marks written in place of the micron signs.

I just tried a little experiment: in Notepad++ I typed a single micron sign (Alt-0181). It displayed fine with the encoding set to ANSI. When I changed the encoding to UTF-8, I got a box/splodge. When I open my actual file and change the encoding from ANSI to UTF-8, nothing happens. This is interesting, is it not?

This problem is beginning to bug me now :).

Any help appreciated.

Joe

UPDATE---

clob: 74:68:69:73:20:69:73:20:73:74:72:69:6E:67:20:77:69:74:68:20:C2:B5:20:69:6E:20:69:74 -- byte
conv: 74:68:69:73:20:69:73:20:73:74:72:69:6E:67:20:77:69:74:68:20:C2:B5:20:69:6E:20:69:74 -- utf8
unix perlio
clob: 'this is string with Âµ in it'
conv: 'this is string with µ in it'
unix perlio encoding(utf8) utf8
clob: 'this is string with ÃÂµ in it'
conv: 'this is string with Âµ in it'

That is the output of oshalla's code. It would seem that the first decode as utf8 makes it work, as long as you don't binmode STDOUT. After binmode, the strange As start to appear.
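For anyone following along, a minimal sketch of what the hex dump suggests, assuming the clob really contains the bytes C2:B5 as printed. It uses only the core Encode module. The helper names `codepoints` and `hexbytes` are made up for this illustration and are not part of oshalla's code.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The two bytes that follow "with " in the hex dump.
my $bytes = "\xC2\xB5";

# Decoded as ISO-8859-1, each byte becomes its own character:
# U+00C2 (A with circumflex) followed by U+00B5 (micro sign).
my $as_latin1 = decode('ISO-8859-1', $bytes);

# Decoded as UTF-8, the pair collapses into the single character U+00B5.
my $as_utf8 = decode('UTF-8', $bytes);

# Format a character string as a list of code points.
sub codepoints {
    my ($s) = @_;
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $s;
}

# Format a byte string as colon-separated hex, like the dump above.
sub hexbytes {
    my ($s) = @_;
    return join ':', map { sprintf '%02X', ord($_) } split //, $s;
}

printf "as ISO-8859-1: %d char(s): %s\n", length($as_latin1), codepoints($as_latin1);
printf "as UTF-8:      %d char(s): %s\n", length($as_utf8),   codepoints($as_utf8);

# Re-encoding the UTF-8-decoded string gives the single byte B5 in
# ISO-8859-1 and the original pair C2:B5 in UTF-8.
printf "ISO-8859-1 out: %s\n", hexbytes(encode('ISO-8859-1', $as_utf8));
printf "UTF-8 out:      %s\n", hexbytes(encode('UTF-8', $as_utf8));
```

If the bytes are C2:B5, the test string is already UTF-8 rather than ISO-8859-1, which would fit the observation that decoding as utf8 is what makes the micro sign come out right.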

That is fine for this test string, but my database output still has question marks in place of the micro signs.

Update 2: I wrote a little C# program to grab the output from Oracle and write it to a file. That had no problem and worked fine. In Perl, binmode on STDOUT didn't affect anything, and neither did use encoding 'utf8'.

Any help appreciated, guys.

-- joe

---

Eschew obfuscation, espouse eludication!