http://www.perlmonks.org?node_id=502117

schweini has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,
My ASCII-fu has been rusting away for a couple of years now, so I'm a bit confused about a problem I ran into yesterday:
I was sending Spanish characters like 'áéíóú ññññ' via CUPS to a Samba-connected Epson LX-300 dot-matrix printer (in 'lpr -oraw' mode). The printer just printed some weird line-drawing-like characters, and no matter how I tried to adjust the character tables in the printer (I might have done it wrong, though), no Spanish characters were printed.
So I hacked up a little print PRN "$_ = ".chr($_)."\n" for (97 .. 255) script, and the characters actually DID appear this way on the printer, but on the Linux console the exact same ASCII codes represented different characters. I ended up just putting a s/ñ/chr($codes{'ñ'})/ge before the part that sends text to the printer, but this seems a rather suboptimal solution.
So, what 'codepage' does Linux or Perl usually use (Windows could print those characters just fine in raw printer mode)? Does this have anything to do with Unicode or those fancy I/O layers that open() now supports?
thanks in advance,
-schweini

Replies are listed 'Best First'.
Re: trouble printing spanish characters
by thundergnat (Deacon) on Oct 21, 2005 at 22:48 UTC

    Most likely, you'll need to change the text encoding before you print it. The Linux codepage is configurable, though iso-8859-1 is probably most common in English-speaking countries.

    Going to a Samba share, you'll likely need to re-encode the string into codepage 437, the standard DOS code page.

    use Encode;
    my $string = 'áéíóú ññññ';
    my $enc_string = encode('cp437', $string);
    print PRN $enc_string;
      That might not work, depending on the particular editor (locale, whatever) that is being used to write the perl script. If it doesn't save those accented characters as utf8, the assignment to $string -- or rather, the encoding of $string into cp437 -- will fail.
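One way to sidestep the editor question is to write the accented characters as code-point escapes, so the result no longer depends on how the script file was saved. The following is only a sketch under that assumption (the sample string and the hex dump are illustrative, not part of the original reply). It encodes the string to cp437 and prints the resulting bytes so they can be compared against the printer's character table:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Code-point escapes instead of literal accented characters, so the
# encoding the editor used to save this file does not matter.
# U+00E1 á, U+00E9 é, U+00ED í, U+00F3 ó, U+00FA ú, U+00F1 ñ
my $string = "\x{E1}\x{E9}\x{ED}\x{F3}\x{FA} \x{F1}";

# Convert the characters to cp437 bytes.
my $cp437 = encode('cp437', $string);

# Dump the bytes in hex for comparison with the printer's code table.
my $hex = join ' ', map { sprintf '%02X', ord } split //, $cp437;
print "$hex\n";
```

If the script does contain literal accented characters and the file is saved as UTF-8, adding `use utf8;` at the top should have the same effect on the string literal.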
Re: trouble printing spanish characters
by graff (Chancellor) on Oct 22, 2005 at 00:05 UTC
    Where are your Spanish characters actually coming from? A data file? A web page? User input? How much do you know about your input data? (E.g. would you be able to say or find out what character encoding is being used when the data first comes into your script?)

    If you're getting "line drawing characters" at the printer, your script might be sending the characters as utf8, and in that case, using the Encode module (and its "encode()" function), as shown in the first reply, is likely to be the right way to go -- though actually, assuming that perl is initially storing your strings internally as utf8, something like this would be simpler:

    binmode( PRN, ":encoding(cp437)" );
    By setting a PerlIO layer on your PRN file handle to apply the character encoding transformation for you (from utf8 to cp437), you don't need to do anything else -- you don't need "use Encode", you don't need any substitutions or other alterations of your string data.
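A minimal sketch of the layer approach, assuming an in-memory file handle as a stand-in for the real PRN handle (the handle name `$prn`, the buffer and the sample word are hypothetical). It shows that characters printed through the layer arrive as cp437 bytes without any explicit encode() call:

```perl
use strict;
use warnings;

# An in-memory handle stands in for the printer handle (assumption for the demo).
my $buffer = '';
open my $prn, '>', \$buffer or die "cannot open in-memory handle: $!";

# Every character printed to $prn is now converted to cp437 on the way out.
binmode $prn, ':encoding(cp437)';

print {$prn} "ma\x{F1}ana\n";   # "mañana", with ñ written as a code-point escape
close $prn;

# Inspect what would have been sent to the printer.
my $hex = join ' ', map { sprintf '%02X', ord } split //, $buffer;
print "$hex\n";
```

With a real printer handle, only the binmode() line would be needed. The strings being printed must already be character strings, decoded from whatever encoding they arrived in.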

    Your statement of the problem was vague enough that I can't be sure this is the right answer, but this is a place to start, at least.
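To check the utf8 hypothesis, the sketch below encodes a single ñ as UTF-8 and then reads those bytes back as if they were cp437, which is roughly what a printer set to that code table would do. It is an illustration only and assumes the printer really is using cp437:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# ñ (U+00F1) as it would be sent if the data were UTF-8: two bytes, C3 B1.
my $utf8_bytes = encode('UTF-8', "\x{F1}");

# Interpret those same bytes the way a cp437 device would.
my $seen = decode('cp437', $utf8_bytes);

# List the code points the device would render.
my $codepoints = join ' ', map { sprintf 'U+%04X', ord } split //, $seen;
print "$codepoints\n";
```

Both resulting code points fall in the Unicode box-drawing and block-element ranges, which would match the "line drawing" output described in the question.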

      I'm getting the Spanish characters via CGI, DBI or even hardcoded into the script - but those 'wrong' characters appeared even when I was just echoing to lpr from bash - looked fine on the screen, came out wrong on the printer.
      Thanks a lot for the PerlIO suggestion - something like that was what I was looking for. I can't test it now, but I've got a hunch that that was the problem. Thanks again.