http://www.perlmonks.org?node_id=926668

skazat has asked for the wisdom of the Perl Monks concerning the following question:

This may be a dumb question, but I'm stumped:

Say I have an email message I've created correctly in Perl, with a Content-Type charset of UTF-8 and a Content-Transfer-Encoding of "8bit".

I want to pipe this message to sendmail.

What encoding should I set on the pipe to sendmail? Again, I'm assuming the message is in Perl's internal format and that I've done everything else correctly when creating the message (decoding, etc.).

I want to say I should encode in UTF-8, but that depends on whether the sendmail command handles things this way. Does it? My policy has been, "Assume everything is in UTF-8".
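To make the "encode on the way out" idea concrete, here is a minimal sketch using the core Encode module. The addresses, subject, and body are made up for illustration. It only shows that a character string and its UTF-8 encoded form are different things; it does not settle whether a given sendmail accepts 8bit content.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical message as a Perl character string:
# headers, a blank line, then a body with non-ASCII characters.
my $message = join "\n",
    'To: someone@example.com',
    'Subject: test',
    'MIME-Version: 1.0',
    'Content-Type: text/plain; charset=UTF-8',
    'Content-Transfer-Encoding: 8bit',
    '',
    "Caf\x{e9} \x{263A}",
    '';

# Turn characters into UTF-8 octets. These octets are what would be
# written to a filehandle that has no encoding layer on it.
my $bytes = encode('UTF-8', $message);

printf "characters: %d, bytes: %d\n", length($message), length($bytes);
```

The byte count is larger than the character count because each non-ASCII character becomes two or more octets.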

What if the Content-Transfer-Encoding were "quoted-printable"? Then it seems I wouldn't have to set an encoding at all, since none of the email message (again, if I did everything correctly in creating it) would have a UTF-8 flag on it.
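A small sketch of the quoted-printable case, using the core Encode and MIME::QuotedPrint modules. The body text is a made-up example. The characters are first encoded to UTF-8 octets and those octets are then quoted-printable encoded, so the result contains only ASCII.

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::QuotedPrint qw(encode_qp);

# Hypothetical body as a Perl character string.
my $body = "Caf\x{e9} \x{263A}\n";

# encode_qp works on octets, so encode to UTF-8 first.
my $qp_body = encode_qp(encode('UTF-8', $body));

# Every non-ASCII octet is now written as =XX, so the output is plain ASCII.
print $qp_body;
```

Because the result is pure ASCII, printing it to a handle without an encoding layer should not produce the "Wide character" warning.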

I've already confused myself with this simple train of thought, as encoding gets tricky.

-skazat

Re: encoding for sendmail
by zentara (Archbishop) on Sep 19, 2011 at 11:33 UTC

      Not explicitly setting UTF-8 encoding on the pipe filehandle will give the "Wide character in print" warning.

      Data should always be encoded when leaving the program, but there's really no way to know what to encode *to*, so you have to guess. I'm guessing that the sendmail command can deal with UTF-8 encoded data, but I don't know for sure. I think that's what I'm asking.

      I'm not sure what you want me to look at in this post:

      http://perlmonks.org/?node_id=692745 as it's actually been deleted and has some serious bugs in the example code anyways, and shouldn't be used.

      -skazat
        The normal fix for the "wide character" error is to call binmode on the filehandle, in your case the pipe filehandle. I was going to mention that in my first reply, but thought it might mess with sendmail's line-by-line reading off the pipe.
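A rough sketch of the binmode approach follows. The helper name print_message_utf8 is hypothetical, and an in-memory filehandle stands in for the sendmail pipe so the example runs anywhere. The :encoding(UTF-8) layer converts characters to octets as they are printed; it is not itself a line-ending layer, so the newlines in the message are written as they are.

```perl
use strict;
use warnings;

# Hypothetical helper: put a UTF-8 encoding layer on an already-open
# handle, then print the message as a character string.
sub print_message_utf8 {
    my ($fh, $message) = @_;
    binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";
    print {$fh} $message or die "print failed: $!";
    return;
}

# Stand-in for the sendmail pipe: an in-memory filehandle.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";
print_message_utf8($fh, "Subject: test\n\nCaf\x{e9} \x{263A}\n");
close($fh) or die "close failed: $!";

# With a real pipe, the open might look like this (path and flags are assumptions):
# open(my $pipe, '|-', '/usr/sbin/sendmail', '-t', '-oi')
#     or die "Cannot start sendmail: $!";
```

After the close, $buffer holds the UTF-8 octets that would have gone down the pipe.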

        The node I referred you to, Sending a UTF-8 (Unicode) E-mail, has not been deleted; a reply to it was deleted, probably some recent spam. I can't say whether the code is buggy, but it does demonstrate the problem of encoding the email for UTF-8.


        I'm not really a human, but I play one on earth.
        Old Perl Programmer Haiku ................... flash japh