Sending UTF-8 e-mail

by angus_w_chan (Initiate)
on Dec 10, 2004 at 21:07 UTC (#413988=perlquestion)
angus_w_chan has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I've been partially successful in sending e-mail encoded in UTF-8. I can successfully encode the subject and the body of the e-mail; however, the To field gets all messed up.

I get text like:

Instead of Japanese characters. (My e-mail client can accept Japanese and UTF-8 character encodings.)

Here's an example of what I am doing:
    my $smtp = Net::SMTP->new( $smtp_server );
    $smtp->mail( $self->smtp_from() );
    $smtp->to( $self->smtp_to() );
    $smtp->data();
    $smtp->datasend( "To: " . MIME::Words::encode_mimeword($self->to(), 'Q', 'utf-8') . "\n" );
    $smtp->datasend( "From: " . MIME::Words::encode_mimeword($self->from(), 'Q', 'utf-8') . "\n" );
    $smtp->datasend( "Subject: " . MIME::Words::encode_mimeword($self->subject(), 'Q', 'utf-8') . "\n" );
    $smtp->datasend("MIME-Version: 1.0\n");
    $smtp->datasend("Content-Transfer-Encoding: 8bit\n");
    $smtp->datasend("Content-Type:text/plain; charset=utf-8\n");
    print $self->body();
    $smtp->datasend( "\n" );
    $smtp->datasend( $self->body() );
    $smtp->dataend();

Does anyone have any ideas?

Replies are listed 'Best First'.
Re: Sending UTF-8 e-mail
by dakkar (Hermit) on Dec 11, 2004 at 16:39 UTC

    You are almost doing it right. Your biggest problem is that MIME::Words does not work exactly as needed.

    First of all, I assume you are dealing with character strings, not byte strings (read perluniintro and perlunicode for details). In that case, you should really use Encode::encode on your strings before passing them to MIME::Words::encode_mimeword.

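    The problem with the From and To fields (yes, they'll both get messed up) is that RFC 1522 places some constraints on how the various parts of a header may be encoded: in an address header, for instance, only the display name may be encoded, never the address itself. The easiest way to get close to what the RFC requires is to use MIME::Words::encode_mimewords: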

    $smtp->datasend( "To: " . MIME::Words::encode_mimewords(Encode::encode('utf-8', $self->to()), Encoding => 'Q', Charset => 'utf-8') . "\n" );

    This way I was able to send myself a message with all headers containing Japanese characters.

    On a side note: the RFC requires each encoded "word" to be at most 75 characters long, including the charset, encoding, and delimiters. Keep this in mind when encoding long subject lines.
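The address-header constraint can be illustrated with a minimal sketch. It uses the core Encode module's 'MIME-Header' encoding instead of MIME::Words, so it is a substitute technique and not the code from this reply. The helper name format_address and the sample name and address are made up for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: build an address header value from a display
# name (a Perl character string) and a plain ASCII address.
# Only the display name is encoded; the address stays untouched.
sub format_address {
    my ($name, $addr) = @_;
    # 'MIME-Header' converts the characters to UTF-8 and wraps them
    # in B-encoded words.
    my $encoded = Encode::encode('MIME-Header', $name);
    return "$encoded <$addr>";
}

# Sample display name written with escapes so the source stays ASCII.
my $to = format_address("\x{5c71}\x{7530}\x{592a}\x{90ce}", 'taro@example.com');
print "To: $to\n";
```

The resulting string is pure ASCII, so it can be handed to $smtp->datasend() as a header line without any further encoding.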

            dakkar - Mobilis in mobile

    Most of my code is tested...

    Perl is strongly typed, it just has very few types (Dan)

Re: Sending UTF-8 e-mail
by gaal (Parson) on Dec 10, 2004 at 21:51 UTC
    First, are you sure your mail client handles RFC 1522-encoded headers correctly?

    Does round-trip encoding work as you expect?

    print scalar MIME::Words::decode_mimewords( MIME::Words::encode_mimeword($self->to, 'Q', 'utf-8') );

    See the warning about encode_mimewords (plural) not being RFC 1522 compliant. Try this with a single word first, and then with several words.
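A round-trip check of this kind might look like the sketch below. It assumes the core Encode module's 'MIME-Header' encoding, not MIME::Words, so it tests a different encoder than the one in the question. The function name round_trip and the sample strings are illustrative only.

```perl
use strict;
use warnings;
use Encode ();

# Encode a character string as a MIME header value, then decode it again.
sub round_trip {
    my ($text) = @_;
    my $encoded = Encode::encode('MIME-Header', $text);
    my $decoded = Encode::decode('MIME-Header', $encoded);
    return ($encoded, $decoded);
}

binmode STDOUT, ':encoding(UTF-8)';

# First a single word, then two words separated by a space.
for my $text ("\x{5c71}\x{7530}", "\x{5c71}\x{7530} \x{592a}\x{90ce}") {
    my ($encoded, $decoded) = round_trip($text);
    printf "%s => %s (%s)\n", $encoded, $decoded,
        $decoded eq $text ? 'ok' : 'MISMATCH';
}
```

If the decoded text differs from the input, the encoder is at fault. If it matches but the mail client still shows garbage, the client is the more likely culprit.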

Re: Sending UTF-8 e-mail
by Joost (Canon) on Dec 10, 2004 at 21:26 UTC
    Are you sure email headers may have a UTF-8 encoding? IIRC you can only use ASCII in headers.

    This site at least claims so:

    In general, email headers must contain only US-ASCII characters. Headers that contain non US-ASCII characters must be encoded so that they contain only US-ASCII characters. This process involves using either "B"(BASE64) or "Q"(Quoted-Printable) to encode certain characters.
    If you could use an alternative encoding in the headers, there would be the slight problem of how to read the "Content-type" header if it could be in any encoding? :-)

      That's what he's using MIME::Words for -- ASCII-armouring UTF-8 using either Q or B. As this armour contains the encoding name, the headers of a MIME mail can be in any encoding, not even similar to the one specified in the Content-Type: header. Content-Type is for bodies (and parts thereof).
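To make the "armour contains the encoding name" point concrete, here is a small sketch that takes one encoded-word apart by hand. The function decode_encoded_word and the two sample words are illustrative and not part of any library. It ignores the optional language suffix that later RFCs allow after the charset.

```perl
use strict;
use warnings;
use Encode ();
use MIME::Base64 ();

# Decode a single encoded-word of the form =?charset?encoding?payload?=
# and return (charset, character string). Returns an empty list if the
# input does not look like an encoded-word.
sub decode_encoded_word {
    my ($word) = @_;
    my ($charset, $enc, $payload) =
        $word =~ /\A=\?([^?]+)\?([bBqQ])\?([^?]*)\?=\z/
        or return;
    my $bytes;
    if (uc $enc eq 'B') {
        # B is plain Base64.
        $bytes = MIME::Base64::decode_base64($payload);
    }
    else {
        # Q: underscore stands for a space, =XX is a hex-escaped byte.
        ($bytes = $payload) =~ tr/_/ /;
        $bytes =~ s/=([0-9A-Fa-f]{2})/chr hex $1/ge;
    }
    # The charset named inside the word decides how the bytes are read.
    return ($charset, Encode::decode($charset, $bytes));
}

binmode STDOUT, ':encoding(UTF-8)';
for my $word ('=?UTF-8?B?5bGx55Sw?=', '=?ISO-8859-1?Q?Andr=E9?=') {
    my ($charset, $text) = decode_encoded_word($word);
    print "$charset: $text\n";
}
```

Each word names its own charset, so two headers in the same message can use different charsets, and neither depends on the charset given in Content-Type.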
Re: Sending UTF-8 e-mail
by kappa (Chaplain) on Dec 11, 2004 at 13:06 UTC

    Try the same with shorter values. Really. MIME::Words is one of the ugliest hacks on CPAN (as is the underlying RFC). The encoded string should be at most 75 bytes long, so you'll need to either split the original (remember, Quoted-Printable triples the size of every non-ASCII byte, and UTF-8 Japanese chars are more than one byte long) or employ MIME::Words::encode_mimewords (note the plural), which will eat your spaces and do other interesting things.
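Splitting the original by hand could be sketched as follows, assuming B encoding rather than Q and only core modules. The helper name encode_header_b is hypothetical. The charset and delimiters cost 12 characters, which leaves room for 60 Base64 characters, that is 45 bytes of UTF-8 per word.

```perl
use strict;
use warnings;
use Encode ();
use MIME::Base64 ();

# Hypothetical helper: turn a character string into one or more
# B-encoded words, each at most 75 characters long, joined by a
# folded line break. Characters are never split across two words.
sub encode_header_b {
    my ($text) = @_;
    # "=?UTF-8?B?" plus "?=" take 12 characters; 60 Base64 characters
    # carry 45 bytes, giving words of at most 72 characters.
    my $max_bytes = 45;
    my @chunks;
    my $chunk = '';
    for my $char (split //, $text) {
        my $bytes = Encode::encode('UTF-8', $char);
        if (length($chunk) + length($bytes) > $max_bytes) {
            push @chunks, $chunk;
            $chunk = '';
        }
        $chunk .= $bytes;
    }
    push @chunks, $chunk if length $chunk;
    return join "\n ",
        map { '=?UTF-8?B?' . MIME::Base64::encode_base64($_, '') . '?=' }
        @chunks;
}

# 60 Japanese characters (180 bytes of UTF-8) need four words.
my $subject = "\x{65e5}\x{672c}\x{8a9e}" x 20;
print "Subject: ", encode_header_b($subject), "\n";
```

With Q encoding the budget is much tighter: a three-byte Japanese character becomes nine characters, so only seven of them fit into one 75-character word.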

    Try using the latest MIME-Tools package (dev version 6.0); it's more correct in exactly this area. Or import MIME/ from OpenWebMail, which is almost perfect and has the same interface.

    I'll be able to help you more if you show the mail headers as they appear on the receiving side. I once invested days of blood and sweat in MIME encoding of mail headers :)

