PerlMonks  


I am working on a project where I need to extract email attachments whose filenames are in Simplified Chinese. The problem is that when they are extracted to the file system I end up with names like ????.doc!

I put together a little test script to show what I mean:

#!/usr/bin/perl
use MIME::Parser;
use MIME::Parser::Filer;
use File::Path qw(rmtree);    # provides rmtree, which is called at the end

my $tempdir = "extract";
( -d $tempdir ) or mkdir $tempdir, 0755 or die "mkdir: $!";

my $parser = new MIME::Parser;
$parser->output_under("/home/uxbod/extract");
$parser->extract_uuencode(1);

my $entity = $parser->parse_open("/home/uxbod/testmessage");
foreach my $part ( $entity->parts_DFS ) {
    next if ( !$part->bodyhandle );    # skip parts that have no body
    my $rec_filename = $part->head->recommended_filename;
    my $filename     = $part->bodyhandle->path;
    print "Recommended: $rec_filename Alternative : $filename\n";
}

$parser->filer->purge;
rmtree $tempdir;

and when this runs I see the following output:

[uxbod@gateway ~]# ./testextract.pl
ignoring text in character set `GB2312' at /usr/share/perl5/MIME/Parser/Filer.pm line 659
ignoring text in character set `GB2312' at /usr/share/perl5/MIME/Parser/Filer.pm line 659
Recommended: =?gb2312?B?MzYw0MLOxbzgsuItMTItMDEtQ2hpIFNpbXAudHh0?= Alternative : /home/uxbod/extract/msg-1321526988-4755-0/1
Recommended: =?gb2312?B?MzYw0MLChLFPnHktMTItMDEtQ2hpIFRyYWQudHh0?= Alternative : /home/uxbod/extract/msg-1321526988-4755-0/1-1

As you can see, the recommended filenames of the last two MIME entities are encoded using gb2312. How can I get the correct name onto the file system? If I extract the file through an email client and transfer it across to that system, it does look okay:

-rw-r--r-- 1 uxbod uxbod 34304 Nov 15 10:42 撰稿材料.doc

Any help would be very very much appreciated.
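
One possible direction, offered as a rough sketch rather than a confirmed fix: the "Recommended" values look like RFC 2047 encoded-words (=?charset?B?base64?=), so they could be decoded by hand with the core modules Encode and MIME::Base64 and then written out as UTF-8. The helper names decode_word, decode_mime_filename and safe_utf8_name are made up for this illustration and are not part of MIME-tools; it is also an assumption that the target file system expects UTF-8 names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: decode the payload of one encoded-word.
# $enc is "B" (base64) or "Q" (quoted-printable-like), per RFC 2047.
sub decode_word {
    my ($charset, $enc, $text) = @_;
    my $bytes;
    if (uc $enc eq 'B') {
        $bytes = decode_base64($text);
    }
    else {
        $text =~ tr/_/ /;                             # Q encoding writes a space as "_"
        $text =~ s/=([0-9A-Fa-f]{2})/chr hex $1/ge;   # =XX is one raw byte
        $bytes = $text;
    }
    # Encode resolves the label "gb2312" to its euc-cn encoding.
    # An unknown charset label makes decode() die; bytes that are
    # invalid in the charset become U+FFFD.
    return decode($charset, $bytes);
}

# Hypothetical helper: replace every encoded-word in a header value
# with its decoded characters; plain names pass through unchanged.
sub decode_mime_filename {
    my ($raw) = @_;
    return $raw unless defined $raw;
    $raw =~ s/=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=/decode_word($1, $2, $3)/ge;
    return $raw;
}

# Hypothetical helper: make the decoded name usable as a single path
# component and turn it into UTF-8 bytes (assumes a UTF-8 file system).
sub safe_utf8_name {
    my ($name) = @_;
    $name =~ s{[/\\\0]}{_}g;    # a mail header must not pick a directory
    return encode('UTF-8', $name);
}

# First "Recommended" value from the output above.
my $header = '=?gb2312?B?MzYw0MLOxbzgsuItMTItMDEtQ2hpIFNpbXAudHh0?=';
my $name   = decode_mime_filename($header);

binmode STDOUT, ':encoding(UTF-8)';
print "Decoded: $name\n";
```

Inside the loop of the test script, the same helpers could be applied to $part->head->recommended_filename, and the file at $part->bodyhandle->path could then be renamed to the result of safe_utf8_name. Note that the second header in the output decodes to bytes that include 0x84, which lies outside the EUC-CN range used for GB2312, so that name may really be GBK (cp936 in Encode) despite its label; that is a guess from the bytes only.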


In reply to MIME::Parser::Filer and filenames in Simplified Chinese by uxbod
