Re^3: Why won't Perl convert (Latin1 | ISO-8859-1) to (UTF-8 | utf8)?

in reply to Re^2: Why won't Perl convert (Latin1 | ISO-8859-1) to (UTF-8 | utf8)?
in thread Why won't Perl convert (Latin1 | ISO-8859-1) to (UTF-8 | utf8)?

Which is why I was so puzzled as to why the file(s) would take on the requested "MimeType" after changing that line.

From my experiments (and the documentation), file can only guess at the encoding from the presence or absence of specific byte sequences. If your file contains only codepoints that also fit the Latin-1 encoding, perhaps file is all too happy to report it as Latin-1. In my experience, iconv doesn't add a UTF-8 BOM to the start of the file. When I added one with Vim, file's report was a lot more specific.
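To make the guessing problem concrete, here is a minimal sketch using the core Encode module. The `hex_bytes` helper and the sample strings are made up for illustration. It only compares the bytes each encoding produces, and it makes no claim about what file itself will print.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $bytes;
}

my $ascii_only = "cafe";        # every character is below U+0080
my $accented   = "caf\x{e9}";   # ends in U+00E9, e with acute accent

for my $text ($ascii_only, $accented) {
    my $latin1 = encode('ISO-8859-1', $text);
    my $utf8   = encode('UTF-8',      $text);
    printf "latin1: %-12s utf8: %-15s %s\n",
        hex_bytes($latin1), hex_bytes($utf8),
        ($latin1 eq $utf8 ? 'identical bytes' : 'different bytes');
}

# A UTF-8 BOM is just U+FEFF encoded as UTF-8 (the bytes ef bb bf).
print "bom: ", hex_bytes(encode('UTF-8', "\x{FEFF}")), "\n";
```

For the ASCII-only string both encodings yield the same bytes, so a tool that inspects only bytes has nothing to tell them apart. The accented string differs, and a leading BOM is an unambiguous marker, which fits the more specific report after adding one in Vim.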


Replies are listed 'Best First'.
Re^4: Why won't Perl convert (Latin1 | ISO-8859-1) to (UTF-8 | utf8)?
by taint (Chaplain) on Jun 06, 2013 at 17:47 UTC

Greetings chromatic, and thank you for your reply.

Indeed. file(1) depends on its "Magic" file||dir for its response(s), and they aren't extremely concise. But it was only after noticing that my editor wasn't reporting a change from iso-8859-1 to utf8 that I used file(1) to help me confirm that my editor had not stopped functioning reliably after all these years. I might also add that my editor recognizes when a file it has opened has been "touched", as do all modern editors.

I'll probably have to try to create a filter using Encode.pm to pipe these files through.

I was just hoping that those utilities that were already designed to perform just such tasks would | could accomplish this.

Thanks again chromatic, for taking the time to respond.

--chris

#!/usr/bin/perl -Tw
use perl::always;
my $perl_version = "5.12.4";
print $perl_version;
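For what it's worth, the Encode.pm filter mentioned above might look roughly like the sketch below. It is not code from this thread: the `latin1_to_utf8` name is hypothetical, and it assumes the input really is ISO-8859-1. Every byte sequence decodes as Latin-1 without error, so feeding it a file that is already UTF-8 would silently double-encode it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: take raw Latin-1 octets, return UTF-8 octets.
sub latin1_to_utf8 {
    my ($octets) = @_;
    my $chars = decode('ISO-8859-1', $octets);   # bytes -> characters
    return encode('UTF-8', $chars);              # characters -> bytes
}

# Act as a filter only when input is piped or redirected,
# so an interactive run doesn't sit waiting on the terminal.
unless (-t STDIN) {
    binmode STDIN;     # read raw bytes, no I/O layers
    binmode STDOUT;    # write raw bytes, no I/O layers
    local $/;          # slurp the whole input at once
    my $in = <STDIN>;
    print latin1_to_utf8(defined $in ? $in : '');
}
```

Saved under a name of your choosing (say latin1_to_utf8.pl, purely as an example), it would be run as: perl latin1_to_utf8.pl < old.txt > new.txt. Decoding and encoding explicitly, with binmode on both handles, keeps Perl's I/O layers out of the picture, so the only conversion that happens is the one written in the helper.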

In Section Seekers of Perl Wisdom