That was easy enough. Thanks! Of course, it still doesn't work, because some characters don't appear to be valid UTF-8.
I wrote this code to try to figure out what encoding the files use, but for any file with the "special" characters I just get the "Didn't work" message.
#!/usr/bin/perl
use strict;
use warnings;
use Encode::Guess;
undef $/; # slurp on
my $dir = '.';
if (@ARGV > 0) {
    $dir = $ARGV[0];
}
opendir my $dh, $dir or die "Can't opendir '$dir': $!\n";
my @files = grep /\.csv$/i, readdir($dh);
closedir $dh;
Encode::Guess->add_suspects(qw(latin1 cp1252)); # What else?
foreach my $file (@files) {
    open my $fh, "<:raw", "$dir/$file"
        or die "Can't open '$dir/$file': $!\n";
    my $data = <$fh>; # whole file, since $/ is undef
    close $fh;
    my $enc = guess_encoding($data);
    if (ref $enc) {
        print "$file: " . $enc->name . "\n";
    } else {
        print "Didn't work for: $file\n";
    }
}
exit;
The files were generated on Windows by exporting from Outlook. Windows is set up for American English, but the keyboard is Danish. :-/ All the files that DO work are reported with "ascii" encoding (as I expect).
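
One thing that might narrow this down: when guess_encoding fails it returns a plain string describing the failure instead of an object, and the else branch in my script throws that string away. Below is a minimal sketch that prints it. The helper name describe_guess and the sample data are made up for illustration, and I am assuming the real files look roughly like the Danish sample. Since latin1 and cp1252 both accept almost any byte, listing both as suspects may leave the guesser unable to pick one, so the sketch also tries latin1 alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode::Guess;

# describe_guess is a hypothetical helper, not part of Encode::Guess.
# It returns the encoding name on success, or the diagnostic string
# that guess_encoding returns in place of an object on failure.
sub describe_guess {
    my ($bytes, @suspects) = @_;
    my $enc = guess_encoding($bytes, @suspects);
    return ref $enc ? $enc->name : "no guess ($enc)";
}

# Made-up sample data; \x{F8} is the byte for "ø" in both latin1 and cp1252.
my $plain  = "Name,City\n";
my $danish = "Name,City\nS\x{F8}ren,K\x{F8}benhavn\n";

print describe_guess($plain), "\n";                      # ascii
print describe_guess($danish, 'latin1'), "\n";           # iso-8859-1
print describe_guess($danish, qw(latin1 cp1252)), "\n";  # a "no guess (...)" diagnostic
```

The exact wording of the diagnostic seems to depend on the Encode version, so I would not match on it. If the ambiguity is what is happening here, passing a single single-byte suspect per call (cp1252 is probably the better bet for an Outlook export on Windows) should at least get a definite answer.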