Re^2: UTF-8 to Latin1 - unmatched characters?

by uncommon13 (Novice)
on Mar 27, 2008 at 15:08 UTC (#676753=note)

in reply to Re: UTF-8 to Latin1 - unmatched characters?
in thread UTF-8 to Latin1 - unmatched characters?

Thanks, Sam. I used this; however, it also converts the other valid Latin-1 characters to ASCII.

So, I found this which converts non-matched UTF-8 characters to something:

So basically, the code would be something like:

# Converted UTF codes for non-matching ISO-8859-1
# Strip it down to basic ASCII
%utf_entity = (
    "\x{2019}", "'",
    "\x{201c}", '"',
    "\x{201d}", '"',
    "\x{2026}", "...",
    "\x{fffd}", "",
);

s/(\X)/ exists $utf_entity{$1} ? $utf_entity{$1} : $1 /eg;
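
For completeness, here is a rough end-to-end sketch of the table approach, assuming the input arrives as UTF-8 bytes and the core Encode module is available. The helper name utf8_to_latin1 and the sample string are made up for illustration; only the table itself comes from the snippet above.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Replacement table from the snippet above: characters with no
# ISO-8859-1 equivalent, mapped to ASCII stand-ins.
my %utf_entity = (
    "\x{2019}" => "'",
    "\x{201c}" => '"',
    "\x{201d}" => '"',
    "\x{2026}" => '...',
    "\x{fffd}" => '',
);

# Hypothetical helper: takes UTF-8 bytes, returns ISO-8859-1 bytes.
sub utf8_to_latin1 {
    my ($utf8_bytes) = @_;

    # Decode first so the regex sees characters, not bytes.
    my $text = decode('UTF-8', $utf8_bytes);

    # Swap only what is listed in the table; everything else,
    # including valid Latin-1 characters such as e-acute, is kept.
    $text =~ s/(\X)/ exists $utf_entity{$1} ? $utf_entity{$1} : $1 /eg;

    # Anything still outside Latin-1 becomes '?' (Encode's default).
    return encode('iso-8859-1', $text);
}

# Sample input (UTF-8 bytes): cafe with e-acute, curly quotes, ellipsis.
my $latin1 = utf8_to_latin1("caf\xC3\xA9 \xE2\x80\x9Cquoted\xE2\x80\x9D\xE2\x80\xA6");
```

The drawback is that every unmatched character has to be listed by hand; anything missing from the table silently turns into a question mark at the encode step.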

Replies are listed 'Best First'.
Re^3: UTF-8 to Latin1 - unmatched characters?
by ikegami (Pope) on Mar 27, 2008 at 17:41 UTC
    I was going to recommend using an encoding fallback handler so that only the characters that don't exist in iso-latin-1 get passed to unidecode. It works, but I get an error (Close with partial character.) when the file handle is closed, and I have no idea how to fix it.

    Here's the code anyway:

    use strict;
    use warnings;

    use PerlIO::encoding qw( );
    use Text::Unidecode  qw( unidecode );

    use constant FB_UNIDECODE => sub { unidecode(chr($_[0])) };

    my $file = '...';

    local $PerlIO::encoding::fallback = FB_UNIDECODE;
    open(my $fh, '>:encoding(iso-8859-1)', $file)
        or die("Unable to create file \"$file\": $!\n");

    print $fh "abc\x{201C}def\x{2013}ghi";
      Dear ikegami,

      This is exactly what I wanted :)

      It's absolutely brilliant to think about using the fallback handler.

      I don't get the error (Close with partial character.) that you mentioned, though.

      Many thanks again :)
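
The fallback idea in this thread can also be tried on a string in memory, without going through a file handle at all. The following is only a sketch, assuming Encode 2.12 or later, where the CHECK argument of encode() may be a code reference that receives the ordinal of each unmappable character. The %ascii_for table is a made-up stand-in for Text::Unidecode so that the example runs with core modules only.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hypothetical stand-in for unidecode(): a tiny table of ASCII
# replacements keyed by code point. Not part of any module.
my %ascii_for = (
    0x2013 => '-',
    0x201C => '"',
    0x201D => '"',
);

# Encode calls this only for characters that iso-8859-1 cannot
# represent, passing the character's ordinal value.
my $fallback = sub {
    my ($ord) = @_;
    return exists $ascii_for{$ord} ? $ascii_for{$ord} : '?';
};

# e-acute (U+00E9) exists in Latin-1, so it never reaches the fallback.
my $text   = "abc\x{201C}d\x{E9}f\x{2013}ghi";
my $latin1 = encode('iso-8859-1', $text, $fallback);
```

With Text::Unidecode installed, the body of the fallback could presumably be replaced by unidecode(chr($ord)), as in the handler posted above. The resulting bytes can then be written to a file handle opened with the :raw layer.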
