<?xml version="1.0" encoding="windows-1252"?>
<node id="592810" title="Re: HTML::Strip and UTF8 -- is there some way I can just skip all the &quot;UTF8 only&quot; entities?" created="2007-01-03 13:45:50" updated="2007-01-03 08:45:50">
<type id="11">
note</type>
<author id="396583">
tphyahoo</author>
<data>
<field name="doctext">
Answering my own question (partially), I think I have to do something along the lines of 
&lt;p&gt;
&lt;code&gt;
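use strict;
use warnings;
use Encode;

# A character string (not UTF-8 bytes): U+2019 RIGHT SINGLE QUOTATION MARK.
my $utf8String = "\x{2019}";

my $latin1String = latin1ify($utf8String);
print "$latin1String\n";

# Encode a character string as Latin-1. Characters that Latin-1 cannot
# represent come out as "?" (Encode's default substitution).
sub latin1ify {
    my $string = shift;
    $string = "" unless defined $string;
    # If the input were raw UTF-8 bytes, it would need Encode::decode_utf8() first.
    return Encode::encode( "iso-8859-1", $string );
}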
&lt;/code&gt;


&lt;p&gt;
which prints "?" for any character that Latin-1 can't represent, and then strip the question marks afterwards.
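&lt;p&gt;
One catch with stripping "?" after the fact is that it also removes question marks that were in the text to begin with. A minimal sketch of an alternative, assuming the input is already a decoded character string: delete everything above code point 0xFF before encoding, so nothing ever turns into "?". The helper name latin1_skip_unmappable is made up for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: drop every character that Latin-1 cannot represent,
# then encode what is left. Genuine "?" characters in the input survive.
sub latin1_skip_unmappable {
    my $string = shift;
    $string = "" unless defined $string;

    # Latin-1 covers code points 0x00 through 0xFF; delete everything else.
    $string =~ s/[^\x{00}-\x{FF}]//g;

    return Encode::encode( "iso-8859-1", $string );
}

print latin1_skip_unmappable("Isn\x{2019}t it?"), "\n";   # prints "Isnt it?"
```

The curly apostrophe is dropped and the real question mark is kept. If the data arrives as raw UTF-8 bytes instead, it would presumably need a decode step before this helper is called.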
&lt;p&gt;
But I have to go now, so I'll finish this another time.</field>
<field name="root_node">
592806</field>
<field name="parent_node">
592806</field>
</data>
</node>
