Re: XML::Twig modify data, and I don't want that (preserve entities keep_encoding)

by Anonymous Monk
on Sep 16, 2013 at 11:05 UTC

in reply to XML::Twig modify data, and I don't want that


That's called an "entity", so try: perldoc XML::Twig | grep entity


Re^2: XML::Twig modify data, and I don't want that (preserve entities keep_encoding)
by physi (Friar) on Sep 16, 2013 at 11:56 UTC
    Thanks, but I can't figure out how this helps.
    Or is the short answer "It's not possible!"?
    --the good, the bad and the physi--


      I believe the AM means that you can use the keep_encoding option to retain your original string data.

      However, I think the problem is that you're specifying the encoding as ISO-8859-1. If I understand properly, that tells XML::Twig that there's no Unicode data in your input. However, you have an entity in there. Since the ampersand is special, XML::Twig is properly escaping it so that a later decode generates the proper output.

      If you specify a Unicode encoding, I expect that XML::Twig will then read the string as a Unicode character and emit it properly when it rewrites the file.
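
One way to check what keep_encoding actually does is to round-trip the same input with and without the option and compare the two results. The following is only a minimal sketch: the sample document and the roundtrip helper are made up for illustration and are not taken from the original post.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample input (not from the original post): an ISO-8859-1
# document containing a numeric character reference for "u umlaut".
my $xml = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n}
        . qq{<doc><name>M&#252;ller</name></doc>\n};

# Parse the sample with the given XML::Twig options and return the
# re-serialized document as a string. "roundtrip" is an illustrative name.
sub roundtrip {
    my (%opts) = @_;
    my $twig = XML::Twig->new(%opts);
    $twig->parse($xml);
    return $twig->sprint;
}

my $default = roundtrip();
my $kept    = roundtrip(keep_encoding => 1);

# The default result may contain decoded (wide) characters.
binmode STDOUT, ':utf8';
print "default:       $default\n";
print "keep_encoding: $kept\n";
```

Comparing the two printed lines for your own input should show whether the option preserves the text the way you need.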

      Disclaimer: What I know about Unicode you could write on a pinhead with lipstick.


      When your only tool is a hammer, all problems look like your thumb.

        Thanks roboticus.

        Same misunderstanding of Unicode on my side :-)
        But even if I delete the encode line, I get the same result. I think this might end in not using keep_encoding and then substituting the resulting umlauts with their encode_entities_numeric values :-(
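
A rough sketch of that fallback: take the serialized output and replace every non-ASCII character with a numeric character reference. encode_entities_numeric from HTML::Entities is the function named above. To stay dependency-free, this sketch swaps in a plain regex substitution instead, and the helper name to_numeric_entities and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Replace every character outside the ASCII range with a hexadecimal
# numeric character reference. Markup characters such as < and & are
# left alone, so this is meant for already-serialized XML.
# "to_numeric_entities" is an illustrative name, not a library function.
sub to_numeric_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord $1)/ge;
    return $text;
}

# Hypothetical serialized output containing a decoded "u umlaut" (U+00FC).
my $serialized = "<name>M\x{FC}ller</name>";

print to_numeric_entities($serialized), "\n";   # <name>M&#xFC;ller</name>
```

The input must be a decoded character string, not raw UTF-8 bytes, otherwise each byte of a multi-byte character would be turned into its own reference.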

        --the good, the bad and the physi--

