PerlMonks
Unmangle RSS encodings
by qq (Hermit)
on Jun 30, 2004 at 20:30 UTC ( [id://370892]=perlquestion )
qq has asked for the wisdom of the Perl Monks concerning the following question:

OT, but I will solve it with Perl if I solve it at all. Is there any way to make sense of an RSS file that's been incorrectly encoded? The following line appears in this feed.
Which should read: "India A's Zimbabwe tour"

This being RSS with no encoding specified, it's officially UTF-8. But it's obviously been HTML-encoded at some point. Can anybody explain to me how to figure out what happened to the string above? Can it be undone? Or, as is my current inclination, should I just exclude any feeds/items that I find HTML entities in? I've a lot of feeds to parse, so I need a solution that doesn't require hand-examining each feed.

thanks again,
qq
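
For illustration, here is a minimal sketch of the two options weighed above: undoing stray HTML encoding automatically, or just flagging items that contain entities so they can be excluded. It assumes the HTML::Entities module from CPAN is installed. The sub names `unmangle` and `has_entities` and the sample string are hypothetical (the sample is not the actual line from the feed), and the sketch only covers text that was entity-encoded one or more times too many, not other kinds of mangling.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Decode HTML entities repeatedly until the text stops changing.
# $max_passes (default 3) caps the number of decoding rounds.
sub unmangle {
    my ($text, $max_passes) = @_;
    $max_passes = 3 unless defined $max_passes;
    for (1 .. $max_passes) {
        # In non-void context decode_entities returns a decoded copy.
        my $decoded = decode_entities($text);
        last if $decoded eq $text;    # nothing left to decode
        $text = $decoded;
    }
    return $text;
}

# True if the text still contains something that looks like an entity,
# e.g. &#39; or &#x2019; or &amp;. Useful for flagging or skipping items.
sub has_entities {
    my ($text) = @_;
    return $text =~ /&(?:#\d+|#x[0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);/ ? 1 : 0;
}

# Hypothetical double-encoded title, as it might look after XML parsing.
my $title = 'India A&amp;#39;s Zimbabwe tour';
print unmangle($title), "\n" if has_entities($title);
```

One caveat: decoding until nothing changes will also rewrite text that was meant to show a literal entity (a post about HTML that mentions `&amp;amp;`, say), so it may be safer to apply this only to feeds already known to be over-encoded.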