by ajt (Prior)
on Mar 24, 2003 at 21:40 UTC

in reply to XML::RSS


Malformed XML is the bane of RSS. According to Mark Pilgrim, about 10% of typical RSS feeds are malformed*. Indeed, the UK IT publication The Register has usable XML for only a few days in a given month.

You will find a wide range of problems that will cause XML::Parser, the core of XML::RSS, to explode:

  • Data encoded in one format but declared as another (or not declared at all, so the parser assumes the default UTF-8).
  • Junk before the XML declaration. The CMS Vignette tends to do this, and it's popular with big companies.
  • Badly nested tags. The CMS is sloppy about checking well-formedness, so broken markup comes out of it and goes into the RSS feed.
  • Improperly escaped ampersands and entities are a very common problem too.
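
As a rough illustration of two of the problems above (junk before the XML declaration and unescaped ampersands), here is a minimal sketch in core Perl. The sub name preclean_feed and the sample feed are made up for this example. This is not the code from rssmirror or XML::RSS::Tools, just one way such a pre-cleaning pass might look. It does nothing about encoding mismatches or badly nested tags, it will also rewrite ampersands inside CDATA sections, and undefined named entities such as &nbsp; will still upset the parser.

```perl
use strict;
use warnings;

# Hypothetical helper: light pre-cleaning of a feed before parsing.
sub preclean_feed {
    my ($xml) = @_;

    # Drop anything in front of the "<?xml" declaration; if there is no
    # declaration at all, drop leading text up to the first "<".
    $xml =~ s/\A.*?(?=<\?xml\b)//s
        or $xml =~ s/\A[^<]+//;

    # Escape every ampersand that does not already start a named or
    # numeric entity reference (&name; &#38; &#x26;).
    $xml =~ s/&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;

    return $xml;
}

# Made-up sample: a CMS comment before the declaration and a bare ampersand.
my $dirty = qq{\n<!-- generated by CMS -->\n<?xml version="1.0"?>\n}
          . qq{<rss version="0.91"><channel>}
          . qq{<title>Fish & Chips &amp; Peas</title>}
          . qq{</channel></rss>};

print preclean_feed($dirty), "\n";
```

You would then hand the cleaned string to your parser inside an eval block, since a feed that is still broken after this will make it die just the same.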

In this node "How do I clean RSS feeds to make them usable?", Matts suggested his rssmirror, the guts of which are now included in both XML::RSS and XML::RSS::Tools.

I became so annoyed with bad XML in RSS feeds that I wrote XML::RSS::Tools to deal with the problems I found. That led to brian d foy taking over XML::RSS, fixing a lot of its problems and, with time, designing a whole new version.

See also:

Good Luck!

* Parsing RSS At All Costs