I'm certainly not claiming that this is the best way to do it (I'm not really claiming anything at all; even I don't use this anymore), but I once put together an example of how to deal with either XML or HTML (via HTML::TableExtract) that worked pretty well for the task and the time:
PerlMonks::StatsWhore.
As I said, that whole effort has long since fallen by the wayside. I'm curious to see what you come up with for layering a common interface over the different retrieval/parsing methods on the back end.
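
For illustration only, here is a minimal sketch of what that kind of layering could look like. It is not what StatsWhore did. The names %backend and fetch_stats, the "xp" stat, and the toy input strings are all made up for the example, and the regexes are just stand-ins for a real XML parser and for HTML::TableExtract:

```perl
use strict;
use warnings;

# Hypothetical back ends: each takes raw text and returns a hash ref of
# stat name => value. The regexes are stand-ins for a real XML parser and
# for HTML::TableExtract; they only handle the toy input used below.
my %backend = (
    xml => sub {
        my ($text) = @_;
        my %stats;
        while ($text =~ m{<stat name="([^"]+)">([^<]*)</stat>}g) {
            $stats{$1} = $2;
        }
        return \%stats;
    },
    html => sub {
        my ($text) = @_;
        my %stats;
        while ($text =~ m{<tr><td>([^<]+)</td><td>([^<]*)</td></tr>}g) {
            $stats{$1} = $2;
        }
        return \%stats;
    },
);

# The common interface: callers name a format and never see which
# parser did the work.
sub fetch_stats {
    my ($format, $text) = @_;
    my $parse = $backend{$format}
        or die "No back end for format '$format'\n";
    return $parse->($text);
}

# Made-up sample data in both formats.
my $from_xml  = fetch_stats(xml  => '<stats><stat name="xp">120</stat></stats>');
my $from_html = fetch_stats(html => '<table><tr><td>xp</td><td>120</td></tr></table>');

print "xml:  xp = $from_xml->{xp}\n";
print "html: xp = $from_html->{xp}\n";
```

The point of the dispatch table is that adding another retrieval method means adding one more entry to %backend, and nothing on the calling side has to change.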
Cheers,
Matt