Laziness through CPAN: Screen-scrape to RSS with 3 Modules
by crashtest (Curate) on Dec 21, 2009 at 02:44 UTC
I'm not sure this is a cool use for Perl, but once again, I am astounded by how easy the easy things really are. The script below is one of those where you're almost surprised when you're done writing it. "That's it?", you ask yourself. Yes, that's it.
Here's the background: there's a certain trail race that I'd like to run, but there are always more applicants than slots, so the organizers have resorted to a lottery system to pick entrants this year. Unfortunately, I didn't get in, but I am in the top 25 on the wait list.
The lottery winners have until midnight tonight to pay their entry fee - otherwise the wait-listed people move into their slots. On the lottery page, it is clearly indicated who has, and who hasn't, paid their entry fee yet. Now I could obsessively sit at my computer, refresh the page every five minutes and count the "Not Paid" entrants... or I could be obsessive and lazy, and enlist Perl for help.
With just three use directives, I'm in business:
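Going by the three modules the script relies on (LWP::UserAgent, HTML::TableParser and XML::RSS), those directives would presumably look like this; the `strict` and `warnings` lines are an assumption added out of habit and do not count toward the three:

```perl
use strict;      # assumption: not one of the three, just good hygiene
use warnings;

use LWP::UserAgent;      # fetch the lottery page
use HTML::TableParser;   # walk the rows of the entrant table
use XML::RSS;            # write the feed
```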
And now, in 50 non-optimized lines, I can write a script that screen-scrapes the web page (using LWP::UserAgent), counts the people who've paid and those who haven't (via HTML::TableParser), then prints a simple RSS file (with XML::RSS) to a web-accessible spot that I've added to my news reader (Google Reader).
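For anyone who wants to try the same trick, here is a minimal sketch of that scrape, parse, post flow. It is not the original script. The sub names, the feed titles and the command-line arguments are all hypothetical, and it assumes each entrant is one table row with a cell whose text is exactly "Paid" or "Not Paid", which may not match the real page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use LWP::UserAgent;
use HTML::TableParser;
use XML::RSS;

# Scrape: fetch the page, dying with the HTTP status line on failure.
sub fetch_page {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new( timeout => 30 );
    my $res = $ua->get($url);
    die "Fetch failed: " . $res->status_line . "\n" unless $res->is_success;
    return $res->decoded_content;
}

# Parse: tally table rows by payment status.
# Assumption: a row counts as "Not Paid" or "Paid" if one of its cells
# holds exactly that text; rows with neither are ignored.
sub count_paid {
    my ($html) = @_;
    my %count = ( paid => 0, not_paid => 0 );

    # Row callback; HTML::TableParser passes (table id, line number, cells, udata).
    my $on_row = sub {
        my ( $id, $line, $cols ) = @_;
        if ( grep { defined($_) && /^\s*not\s+paid\s*$/i } @$cols ) {
            $count{not_paid}++;
        }
        elsif ( grep { defined($_) && /^\s*paid\s*$/i } @$cols ) {
            $count{paid}++;
        }
    };

    # id => 'DEFAULT' applies the callback to every table on the page.
    my $parser = HTML::TableParser->new(
        [ { id => 'DEFAULT', row => $on_row } ],
        { Decode => 1, Trim => 1, Chomp => 1 },
    );
    $parser->parse($html);
    $parser->eof;
    return %count;
}

# Post: build a one-item RSS 2.0 feed summarising the counts.
sub build_rss {
    my ( $url, %count ) = @_;
    my $rss = XML::RSS->new( version => '2.0' );
    $rss->channel(
        title       => 'Race lottery status',        # hypothetical title
        link        => $url,
        description => 'Paid vs. unpaid lottery winners',
    );
    $rss->add_item(
        title       => "$count{not_paid} not paid, $count{paid} paid",
        link        => $url,
        description => 'Checked at ' . scalar(gmtime) . ' UTC',
    );
    return $rss->as_string;
}

# Runs only when given a page URL and an output path on the command line.
if ( @ARGV == 2 ) {
    my ( $url, $out ) = @ARGV;
    my %count = count_paid( fetch_page($url) );
    open my $fh, '>', $out or die "Cannot write $out: $!\n";
    print {$fh} build_rss( $url, %count );
    close $fh or die "Cannot close $out: $!\n";
}
```

Keeping the three stages in separate subs means the parsing can be checked against a saved copy of the page without hitting the site. Some feed readers may treat an item with an unchanged link as already seen, so a per-run guid might be worth adding.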
The script is scheduled via cron. Since I can check my news reader on my phone, I am free to walk around, eat dinner etc. while tracking something I have absolutely no control over. Perfect!
I've done something like this before, in order to track the waiver wire in a fantasy league. But I am struck by how easy this really was, and totally worthwhile even though I can put this script in the trash after midnight.
I've also thought that this basic process - scrape -> parse -> post - can be implemented in thousands of ways using many other tools and technologies. Have other monks done similar things in the past? How would you have approached my problem?