PerlMonks  

Slashdot over the phone

by beretboy (Chaplain)
on Nov 25, 2001 at 02:33 UTC (#127322, CUFP)

I am a daily reader of Slashdot, and I have been using this script with a developer account at Tellme to get the headlines over the phone when I am away from a computer. To hear the headlines yourself:

1.  Call Tellme at 1-800-555-Tell
2.  At the Tellme Menu prompt, dial 1-39746


The code that gets the headlines is:
    #!/usr/bin/perl
    print "Content-Type: text/xml\n\n";
    print "<vxml version='2.0'>\n";
    print "<form>";
    print "<block>\n";
    open (SLASHDOT, "lynx www.slashdot.org/palm -dump|");
    while (<SLASHDOT>) {
        if (/.*\d\..*|.*AD:.*|.*Stories continued.*|.*Home page.*|References/) {
            print "\n";
        }
        else {
            s/\[.*\]//g;
            print;
            print "."; # added so the headlines don't get slurred together
        }
    }
    print "</block>\n";
    print "</form>\n";
    print "</vxml>";
UPDATE: Removed CGI.pm, not needed

"Sanity is the playground of the unimaginative" -Unknown

Re: Slashdot over the phone
by VSarkiss (Monsignor) on Nov 25, 2001 at 06:26 UTC

    A couple of comments.

    First, why are you bringing in CGI? You're not making any use of it at all. I could see using LWP instead of lynx -dump, but use CGI is just making your program bigger for no reason.

    Second, you're using dot-star unnecessarily in your regexes. In your if statement, you can omit them all and your results will be the same, but it will simplify your code (and make life easier for the regex engine).

    But I do agree that this is a pretty cool idea. jcwren did it for the monastery. Take a look at Stats Whoring - The VXML Way.
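
    The regex point above can be checked with a small sketch. It compares the pattern from the original script against the same alternation with every dot-star removed. The sample lines are made up to resemble a lynx dump and are not real Slashdot output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern from the original script, and the same alternation with the
# leading/trailing .* removed. An unanchored match already succeeds
# anywhere in the line, so both should accept the same lines.
my $original   = qr/.*\d\..*|.*AD:.*|.*Stories continued.*|.*Home page.*|References/;
my $simplified = qr/\d\.|AD:|Stories continued|Home page|References/;

# Hypothetical lines resembling a "lynx -dump" of the Palm page.
my @lines = (
    "   1. http://slashdot.org/\n",
    "   AD: Buy more stuff\n",
    "   [3]Some headline about Perl\n",
    "   Stories continued on next page\n",
    "References\n",
    "\n",
);

my $mismatches = 0;
for my $line (@lines) {
    my $old = $line =~ $original   ? 1 : 0;
    my $new = $line =~ $simplified ? 1 : 0;
    $mismatches++ if $old != $new;
    # "skip" lines are replaced by a blank line; "speak" lines are read out.
    printf "%-5s %s", ($new ? "skip" : "speak"), $line;
}
print "mismatches: $mismatches\n";
```

    If the two patterns ever disagreed on a line, the mismatch count would be non-zero.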

Sounds like a job for XSLT (tan-tan-tan-tan-tanaaa!)
by jaldhar (Vicar) on Nov 28, 2001 at 13:16 UTC
    While looking at this node, I thought to myself: instead of trying to parse HTML dumped from lynx (which is possibly error-prone and requires an external program), why not use the RDF feed that Slashdot provides at http://slashdot.org/slashdot.rdf? And since you would then be converting from one XML DTD to another, isn't this a perfect opportunity to use XSLT? Here's what I came up with:
    #!/usr/bin/perl -w
    use strict;
    #
    # A script to convert slashdot headlines to VXML.
    # (C) 2001, Jaldhar Vyas
    # Licensed under the Crowley Public License ("Do what thou wilt
    # shall be the whole of the license.")
    #
    use LWP::Simple qw(get);
    use XML::LibXML;
    use XML::LibXSLT;

    my $content = get('http://slashdot.org/slashdot.rdf');
    unless (defined ($content)) # undef means something went wrong.
    {
      $content = <<'-EOT-';
    <?xml version="1.0"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <item>
    <title>Failed to retrieve headlines</title>
    </item>
    </rdf:RDF>
    -EOT-
    }

    my $xml = XML::LibXML->new();
    my $xslt = XML::LibXSLT->new()->parse_stylesheet($xml->parse_fh(*DATA));

    print "Content-type: text/xml\n\n",
      $xslt->output_string($xslt->transform($xml->parse_string($content)));

    __DATA__
    <?xml version="1.0"?>
    <xsl:stylesheet xmlns=""
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rss="http://my.netscape.com/rdf/simple/0.9/">

    <xsl:namespace-alias stylesheet-prefix="#default" result-prefix='rss' />

    <xsl:output method="html" media-type="text/xml" indent="yes" />

    <xsl:template match="rdf:RDF">
    <vxml version='2.0'>
      <form><block>
        <xsl:apply-templates select="rss:item/rss:title" />
      </block></form>
    </vxml>
    </xsl:template>

    <xsl:template match="rss:title">
      <xsl:value-of select="." /><xsl:text>. </xsl:text>
    </xsl:template>

    </xsl:stylesheet>
    A problem I ran into was due to XML namespaces. The feed's root element is rdf:RDF, but most of the other tags (item, title and so on) are actually declared in the RSS namespace, so the stylesheet has to match them as rss:item and rss:title. Other than that it was fairly simple and should be a lot more maintainable in the long run.
This cries out for XML::RSS or XSLT {Re: Slashdot over the phone}
by dave_aiello (Pilgrim) on Nov 28, 2001 at 20:23 UTC
    I agree with jaldhar and the comment he made in Sounds like a job for XSLT (tan-tan-tan-tan-tanaaa!), at least conceptually.

    The code in the original post only addresses Slashdot, or at best Slashcode-based sites, and the implementation will break if the Slashcode project authors ever change the way they format data for the Palm OS.

    A more reusable solution would be to use XML::RSS to extract the title element from each item of a site's RSS file and rewrite the data in VoiceXML. This would let you apply the resulting script to any site that produces an RSS file. Syndic8.com knows about over 2,600 such sites or subsites at the moment.
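
    A minimal sketch of the XML::RSS idea, assuming the module is installed. The embedded feed is a made-up stand-in for a downloaded RSS file, and xml_escape is a hypothetical helper, not part of XML::RSS.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::RSS;

# Hypothetical stand-in for a fetched feed; a real script would
# download the site's RSS file (for example with LWP::Simple's get).
my $feed = <<'EOT';
<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example Site</title>
    <link>http://www.example.com/</link>
    <description>Sample feed</description>
    <item>
      <title>Perl &amp; VoiceXML</title>
      <link>http://www.example.com/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.com/2</link>
    </item>
  </channel>
</rss>
EOT

# Hypothetical helper: escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

my $rss = XML::RSS->new;
$rss->parse($feed);

# Collect each item's title, ending with a period so the
# speech engine pauses between headlines.
my @headlines = map { xml_escape($_->{title}) . "." } @{ $rss->{items} };

my $vxml = "<vxml version='2.0'>\n<form><block>\n"
         . join("\n", @headlines)
         . "\n</block></form>\n</vxml>\n";

print "Content-Type: text/xml\n\n", $vxml;
```

    Because nothing here is specific to Slashdot, pointing the script at a different feed should only require changing where the feed text comes from.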

    Another solution would be to build on "Using XSL Transformations to Produce VoiceXML". That's example 118 on the Tellme Studio web site.

    The original solution offered is interesting, but it is only readily understandable by people using UNIX. Although it is possible to run Lynx on PCs and even Macs, nobody I know is doing that. Furthermore, most developers on those platforms do not even know what Lynx is.

    Dave Aiello
    Chatham Township Data Corporation

Re: Slashdot over the phone
by beretboy (Chaplain) on Nov 28, 2001 at 22:09 UTC
    Your suggestions are good, but I hesitate to implement them because I:
    1. dislike XML (for something this simple)
    2. don't even want to mess with RDF

    "Sanity is the playground of the unimaginative" -Unknown
Re: Slashdot over the phone
by rfb (Sexton) on Nov 28, 2001 at 22:46 UTC
    *grumble* stupid tellme making canadians call long distance
Re: Slashdot over the phone
by beretboy (Chaplain) on Dec 10, 2001 at 02:51 UTC
    Sorry I can't do anything about people from outside the US paying more :-(

    "Sanity is the playground of the unimaginative" -Unknown
This is very cool but......
by metadoktor (Hermit) on Dec 28, 2001 at 13:35 UTC
    it's hard to understand. You should add a pause or some text-to-speech that says "Article One blah blah blah", "Article Two blah blah blah", etc., because otherwise it all runs together and is almost incomprehensible.
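
    One way the numbering and pause could look, as a rough sketch. The headlines are invented, numbered_prompts is a hypothetical helper, and the 500ms pause is an arbitrary choice. VoiceXML 2.0 defines a break element for pauses, but whether Tellme's platform honors it exactly this way is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical headlines; the real script would take these from the
# lynx dump or an RSS feed (and would need to XML-escape them).
my @headlines = (
    "Perl 6 design update",
    "New kernel released",
);

# Build one <prompt> per headline: an announcement, the headline text,
# then a pause. The 500ms length is an arbitrary assumption.
sub numbered_prompts {
    my @titles = @_;
    my @prompts;
    my $n = 0;
    for my $title (@titles) {
        $n++;
        push @prompts,
            "<prompt>Article $n. $title. <break time='500ms'/></prompt>";
    }
    return @prompts;
}

print "Content-Type: text/xml\n\n";
print "<vxml version='2.0'>\n<form><block>\n";
print "$_\n" for numbered_prompts(@headlines);
print "</block></form>\n</vxml>\n";
```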

    But it still has a very high cool factor.

    Kudos!

    metadoktor

    "The doktor is in."
