PerlMonks  

Prlmnks.org code ready to be played with.

by EvdB (Deacon)
on Nov 14, 2005 at 22:02 UTC ( #508424=monkdiscuss )

Hello - ages ago I wrote prlmnks.org, a little site that lets you get RSS feeds for perlmonks.org proper. I have long intended to make the code available for anyone and everyone to play with. Indeed, I even created a project on SourceForge but then ran screaming at the horror that is their user interface...

Anyway, the code is now in Subversion on my server (?on subversion in my server?): http://svn.ecclestoad.co.uk/svn/prlmnks/trunk. It is currently read-only but I am itching to let others go to work on it and improve it. Currently there is a bug to do with bad XML that crashes the scraping daemon (try DEBUG=1 ./fetch_node.pl 500633). I've also had reports that it produces bad XML. Both are probably encoding problems.

Unfortunately I really don't have time to look after it properly - please help! If you want to hack at the code then please let me know and I'll add you to the allowed users so you can commit. Please excuse the lack of docs - I wrote it in a hurry.

Re: Prlmnks.org code ready to be played with.
by BUU (Prior) on Nov 15, 2005 at 11:20 UTC
    This is more of a prlmnks.org question, but how do you get 'newest nodes' from prlmnks? I see the various sections, and something called 'all nodes', but no 'newest nodes'?
        XML Parsing Error: undefined entity
        Location: http://prlmnks.org/rss/top.xml
        Line Number 878, Column 21:
        $thr1 = new Thread \&mysub;
        --------------------^
        Heh, I'll take a look tomorrow and see what can be done.
Re: Prlmnks.org code ready to be played with.
by jdporter (Canon) on Jun 15, 2006 at 19:21 UTC
    bad XML

    This is still happening. It's because node content is being embedded raw in the XML document. Many nodes have stuff in them that hoses XML. For example, this: </channel>. In effect, you're doing the same thing wrong that our on-site RSS feed generator does, or did, as discussed in "rss feed corrupted by certain nodes".
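
To make the diagnosis above concrete, here is a minimal sketch of escaping node text before it goes into the feed. It is written in Python purely for illustration and is not prlmnks.org's actual (Perl) code; the function names `rss_item` and `rss_feed` and the sample node bodies are hypothetical, modelled on the two failures reported in this thread.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def rss_item(title, body):
    """Build one RSS <item>, escaping &, < and > in the text fields."""
    return (
        "<item>"
        f"<title>{escape(title)}</title>"
        f"<description>{escape(body)}</description>"
        "</item>"
    )


def rss_feed(items):
    """Wrap the escaped items in a minimal RSS 2.0 channel (illustrative only)."""
    body = "".join(rss_item(title, text) for title, text in items)
    return f'<rss version="2.0"><channel><title>demo</title>{body}</channel></rss>'


# Hypothetical node bodies: one with a bare "&", one with a stray closing tag.
nodes = [
    ("threads", r"$thr1 = new Thread \&mysub;"),
    ("rss", "For example, this: </channel>."),
]

feed = rss_feed(nodes)

# Parsing raises xml.etree.ElementTree.ParseError if the feed is not well-formed.
parsed = ET.fromstring(feed)
descriptions = [d.text for d in parsed.iter("description")]
```

After escaping, a reader's XML parser hands back the original text unchanged, so nothing is lost from the node content. Wrapping the body in a CDATA section is another common approach, but it needs separate handling for any `]]>` inside the node, so plain escaping is the simpler choice for a sketch.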

    I'm also sad that your cache of pm nodes never gets updated. This means you never catch changes such as nodes getting retitled, moved to another section, or reaped.

    It also appears that you don't feed certain nodes, or types of nodes. For example, the Monastery Gates is not fed at all.

    Update: It also appears that prlmnks.org stopped updating its node cache from PerlMonks in October of 2006.

    We're building the house of the future together.

Node Type: monkdiscuss [id://508424]
Approved by GrandFather