PerlMonks  

Re^4: Is there a simple way to archive/download all of PerlMonks?

by marto (Cardinal)
on Apr 29, 2024 at 09:36 UTC ( [id://11159142]=note )


in reply to Re^3: Is there a simple way to archive/download all of PerlMonks?
in thread Is there a simple way to archive/download all of PerlMonks?

I didn't check all of the domains, no. I think the only valid archive would be an up-to-date database extract of node content (and some of the other metadata), rather than partial snapshots of page impressions from a moment in time.

Update: I seem to recall different domains having different robots.txt rules to impact indexing.
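
For anyone who wants to check how the rules differ per domain, here is a rough sketch using Python's standard `urllib.robotparser`. The host names and rule sets in `SAMPLE_RULES` are made up for illustration and are not the real PerlMonks robots.txt files; in practice you would fetch each domain's file and pass its text in. `allowed` and `crawlable_hosts` are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Made-up rule sets for two placeholder hosts (assumption, not real data).
SAMPLE_RULES = {
    "www.example.org": "User-agent: *\nDisallow:\n",    # nothing disallowed
    "www.example.com": "User-agent: *\nDisallow: /\n",  # everything disallowed
}


def crawlable_hosts(rules, path="/?node_id=1"):
    """List the hosts whose rules permit fetching `path`."""
    return [
        host
        for host, text in rules.items()
        if allowed(text, f"https://{host}{path}")
    ]
```

Running `crawlable_hosts` over the real files would show at a glance which domains a well-behaved crawler (or an archive service) is permitted to index.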

Re^5: Is there a simple way to archive/download all of PerlMonks?
by LanX (Saint) on Apr 29, 2024 at 13:15 UTC
    Otherwise I'd poll the XML per node.

    If edits are/were reflected in the timestamps of the HTTP headers, this could also be quite an efficient way to fetch updates.
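
A minimal sketch of that polling idea, written in Python with only the standard library. It assumes the per-node XML view is reachable at the `NODE_XML_URL` pattern below and that the server honours `If-Modified-Since` with a `304 Not Modified` reply; neither is confirmed in this thread. The names `build_request` and `fetch_if_changed` are hypothetical.

```python
import urllib.error
import urllib.request

# Assumption: URL pattern for the XML view of a single node.
NODE_XML_URL = "https://www.perlmonks.org/?node_id={node_id};displaytype=xml"


def build_request(node_id, last_modified=None):
    """Build a GET for one node's XML, conditional if we hold a timestamp."""
    req = urllib.request.Request(NODE_XML_URL.format(node_id=node_id))
    if last_modified:
        # Ask the server to skip the body if nothing changed since then.
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_if_changed(node_id, last_modified=None, opener=urllib.request.urlopen):
    """Return (body, last_modified); body is None when the server says 304."""
    req = build_request(node_id, last_modified)
    try:
        with opener(req) as resp:
            return resp.read(), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Unchanged: keep the timestamp we already had.
            return None, last_modified
        raise
```

Store the returned `Last-Modified` value next to each node and pass it back in on the next run; unchanged nodes then cost one small request instead of a full download. The `opener` parameter is only there so the function can be exercised without touching the network.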

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery
