I thought of doing that myself as well. You could of course set up a cron job to just fetch the Selected Best Nodes to a timestamped file. I've set up the following cron job to archive that page into an SQLite database. It then prints out a reputation-sorted list of what it has already archived. That way, after a while, I'll have my own Top 5000 list.
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;
use LWP::Simple;
use DBI;

my $db_file = "best_nodes";
my $pm_site = "http://perlmonks.org/index.pl?node_id=%d";

# Decide whether to create the table before connecting creates the file.
my $make_table = ! -f $db_file;
my $dbh = DBI->connect("dbi:SQLite:dbname=$db_file", "", "")
    or die "Can't connect to db: $DBI::errstr";
$dbh->do( qq[
    create table nodes (
        id      int unique,
        title   varchar(255),
        auth_id int,
        author  varchar(255),
        rep     int
    )
]) if $make_table;

# Fetch the page to archive; bail out if the download fails.
my $html = get( sprintf $pm_site, 328478 )
    or die "Can't fetch the page";
my $te = HTML::TableExtract->new(
    headers   => [ qw/Node Author Rep/ ],
    keep_html => 1
);
$te->parse($html);

foreach my $row ($te->rows) {
    my ($node, $author, $rep) = @$row;
    # keep_html leaves the links in the cells, so pull ids and text out of them.
    my ($id)      = $node   =~ /\?node_id=(\d+)/;
    my ($auth_id) = $author =~ /\?node_id=(\d+)/;
    ($rep)        = $rep    =~ /(\d+)/;
    my ($title)   = $node   =~ m{>(.+?)</a>$};
    ($author)     = $author =~ m{>(.+?)</a>$};
    next unless defined $id;    # skip rows that don't link to a node

    # Replace any earlier copy so the stored reputation stays current.
    $dbh->do("delete from nodes where id=?", undef, $id);
    $dbh->do("insert into nodes values (?,?,?,?,?)", undef,
        $id, $title, $auth_id, $author, $rep);
}

# Write out everything archived so far, highest reputation first.
my $sth = $dbh->prepare( qq[
    select id,title,auth_id,author,rep from nodes order by rep desc
]);
$sth->execute;
open my $fh, ">", "bestnodes.html" or die "Can't write bestnodes.html: $!";
print $fh "<table>\n";
while (my ($id, $title, $auth_id, $author, $rep) = $sth->fetchrow_array) {
    $id      = sprintf $pm_site, $id;
    $auth_id = sprintf $pm_site, $auth_id;
    print $fh qq[
<tr><td><a href="$id">$title</a></td>
<td><a href="$auth_id">$author</a></td>
<td>$rep</td></tr>
];
}
print $fh "</table>\n";
close $fh or die "Can't close bestnodes.html: $!";
Incidentally, this is my first experience with HTML::TableExtract, and it's just perfect for this job. Maybe I'll post the best nodes archive on my homepage once it gets big enough.
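For the simpler approach mentioned at the top (just fetching the page to a timestamped file from cron), a minimal sketch might look like the following. The file naming scheme, the archive_name and fetch_snapshot helpers, and the directory argument are all made up for illustration; only the URL comes from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Same page the script above fetches.
my $url = "http://perlmonks.org/index.pl?node_id=328478";

# Hypothetical naming scheme: bestnodes-YYYYMMDD-HHMMSS.html, in UTC.
sub archive_name {
    my ($epoch) = @_;
    return strftime("bestnodes-%Y%m%d-%H%M%S.html", gmtime $epoch);
}

# Fetch the page into $dir under a timestamped name and return the path.
sub fetch_snapshot {
    my ($dir) = @_;
    require LWP::Simple;    # loaded only when a fetch is actually requested
    my $file   = "$dir/" . archive_name(time);
    my $status = LWP::Simple::getstore($url, $file);
    die "Fetch failed with HTTP status $status\n" unless $status == 200;
    return $file;
}

# Assumed invocation from cron: script.pl /path/to/archive
print "Saved ", fetch_snapshot($ARGV[0]), "\n" if @ARGV;
```

Because the timestamp sorts lexically, a plain directory listing shows the snapshots in the order they were taken.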
Believe it or not, I actually already coded XML support into the patches I wrote for the Best/Selected nodes. While the Other Users XML ticker uses this, the other pages don't. (Yes, internally other users are technically "picked_nodes".) But the support is there, and if the gods are amenable and I get the tuits to add the rest of the code, then you'll be able to get an XML feed of this data instead of scraping the HTML for it.
I meant to release the final patches for the XML stuff right after the normal HTML changes went live, but I guess I got a bit distracted with other things. Sorry. :-)
---
demerphq
First they ignore you, then they laugh at you, then they fight you, then you win.
-- Gandhi
Go to tilly, click on the write-ups, sort by "highest reputation first". Even tilly hasn't written more than 100 of the best 500 nodes :)
What would be the point? They're randomly selected.
--
I'm not belgian but I play one on TV.
On my 18th birthday (legal gambling age), I went into a betting shop and made 2 bets. I had to do something "new" to celebrate that special birthday, but I'd previously celebrated my 16th in a pub with my girlfriend and the landlord, so drinking was passé.
The first bet was 25p each-way. It came in first and I tripled my stake.
The second, I bet my winnings, 50p on-the-nose. It came in second. I was even.
That was the last time I bet on a horse. I'm a gambler that never lost any money:)
I've never yet bought a lottery ticket.
I've enjoyed fruit machines occasionally, but I bore easily, and with modern ones you don't even have the fun of pulling the handle.
The "point" was I just wanted to know if it existed.
The reason is, a few days ago I followed a link (from one of tilly's posts) to the Selected Best Nodes and perused a few of them.
Today, I encountered a problem that I think one of the nodes I looked at there would answer. The problem is, I cannot remember what the node was called, nor its author, and searching turned up too many hits for all the keywords I could think of.
So, I know I could find tilly's post with the link again, and that would give me the day and time. So then, *if* the list was being archived, I could have gone to that day's list and stood a much better chance of re-locating the post I was looking for.
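To make the "go to that day's list" idea concrete, here is a rough sketch of the lookup. It assumes an archive of snapshot files named bestnodes-YYYYMMDD-HHMMSS.html, which is a hypothetical scheme (no such archive exists as far as this thread goes), and snapshots_for_day is an illustrative helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the snapshot files taken on a given day (YYYYMMDD), oldest first.
# Assumes the hypothetical naming scheme bestnodes-YYYYMMDD-HHMMSS.html.
sub snapshots_for_day {
    my ($day, @files) = @_;
    my @hits = sort grep { /\bbestnodes-\Q$day\E-\d{6}\.html$/ } @files;
    return @hits;
}

# Assumed invocation: script.pl YYYYMMDD /path/to/archive
if (@ARGV == 2) {
    my ($day, $dir) = @ARGV;
    print "$_\n" for snapshots_for_day($day, glob("$dir/bestnodes-*.html"));
}
```

With the day and time from the referring post in hand, the matching snapshots would be a short list to skim rather than a keyword search.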
As it is, I can't. No matter, I'll find it eventually.
I don't suppose you checked your browser's history?
--
I'm not belgian but I play one on TV.
Maybe he wants to ascertain how random your pseudorandom number generator is!
BTW, I recommend that PerlMonks use lava lamps for random numbers. It's just too darn cool and you can put a webcam on the thing just for kicks. Just make sure you unplug it before you leave the house / server-room. Ok, I wasn't serious :)
FYI -- I found an odd implementation using digital cameras, long exposures, and CCD Noise based on the lava-lamp work. Check it out.