Homenode Surfing

by Limbic~Region (Chancellor)
on Oct 15, 2004 at 15:59 UTC (#399558, monkdiscuss)

I have found some of the most interesting links, both internal and external, from visiting homenodes. The problem is that by visiting random homenodes, you are extremely likely to end up on one with little to no content.

I asked around in the CB and Super Searched, but didn't find anything that did exactly what I wanted. The closest was Random NonHome Nodes by blakem, which, contrary to its title, lets you surf to random homenodes with a minimum XP. The trouble is that it is b0rk ATM (at least for me). Even if it were working, having XP is no guarantee that a monk has put something on their homenode. In the CB, atcroft mentioned parsing the PM Stats.

Here were my criteria for including a homenode:

  • Homenode length > 500 and XP > 200
      • XP is a measure of participation. That participation comes in many forms (logging in every day, voting, posting, etc.), which tells me I am more likely to find the content I am looking for.
  • Homenode length > 500 and account created > 1 yr and last here < 45 days
      • While participation (XP) is a good indicator of quality content, not all monks are as obsessed with The Monastery as I am. Some monks have been around for a while and have chosen to take the 17th seriously, but only visit occasionally.
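The two rules above reduce to a single predicate: a long enough homenode, plus either enough XP or an old but recently active account. What follows is only a minimal sketch, assuming the four values have already been pulled out of a stats row as plain numbers. The function name `keep_homenode` and the hash keys are hypothetical, chosen for illustration.

```perl
use strict;
use warnings;

# Thresholds taken from the criteria listed above.
my %limit = ( length => 500, exp => 200, create => 365, last => 45 );

# Hypothetical helper: expects a hash ref with already-extracted numbers
#   length - homenode length
#   exp    - XP
#   create - account age in days
#   last   - days since the monk was last here
sub keep_homenode {
    my ($monk) = @_;

    # Both rules require a homenode longer than the minimum.
    return 0 if $monk->{length} <= $limit{length};

    # Rule 1: enough XP.
    return 1 if $monk->{exp} > $limit{exp};

    # Rule 2: account older than a year and seen recently.
    return ( $monk->{create} > $limit{create} && $monk->{last} < $limit{last} ) ? 1 : 0;
}

# Usage with made-up numbers:
for my $monk (
    { length => 600, exp => 300, create => 10,  last => 100 },
    { length => 600, exp => 50,  create => 400, last => 10  },
    { length => 400, exp => 900, create => 900, last => 1   },
) {
    print keep_homenode($monk) ? "keep\n" : "skip\n";
}
```

This sketch uses the strict comparisons exactly as the criteria are written. The posted script treats the boundaries inclusively (`>=` and `<=`), so a value sitting exactly on a threshold is classified differently there.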

Here is the code:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableContentParser;
use Time::Local;
use WWW::Mechanize;

use constant ID     => 1;
use constant CREATE => 3;
use constant STATS  => 4;
use constant LAST   => 4;
use constant EXP    => 5;
use constant LENGTH => 11;

# length && ( rep || create && last )
my %opt = (
    length => 500,
    exp    => 200,
    create => 365,
    last   => 45,
    url    => ' +pt=15&sortlist=15,1,3&',
    pos    => 0,
);

my $finished;
my $mech = WWW::Mechanize->new( autocheck => 1 );
my @homenodes;

while ( ! $finished ) {
    $mech->get( $opt{url} . '&start=' . $opt{pos} );
    my $table = HTML::TableContentParser->new()->parse( $mech->content() );
    for my $row ( @{ $table->[ STATS ]{rows} } ) {
        my $length = Get_Length( $row );
        next if ! defined $length;
        if ( $length < $opt{length} ) {
            $finished = 1;
            last;
        }
        my $id = Get_ID( $row );
        push @homenodes , $id if defined $id;
    }
    $opt{pos} += 50;
}

sub Get_Length {
    my $row = shift;
    my $data = ${ $row->{cells} }[ LENGTH ]{data};
    ($data) = $data =~ /(\d+)/ if defined $data;
    return $data;
}

sub Get_ID {
    my $row = shift;
    my ($id)  = ${ $row->{cells} }[ ID ]{data}  =~ /(\d+)/;
    my ($exp) = ${ $row->{cells} }[ EXP ]{data} =~ /(\d+)/;
    return $id if $exp >= $opt{exp};
    my $create = Get_Days( ${ $row->{cells} }[ CREATE ]{data} );
    my $last   = Get_Days( ${ $row->{cells} }[ LAST ]{data} );
    return $create >= $opt{create} && $last <= $opt{last} ? $id : undef;
}

sub Get_Days {
    my $then = shift;
    ($then) = $then =~ m|<NOBR>(.*)</NOBR>|;
    my ($yr, $mon, $day, $hr, $min, $sec) = split /[ :-]/ , $then;
    my $stamp = timelocal ($sec, $min, $hr, $day, --$mon, $yr);
    return int ( (time - $stamp) / 86_400 );
}

print "<ul>\n";
print "<li>[id://$_]</li>\n" for @homenodes;
print "</ul>\n";
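The age arithmetic in Get_Days can be checked offline, without fetching the stats page. This is only a minimal sketch, assuming timestamps look like `2003-10-15 15:59:00` once the `<NOBR>` wrapper is stripped (inferred from the split pattern in the script, not confirmed against the stats page). `days_between` is a hypothetical helper; it swaps `timelocal` for `timegm` and takes an explicit "now" so the result does not depend on the local time zone or the clock.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: whole days between a 'YYYY-MM-DD hh:mm:ss'
# timestamp (treated as UTC) and an epoch value $now.
sub days_between {
    my ($then, $now) = @_;
    my ($yr, $mon, $day, $hr, $min, $sec) = split /[ :-]/, $then;

    # timegm expects a zero-based month, hence $mon - 1.
    my $stamp = timegm($sec, $min, $hr, $day, $mon - 1, $yr);
    return int( ($now - $stamp) / 86_400 );
}

# Fixed reference point: 2004-10-15 15:59:00 UTC (month 9 is October).
my $reference = timegm(0, 59, 15, 15, 9, 2004);

# Exactly one year earlier; the span includes 29 Feb 2004.
print days_between('2003-10-15 15:59:00', $reference), "\n";   # prints 366
```

An account created a year before the reference point comes out at 366 days, which clears the `create => 365` threshold either way.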

As of this posting, there were 871 homenodes that fit these criteria. The list may be on my scratch pad depending on how long it takes me to get through them all.

Cheers - L~R

Replies are listed 'Best First'.
Re: Homenode Surfing
by tilly (Archbishop) on Oct 15, 2004 at 16:39 UTC
    Depending on how carefully you go through it, entry #7 is likely to take a while. And it is likely to become longer still...

Node Type: monkdiscuss [id://399558]
Approved by Arunbear
Front-paged by kutsu