Homenode Surfing

by Limbic~Region (Chancellor)
on Oct 15, 2004 at 15:59 UTC ( [id://399558]=monkdiscuss )

All,
I have found some of the most interesting links, both internal and external, from visiting homenodes. The problem is that by visiting random homenodes, you are extremely likely to end up on one with little to no content.

I asked around in the CB as well as Super Searched, but didn't find anything that did exactly what I wanted. The closest was Random NonHome Nodes by blakem, which, contrary to its title, would let you surf to random homenodes with a minimum XP. The trouble is it is b0rk ATM (at least for me). Even if it were working, having XP is no guarantee that a monk has put something on their homenode. In the CB, atcroft mentioned parsing the PM Stats.

Here were my criteria for including a homenode:

  • Homenode length > 500 and XP > 200
      • XP is a measure of participation. That participation comes in many forms (logging in every day, voting, posting, etc.), which tells me I am more likely to find the content I am looking for.
  • Homenode length > 500 and account created > 1 yr and last here < 45 days
      • While participation (XP) is a good indicator of quality content, not all monks are as obsessed with The Monastery as I am. Some monks have been around for a while, chosen to take the 17th seriously, but only visit occasionally.
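The two rules above boil down to a single predicate: enough homenode content, plus either high XP or a long-standing and recently active account. Here is a minimal sketch of that test in isolation. The record layout (hash keys such as `age_days` and `idle_days`) and the sample monks are invented for illustration and are not what PM Stats actually returns; the comparisons follow the strict `>` and `<` wording of the list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Thresholds taken from the criteria listed above.
my %limit = ( length => 500, xp => 200, age_days => 365, idle_days => 45 );

# Hypothetical record layout: these hash keys are illustrative only.
sub wants_homenode {
    my ($monk) = @_;
    return 0 if $monk->{length} <= $limit{length};    # too little content
    return 1 if $monk->{xp} > $limit{xp};             # active participant
    # Otherwise require an old account that has been seen recently.
    return ( $monk->{age_days} > $limit{age_days}
          && $monk->{idle_days} < $limit{idle_days} ) ? 1 : 0;
}

# Made-up sample data, one monk per interesting case.
my @monks = (
    { name => 'active',  length => 900, xp => 1500, age_days => 100, idle_days => 2 },
    { name => 'veteran', length => 700, xp => 50,   age_days => 900, idle_days => 10 },
    { name => 'empty',   length => 20,  xp => 5000, age_days => 900, idle_days => 1 },
    { name => 'lapsed',  length => 800, xp => 50,   age_days => 900, idle_days => 300 },
);

print join( ', ', map { $_->{name} } grep { wants_homenode($_) } @monks ), "\n";
# prints: active, veteran
```

Checking the length first mirrors the fact that both rules require it, so a monk with plenty of XP but an empty homenode is rejected early.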

Here is the code:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableContentParser;
use Time::Local;
use WWW::Mechanize;

# STATS is the index of the stats table on the page;
# the other constants are column indexes within a row.
use constant ID     => 1;
use constant CREATE => 3;
use constant STATS  => 4;
use constant LAST   => 4;
use constant EXP    => 5;
use constant LENGTH => 11;

# length && ( rep || create && last )
my %opt = (
    length => 500,
    exp    => 200,
    create => 365,
    last   => 45,
    url    => 'http://tinymicros.com/pm/index.php?goto=monkstats&sortopt=15&sortlist=15,1,3&',
    pos    => 0,
);

my $finished;
my $mech = WWW::Mechanize->new( autocheck => 1 );
my @homenodes;

# Page through the stats 50 rows at a time until a homenode
# shorter than the minimum length shows up.
while ( ! $finished ) {
    $mech->get( $opt{url} . '&start=' . $opt{pos} );
    my $table = HTML::TableContentParser->new()->parse( $mech->content() );
    for my $row ( @{ $table->[ STATS ]{rows} } ) {
        my $length = Get_Length( $row );
        next if ! defined $length;
        if ( $length < $opt{length} ) {
            $finished = 1;
            last;
        }
        my $id = Get_ID( $row );
        push @homenodes , $id if defined $id;
    }
    $opt{pos} += 50;
}

sub Get_Length {
    my $row = shift;
    my $data = ${ $row->{cells} }[ LENGTH ]{data};
    ($data) = $data =~ /(\d+)/ if defined $data;
    return $data;
}

# Returns the monk's id if the row passes the XP test or the
# created/last-here test, undef otherwise.
sub Get_ID {
    my $row = shift;
    my ($id)  = ${ $row->{cells} }[ ID ]{data}  =~ /(\d+)/;
    my ($exp) = ${ $row->{cells} }[ EXP ]{data} =~ /(\d+)/;
    return $id if $exp >= $opt{exp};
    my $create = Get_Days( ${ $row->{cells} }[ CREATE ]{data} );
    my $last   = Get_Days( ${ $row->{cells} }[ LAST ]{data} );
    return $create >= $opt{create} && $last <= $opt{last} ? $id : undef;
}

# Whole days elapsed since the timestamp inside the <NOBR> cell.
sub Get_Days {
    my $then = shift;
    ($then) = $then =~ m|<NOBR>(.*)</NOBR>|;
    my ($yr, $mon, $day, $hr, $min, $sec) = split /[ :-]/ , $then;
    my $stamp = timelocal ($sec, $min, $hr, $day, --$mon, $yr);
    return int ( (time - $stamp) / 86_400 );
}

print "<ul>\n";
print "<li>[id://$_]</li>\n" for @homenodes;
print "</ul>\n";
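The date handling in Get_Days is the least obvious part, so here is a small standalone sketch of the same calculation. It assumes the stats page reports timestamps as 'YYYY-MM-DD HH:MM:SS', which is inferred from the split pattern rather than confirmed. Two deliberate differences from the posted code: it uses `timegm` instead of `timelocal`, and it takes "now" as an argument, so the result does not depend on the local timezone or on when you run it. The `days_since` name is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Whole days between a 'YYYY-MM-DD HH:MM:SS' timestamp (assumed format)
# and $now, where $now is given in epoch seconds.
sub days_since {
    my ( $then, $now ) = @_;
    my ( $yr, $mon, $day, $hr, $min, $sec ) = split /[ :-]/, $then;
    my $stamp = timegm( $sec, $min, $hr, $day, $mon - 1, $yr );    # months are 0-based
    return int( ( $now - $stamp ) / 86_400 );
}

# Fixed reference point: 2004-10-15 15:59:00 UTC, the time of this post.
my $now = timegm( 0, 59, 15, 15, 9, 2004 );

print days_since( '2003-10-15 15:59:00', $now ), "\n";    # prints 366 (2004 was a leap year)
print days_since( '2004-09-30 12:00:00', $now ), "\n";    # prints 15
```

With the thresholds from the post, the first account would count as older than a year and the second as seen within the last 45 days.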

As of this posting, there were 871 homenodes that fit these criteria. The list may be on my scratch pad depending on how long it takes me to get through them all.

Cheers - L~R

Re: Homenode Surfing
by tilly (Archbishop) on Oct 15, 2004 at 16:39 UTC
    Depending on how carefully you go through it, entry #7 is likely to take a while. And it is likely to become longer still...

Node Type: monkdiscuss [id://399558]
Approved by Arunbear
Front-paged by kutsu