http://www.perlmonks.org?node_id=337515

All,
I was looking at Fastest Rising Monks by blakem and was sort of disappointed that I was not among the elite. Then curiosity got the better of me, as I wanted to know how many monks that joined after me had more XP than me. I am not overly concerned with XP, but I do log in more than what mental health professionals would consider healthy. Here is what I came up with:
#!/usr/bin/perl
use strict;
use warnings;
use CGI ':standard';
use DBI;
use File::Basename 'basename';
use Getopt::Std;
use LWP::Simple;
use HTML::TableExtract;
use File::Temp 'tempfile';

my %opt;
Get_Args();
Get_Data() if $opt{u};

my @top   = map { [ '', '', '', '', '', 100 ] } 0 .. $opt{t};
my @items = qw(id name xp total higher p);

Get_Stats();
Print_Stats();

sub Build_DB {
    if ( -e $opt{d} ) {
        unlink $opt{d} or die "Unable to remove $opt{d}";
    }
    $opt{file} = basename( $opt{file} );
    my $dbh = DBI->connect( "dbi:SQLite:dbname=$opt{d}" ) or die $DBI::errstr;
    $dbh->do( "CREATE TABLE pm (node_id, name, xp)" ) or die $dbh->errstr;
    $dbh->do( "COPY pm FROM '$opt{file}'" ) or die $dbh->errstr;
    $dbh->disconnect;
}

sub Get_Args {
    my $Usage = qq{Usage: $0 options
    -h : This help message.
    -d : Database name
    -m : Maximum number of monks to check
    -o : Output file
    -p : Per page monks to check
    -s : Skip monks with XP less than this number
    -t : Top number of monks for report
    -u : Update database
    } . "\n";
    getopts( 'hd:m:o:p:s:t:u' , \%opt ) or die $Usage;
    die $Usage if $opt{h};
    $opt{d} ||= 'pmstats.db';
    $opt{m} ||= 2000;
    $opt{p} ||= 50;
    $opt{s} ||= 0;
    $opt{t} ||= 50;
    $opt{t}--;
    $opt{u} = 1 if exists $opt{u};
}

sub Get_Data {
    my $table = new HTML::TableExtract(
        headers => [ 'Rank', 'Node ID', 'Name', 'Experience' ],
    );
    my $url = 'http://tinymicros.com/pm/index.php?goto=MonkStats&start=';
    my $offset = 0;
    while ( $offset < $opt{m} ) {
        my $html = get( $url . $offset );
        $table->parse( $html );
        $offset += $opt{p};
    }
    ( my $fh, $opt{file} ) = tempfile( UNLINK => 1, DIR => '.' );
    for my $table_state ( $table->table_states ) {
        for my $row ( $table_state->rows ) {
            print $fh join "\t" , @{$row}[1..3];
            print $fh "\n";
        }
    }
    Build_DB();
}

sub Get_Stats {
    my $dbh = DBI->connect( "dbi:SQLite:dbname=$opt{d}" ) or die $DBI::errstr;
    my $sth   = $dbh->prepare("SELECT * FROM pm");
    my $sth_t = $dbh->prepare("SELECT COUNT(*) FROM pm WHERE node_id > ?");
    my $sth_h = $dbh->prepare("SELECT COUNT(*) FROM pm WHERE node_id > ? AND xp > ?");
    $sth->execute() or die $dbh->errstr;
    while ( my @rec = $sth->fetchrow_array ) {
        next if ! $rec[2] || $rec[2] < $opt{s};
        $sth_t->execute( $rec[0] ) or die $dbh->errstr;
        $sth_h->execute( $rec[0], $rec[2] ) or die $dbh->errstr;
        my ($total) = $sth_t->fetchrow_array;
        next if ! $total;
        my ($higher) = $sth_h->fetchrow_array;
        my $percent = ($higher / $total) * 100;
        next if $percent > $top[-1][5];
        for my $id ( 0 .. $opt{t} ) {
            if ( $percent < $top[$id][5] ) {
                my @stats = ($total, $higher, $percent);
                splice @top, $id, 0, [ @rec, @stats];
                pop @top;
                last;
            }
        }
    }
    $sth_t->finish();
    $sth_h->finish();
    $dbh->disconnect;
}

sub Print_Stats {
    if ( $opt{o} ) {
        open( HTML, '>', $opt{o} ) or die "Unable to open $opt{o} for writing : $!";
        select HTML;
    }
    my $url = 'http://www.perlmonks.org/index.pl?node_id=';
    print
        start_html( -title => "Fastest Rising Monks", -bgcolor => "#ffffcc" ),
        div( { -align => "center" },
            p(h1( "Monks XP Compared To Newer Monk's XP" ) ),
            p(h2( "Selected from the top $opt{m} monks" ) ),
            p(h3( "Skipped Monks with XP less than $opt{s}" ) ),
            table(
                {
                    -bgcolor     => "#000000",
                    -border      => "0",
                    -cellpadding => "2",
                    -cellspacing => "1",
                },
                Tr( { -style => "background-color:#CCCCCC" },
                    th( [ qw(Rank Monk XP), '# After', '# > XP', 'Percent' ] ),
                ),
                Tr( { -style => "background-color:#CCCCCC" },
                    [ map {td([
                        $_ + 1,
                        a({ href=>$url . $top[$_][0]}, $top[$_][1]),
                        $top[$_][2],
                        $top[$_][3],
                        $top[$_][4],
                        sprintf( "%.4f", $top[$_][5] ),
                    ]), } 0 .. $opt{t} ]
                ),
            ),
        ),
        end_html;
}
Here is an example of the output.
$ pmstats -s 100 -t 500 -m 25000 -o pmstats.html
Out of the top 25,000 monks, 8,523 joined after me. Of those, only 7 have higher XP. I will leave modifying the code to spit out which monks as an exercise for the reader (as well as any other modifications you want to make).
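
For anyone who wants to try that exercise without the database, here is a minimal sketch in plain Perl. The monk records and the helper name newer_with_more_xp are made up for illustration. As in the script, a higher node ID is assumed to mean a later join date.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: [ node_id, name, xp ]. Not real PerlMonks figures.
my @monks = (
    [ 100, 'alpha',   5000 ],
    [ 200, 'bravo',   3000 ],
    [ 300, 'charlie', 4000 ],
    [ 400, 'delta',    500 ],
    [ 500, 'echo',    3500 ],
);

# Return the names of monks with a higher node ID (assumed to mean
# "joined later") and more XP than the monk with the given name.
sub newer_with_more_xp {
    my ( $name, @all ) = @_;
    my ($me) = grep { $_->[1] eq $name } @all
        or die "No such monk: $name";
    return map  { $_->[1] }
           grep { $_->[0] > $me->[0] && $_->[2] > $me->[2] } @all;
}

print join( ', ', newer_with_more_xp( 'bravo', @monks ) ), "\n";
# prints "charlie, echo"
```

Against the real database the same filter would presumably be a SELECT with the two conditions already used in Get_Stats (node_id > ? AND xp > ?), returning name instead of COUNT(*).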

Cheers - L~R

Update 1: Used more descriptive column labels, fixed platform dependencies (hopefully), and fixed a bug pointed out in the CB.
Update 2: Re-ran the stats using the top 25,000 and took bart's suggestion about using the full floating point percentage for ranking.

Re: Fastest Rising Monks - Revisited
by woolfy (Chaplain) on Mar 18, 2004 at 08:36 UTC
    I was looking at Fastest Rising Monks by blakem and was sort of disappointed that I was not among the elite. Then curiosity got the better of me, as I wanted to know how many monks that joined after me had more XP than me.

    "Tinymicros", Saints-page, sort on "Since newest":
    http://tinymicros.com/pm/index.php?goto=list&sortopt=5&sortlist=1,3&monktype=saint

    No need for a script. Just count them... there are 7 who joined after you did and who have more XP: davido, liz, pg, hardburn, diotalevi, adrianh and sauoq. And Roger is hot on your t(r)ail... :-)

    Re: L~R: of course, just for fun, programming things like this sure can be fun, but you certainly put in quite an effort... and maybe it is a big load for the server (OK, no extra load, wonderful, well done). Well, it kept you off the street!

      woolfy,
      Sure - but I followed the theory that if you are thinking it, chances are someone else is too. Additionally, just counting the number of monks after me with more XP (7) wouldn't give me a good idea by itself, it takes knowing how many monks total came after me (8,523) to give it perspective. Besides, it is just for fun anyway.

      Cheers - L~R

      Update: After reading woolfy's update, I feel I need to comment further. There was 0 load added to PerlMonks by this program as the code actually queries an external server. The effort I put in will likely be hard for some people to understand since I have already abandoned the project. I absolutely despise database programming and hate CGI/HTML/HTTP/etc coding even more. I force myself to do "fun" projects with them to improve my skills so when/if the time comes that I need to - I am not starting from scratch. In other words, this seemed like a fun project to torture myself with.

        Additionally, just counting the number of monks after me with more XP (7)
        Slacker! You're not working hard enough. There are seven (7!) monks starting later than you that already earned more.
        it takes knowing how many monks total came after me (8,523)
        Excuses, excuses, excuses! Keep working, there are 8,516 monks trailing you, eager to overtake!

        Abigail, at least 7223 points away from anyone starting after me.

        PS: Don't take my post seriously.

Re: Fastest Rising Monks - Revisited
by CountZero (Bishop) on Mar 18, 2004 at 06:17 UTC
    Interesting, but perhaps the "Percent" is not saying so much about the Monks' "quality" (no, I will not go into any discussion about XP and quality here). It would say a lot more if one also took into account the level (Saint,...) as now a "new" initiate will have a low percentage (no one who came in after him will have a higher XP yet) and will thus rank very high (within the first twenty).

    Perhaps sorting on "level" first and "percent" next?

    Still, a useful exercise: ++!

    Update: I see you now exclude Monks with less than 100 XP. That indeed already answers my comment!

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      CountZero,
      Well, you raise a point I should be more clear about. I am not excluding monks with less than 100 XP in all calculations. I am just disqualifying them from appearing in the list.

      In other words, they still count for calculating monks that came after you, but they themselves are not eligible to be in the list. This however is one of the many configurable options. Once you have built the database with the -u option, you can re-run your stats using different variations.

      Cheers - L~R
Re: Fastest Rising Monks - Revisited
by Abigail-II (Bishop) on Mar 17, 2004 at 22:10 UTC
    Well, I see a table, but it's not quite clear to me what the columns mean. The first column indicates a rank, but what is it sorted on? The 'monk' and 'XP' are clear, but what does 'Total' stand for? Is 'higher' a column that indicates the number of monks that have joined later than the monk in that row, but with more XP? And what does the Percent column mean?

    Abigail

      Abigail,
      Sorry - the columns made sense to me when I was coding it.
      • The total column is the number of monks that joined after you.
      • The higher column is the number of those monks with more XP.
      • The percent column is the (higher / total) * 100
      • Sorting is by percentage first, then by XP.
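
      To make the columns concrete, here is a small standalone sketch of the same arithmetic on made-up data. The sample records are hypothetical, and breaking ties in favour of the monk with more XP is my reading of "then by XP", not something taken from the script's SQL.

```perl
use strict;
use warnings;

# Hypothetical sample data: [ node_id, name, xp ].
my @monks = (
    [ 100, 'alpha',   5000 ],
    [ 200, 'bravo',   3000 ],
    [ 300, 'charlie', 4000 ],
    [ 400, 'delta',    500 ],
    [ 500, 'echo',    3500 ],
);

my @rows;
for my $m (@monks) {
    my ( $id, $name, $xp ) = @$m;
    my @after = grep { $_->[0] > $id } @monks;    # joined later (higher node ID)
    my $total = @after;                           # the "total" column
    next if !$total;                              # newest monk: nothing to compare
    my $higher = grep { $_->[2] > $xp } @after;   # the "higher" column
    push @rows, [ $name, $xp, $total, $higher, $higher / $total * 100 ];
}

# Lowest percentage first; ties go to the monk with more XP (assumption).
@rows = sort { $a->[4] <=> $b->[4] || $b->[1] <=> $a->[1] } @rows;

# name, xp, total, higher, percent
printf "%-8s %5d %3d %3d %8.4f\n", @$_ for @rows;
```

      With this data bravo has 3 newer monks, 2 of them with more XP, so bravo's percent is 66.6667 and bravo sorts below alpha and charlie, who both sit at 0.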

      Cheers - L~R

Re: Fastest Rising Monks - Revisited
by bart (Canon) on Mar 18, 2004 at 08:53 UTC
    You must have sorted on the rounded off percentage, because I'm surprised to see Juerd ranks higher than diotalevi, while the latter appears to have a better score, to me:
    (2 out of 4713 ~~ 4.246E-3) > (1 out of 2839 ~~ 3.522E-3)
      bart,
      You are correct. As I said, anyone is welcome to modify it, since 2 decimal places is probably not enough to separate close percentages. I will modify it so that the actual floating point value is used to determine order but the display is limited to 4 places.
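
      A quick sketch of why the rounding matters, using the two pairs of figures from bart's reply. The data layout and variable names are just for illustration, and the ratios are computed as percentages the way the script does it.

```perl
use strict;
use warnings;

# Figures quoted in bart's reply: [ label, higher, total ].
my @scores = (
    [ '2 of 4713', 2, 4713 ],
    [ '1 of 2839', 1, 2839 ],
);

# Append the full-precision percentage and its 2-place rounding to each record.
for my $s (@scores) {
    my $percent = $s->[1] / $s->[2] * 100;
    push @$s, $percent, sprintf( '%.2f', $percent );
}

# Rounded to 2 places, both values become "0.04" and compare as equal.
my $tie = $scores[0][4] == $scores[1][4];

# Ordering on the full value separates them; %.4f is only used for display.
my @ranked = sort { $a->[3] <=> $b->[3] } @scores;

print "tie at 2 places: ", ( $tie ? 'yes' : 'no' ), "\n";
printf "%-10s %.4f\n", @{$_}[ 0, 3 ] for @ranked;
```

      Once the two scores tie, whatever secondary ordering is in effect decides the rank, which is how the monk with the worse ratio can end up on top.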

      Cheers - L~R

Re: Fastest Rising Monks - Revisited
by halley (Prior) on Mar 18, 2004 at 14:40 UTC
    I was on blakem's "fastest rising" list for most of my ascent, but dropped off it quickly just about exactly at the moment I hit level 10. Real work intrudes.

    One statistic I would far rather see is a ranking of Monks by the total reps of all of the Monk's nodes/writeups, completely disregarding the rep of the Monk. (A side query could split the ranks by Total Rep, Total +Rep disregarding --votes, Total -Rep disregarding ++votes.) Does anyone already have one of these specialized queries available?

    --
    [ e d @ h a l l e y . c c ]

      halley,
      Does anyone already have one of these specialized queries available?

      Not that I know of, but it would be trivial to do so. I "stole" blakem's table parsing code and stuck it into a SQLite database. You would only need to add the write ups column to my code and build your SQL queries accordingly. After 24 hours, this project isn't fun anymore so I will leave implementation up to you.

      Cheers - L~R
Re: Fastest Rising Monks - Revisited
by flyingmoose (Priest) on Mar 18, 2004 at 14:59 UTC
    Can you post some of the output here to show what this does (say just the top 100)? Not all of us have an environment in front of us to watch this voodoo kick out the tables...

    Anyhow, I think the most telling stat would be the XP/post-count ratio. Mine is probably really low, since I prefer to just make Monty Python jokes and references to llamas, and other such stuff. Did I ever tell you a moose once bit my sister...

    halley's request to see the +/- numbers separated out is interesting too, but maybe we really don't want to know...

        mooses can't read. everybody knows that. Thanks. (and #13! \/\/00+!)
Re: Fastest Rising Monks - Revisited
by pboin (Deacon) on Mar 19, 2004 at 21:20 UTC

    Umm... I feel like I'm a little stupid on this one.. I'm getting the following:

    Prototype mismatch: sub main::head vs ($) at pmstats line 8

    So, I did a little research, and found a cpan doc on just this problem (a namespace issue w/ CGI).

    So, am I the only one that's getting this, or did everyone else know how to fix it right away and not say anything? I'm kinda off-balance, 'cause I know L~R's stuff is good and works.

    Wondering...

      pboin,
      Thanks for the vote of confidence in my abilities, though I would run far far away from any code I wrote ;-)

      The warning can safely be ignored as the code still functions. I eventually found the cause the long way, but didn't have time to find/test a fix. Is the code not working for you or were you just concerned about the warning? If you want to play around, the document you referenced says to do:

      #use LWP::Simple;
      use LWP::Simple '!head';
      Cheers - L~R
Re: Fastest Rising Monks - Revisited
by Grygonos (Chaplain) on Mar 18, 2004 at 18:48 UTC
    One thing that this cannot take into account is the following.... My first SoPW and register date was 2002-09-23 15:08:07, and my next post was on 2003-06-05 09:50:18. I have been active since then, so I have almost a year in which I did nothing on PM... just proving once more that XP is an illusion, especially if you're using date joined.

    Grygonos
      Grygonos,
      One thing that this cannot take into account is the following....

      There are a lot of things this can't take into account. That is why it states what it is basing the rank on: the number of monks that joined after you that have more XP. The process to determine monks that joined after you is potentially flawed. It does not use a date field but rather the node ID of the user - higher node ID means came after. If the current way of assigning node IDs changes - this code goes out the window.

      It was only for fun after all - L~R