Category: |
PerlMonks Related Scripts |
Author/Contact Info |
crashtest |
Description: |
Ever wondered if you've seen every single quip that displays on the top of the pages on this site? I was curious, so I put together a little screen-scraper to check. Hey, it was a slow Sunday afternoon!
No command-line arguments for this script, but there are three variables at the top of the code (clearly marked) that provide some crude configuration. $iterations determines how often perlmonks.org is queried, $nice sets how long to pause between HTTP requests, while $status_print_interval configures how often the script pipes up with a quick status report to show it's still alive.
After 100 iterations, I think I've "collected 'em all", but I look forward to running this script every couple of months to see what other witticisms those [id://pmdev]s come up with:
********** RESULTS **********
2 time(s): Perl Sensitive Sunglasses
7 time(s): more useful options
9 time(s): Pathologically Eclectic Rubbish Lister
3 time(s): XP is just a number
8 time(s): Welcome to the Monastery
5 time(s): Think about Loose Coupling
3 time(s): P is for Practical
4 time(s): Syntactic Confectionary Delight
4 time(s): Perl Monk, Perl Meditation.
7 time(s): Your skill will accomplish what the force of many cannot
8 time(s): "be consistent."
5 time(s): go ahead... be a heretic
8 time(s): Keep It Simple, Stupid
6 time(s): laziness, impatience, and hubris
4 time(s): Perl: the Markov chain saw
10 time(s): There's more than one way to do things.
3 time(s): Just another Perl shrine
4 time(s): good chemistry is complicated,and a little bit messy-LW
|
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
#####################################################################
### Set how many times to run, how long (in seconds) to wait between
### page hits, and how often to print an interim status report.
my ($iterations, $nice, $status_print_interval) = (100, 5, 5);
#####################################################################
my %quips;
$SIG{INT} = sub{ print_results(\%quips); exit(0) };
my $loopcount = 0;
print "Beginning PerlMonks Quip Gatherer...\n";
while($iterations > 0){
my $content = get('http://www.perlmonks.org');
die "Failed to load content!\n" unless defined($content);
extract_quip(\%quips, \$content);
$iterations--;
$loopcount++;
print "Found ", scalar keys %quips,
" quip(s) so far, $iterations iteration(s) left...\n"
if ($loopcount % $status_print_interval == 0);
sleep($nice) if ($iterations);
}
print_results(\%quips);
#####################################################################
### SUBS ###
#####################################################################
sub extract_quip{
my ($quips, $content) = @_;
if ($$content =~ m!<td class="monkquip"[^>]+>(.*?)</td>!s){
my $data = $1;
$data =~ s!\<[^>]*>!!sg;
$data =~ s!\s{2,}!!sg;
$quips->{$data}++;
}
}
sub print_results{
my $quips = shift;
print "\n********** RESULTS **********\n";
while (my ($key, $val) = each(%$quips)){
print "$val time(s): $key\n";
}
}
|
Re: PerlMonks Quips Gatherer
by davido (Cardinal) on May 09, 2005 at 06:48 UTC
|
I love it. Why ask a god or pmdev when you can ask Perl to do your research for you, and learn a little in the process? :) Good work!
| [reply] |
Re: PerlMonks Quips Gatherer
by cog (Parson) on May 09, 2005 at 09:17 UTC
|
| [reply] |
|
Argh! False impatience strikes again! I did do a super-search for "quips" and "quotes" but for some reason (unknown now) excluded the [id://Meditations] section. But at least I still have a general-purpose solution for the future...
| [reply] |
|
Yes, you do, and perhaps you can change it to gather the quotes on the CB when no-one's on it O:-)
| [reply] |
|
|
|
Re: PerlMonks Quips Gatherer
by ysth (Canon) on May 08, 2005 at 23:53 UTC
|
Congratulations, you found them all (though you might have drawn someone's attention to adding to them...) | [reply] |
|
|