http://www.perlmonks.org?node_id=124669

Automatic Daily Cartoon Delivery
by OeufMayo

This small script generates an HTML page with all of my favorite online comics. Since it can take some time to surf from one site to another each day, I whipped up this script to do it for me each morning, so that I can enjoy my breakfast without having to frantically visit each individual page.

Update: added a list of the comics for quick access and changed the %cartoons hash a bit.
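For anyone who wants to add their own strips: each entry in %cartoons maps a comic's name to the page to fetch, a regex that picks its image out of that page, and the weekdays it appears (the numbers are (localtime)[6] values, so 0 is Sunday). A minimal sketch of one entry, with a made-up name and URL:

#!/usr/bin/perl -w
use strict;

# Hypothetical example showing the shape of one %cartoons entry
my %cartoons = (
    'My Favorite Strip' => {
        url  => 'http://www.example.com/comic',   # page that embeds the image
        src  => qr/strip\d{8}\.png$/,             # regex matched against <img> src
        days => [ 1, 2, 3, 4, 5 ],                # (localtime)[6] weekdays, 0 = Sunday
    },
);

print "$_ runs on days @{ $cartoons{$_}{'days'} }\n" for sort keys %cartoons;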

#!/usr/bin/perl -w
use strict;
use LWP::Simple qw(get);
use HTML::Parser;
use URI;
use CGI qw(:html);

my $file = 'daily_comic.html';
my $wday = (localtime)[6];    # Day of the week, 0 = Sunday

my %cartoons = (
    'The New Bobbins Show' => {
        url  => 'http://www.bobbins.org',  # URL of the page with the picture
        src  => qr/\d{8}\.png$/,           # Regex to find the image on the page
        days => [ 0, 1, 2, 3, 4, 5 ],      # Weekdays when the comic is published
    },
    'Diesel Sweeties' => {
        url  => 'http://www.dieselsweeties.com',
        src  => qr/sw\d+\.png$/,
        days => [ 0, 1, 2, 3, 4, 5 ],
    },
    'RPG World' => {
        url  => 'http://www.rpgworldcomic.com',
        src  => qr/\d{8}\w?\.jpg$/,
        days => [ 0, 3, 5 ],
    },
    'Gene Catlow' => {
        url  => 'http://www.genecatlow.com',
        src  => qr/\d{8}\.gif$/,
        days => [ 0, 1, 3, 5 ],
    },
    'User Friendly' => {
        url  => 'http://www.userfriendly.org/static',
        src  => qr/uf\d+\.gif$/,
        days => [ 0, 1, 2, 3, 4, 5, 6 ],
    },
    'Goats' => {
        url  => 'http://www.goats.com',
        src  => qr/goats\d+\.gif$/,
        days => [ 0, 1, 3, 5 ],
    },
    'Penny Arcade' => {
        url  => 'http://www.penny-arcade.com/view.php3',
        src  => qr/\d{8}\w?\.gif$/,
        days => [ 0, 1, 3, 5 ],
    },
    'Angst Technologies' => {
        url  => 'http://www.inktank.com/AT/index.cfm',
        src  => qr/\d\d-\d\d-\d\d\.gif$/,
        days => [ 0, 1, 2, 3, 4, 5 ],
    },
    'Indy Rock Pete' => {
        url  => 'http://www.indierockpete.com/',
        src  => qr/p\d+f\.gif$/,
        days => [ 0, 1, 2, 3, 4, 5 ],
    },
    'Sinfest' => {
        url  => 'http://www.sinfest.net/',
        src  => qr/sf\d{8}\.gif$/,
        days => [ 0, 1, 2, 3, 4, 5, 6 ],
    },
    'Dilbert' => {
        url  => 'http://www.dilbert.com/',
        src  => qr/dilbert\d+\.gif$/,
        days => [ 0, 1, 2, 3, 4, 5, 6 ],
    },
);

# Write the HTML page header and a linked table of contents
open( COMIC, "> $file" ) or die "Cannot create the file '$file': $!\n";
select COMIC;

print start_html(
        -head => meta(
            {   -http_equiv => 'Content-Type',
                -content    => 'text/html; charset=utf-8'
            }
        ),
        -title => "Briac's Daily Cartoon Delivery",
    ),
    h1("Briac's Daily Cartoon Delivery"),
    ul( li( [ map { a( { -href => "#$_" }, $_ ) } sort keys %cartoons ] ) );

# Grab the different pictures
foreach my $site ( sort keys %cartoons ) {

    # Get the comic only if it is published today
    next unless grep { $_ == $wday } @{ $cartoons{$site}->{'days'} };

    # Fetch the page with LWP::Simple; skip the site on failure
    my $page = get( $cartoons{$site}->{'url'} );
    unless ( defined $page ) {
        warn "Could not get '$site'\n";
        next;
    }

    # HTML::Parser start handler grabbing the image whose src matches
    # the pattern defined in the %cartoons hash
    my $parser = HTML::Parser->new(
        start_h => [
            sub {
                my $attr = shift;
                return unless defined $attr->{'src'};
                return if $attr->{'src'} !~ $cartoons{$site}->{'src'};
                $cartoons{$site}->{'img'} = $attr->{'src'};
            },
            "attr"
        ],
        report_tags => qw(img),
    );
    $parser->parse($page);
    $parser->eof();

    # Skip the site if no image matched the pattern
    unless ( defined $cartoons{$site}->{'img'} ) {
        warn "Could not find the image for '$site'\n";
        next;
    }

    # Print the comic picture in the HTML page.
    # Check whether the URI of the picture is relative or absolute.
    my $uri = URI->new( $cartoons{$site}->{'img'} );
    my $src = $uri->scheme()
        ? $cartoons{$site}->{'img'}
        : $uri->abs( $cartoons{$site}->{'url'} );

    print a( { -name => $site } ),
        h2( a( { -href => $cartoons{$site}->{'url'} }, $site ) ),
        img( { -src => $src, -alt => "$site comic" } );
}

print end_html();
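One part worth calling out is the URI juggling at the end of the loop: some sites put relative paths in their <img> tags and others use full URLs, so the script only resolves against the page's URL when the src has no scheme. A standalone sketch of that logic (both URLs here are made up):

#!/usr/bin/perl -w
use strict;
use URI;

my $base = 'http://www.example.com/daily/';    # hypothetical page URL

for my $src ( 'strips/20011112.png', 'http://images.example.com/20011112.png' ) {
    my $uri = URI->new($src);
    # A src with a scheme is already absolute; otherwise resolve it
    # against the page it came from
    my $abs = $uri->scheme() ? $uri : $uri->abs($base);
    print "$src => $abs\n";
}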
--
my $OeufMayo = new PerlMonger::Paris({http => 'paris.mongueurs.net'});

Re: Automatic Daily Cartoon Delivery
by jynx (Priest) on Nov 12, 2001 at 02:41 UTC

    This is familiar,

    If you check Super Search you'll come up with a node that did this same thing. Since it's worth repeating, I'll make the same comment as before: don't infringe on copyright. I made a script a while ago that did the same thing, and quickly got messages from the authors of the comics I was downloading telling me to either stop or face a lawsuit.

    You have good code, but it is in your best interest not to use it.

    jynx

      IANAL, but it seems to me that as long as it's for your personal use only, and you don't share the web page thus created with a public audience, it's harmless, or could at the very least be argued to be fair use. You certainly wouldn't want to make the generated pages public, though...
      "Non sequitur. Your facts are un-coordinated." - Nomad
        Having dealt with a number of cartoonists, I'd have to agree with Clownburner...most cartoonists don't particularly care if you create something for personal use, but it is incredibly difficult these days to offer cool content without being killed by your own popularity.

        Basically, if you put up something cool on the web and lots of people come to see it, you've got to spend money on bandwidth... to cover some of that cost, and maybe expand what you're doing, you allow some advertising on the pages people are visiting. If a whole bunch of people start using a grabber that strips the ads off the cool stuff you find on the web, the cycle soon reverses... the advertising no longer covers the bandwidth, the cartoonist can't cover his/her costs, and the strip shuts down. You, the fan, lose out, because you weren't seeing the ads on the site.

        It's one of the basic flaws of the internet right now...and one of the reasons that micropayments and subscriptions are going to come about sooner or later - nothing in life remains free.

        Kickstart

Re: Automatic Daily Cartoon Delivery
by argus (Acolyte) on Nov 16, 2001 at 00:21 UTC
    Great code. There is a similar script, dailystrips by Andrew Medico, that grabs over 200 cartoons of your choosing and is easily modifiable. I have a cron job that runs it daily (it will even archive cartoons) on a web server that only I can get to. He does include a disclaimer that you should not make the cartoons publicly accessible.
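    For anyone setting up the same thing, the cron side is a single crontab line. For the script above it would be something like this (the path is hypothetical; the cd matters because the script writes daily_comic.html into the current directory):

        30 6 * * * cd /home/argus/comics && perl daily_comic.pl

    The same idea works for dailystrips: point the cron job at it instead.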