PerlMonks  

Winning an Intranet competition with the aid of WWW::Mechanize

by rinceWind (Monsignor)
on May 13, 2005 at 15:42 UTC (#456782, CUFP)

Our corporate intranet boys at my client site have been busy making over the intranet site, and relaunched it on Wednesday. They also announced a competition with 5 star images at the top of certain pages. They offered a prize for a randomly chosen correct entry.

Of course, the temptation was too much for me to resist :). In the process I learned all about how to do NTLM authentication from LWP::UserAgent. Here goes (password suppressed to protect the innocent):

    #!perl
    use strict;
    use warnings;

    use WWW::Mechanize;
    use HTML::TokeParser;

    # Queue of URLs still to visit, seeded with the intranet home page
    my @todo = 'http://intranet';
    my %seen = (@todo, 1);

    # NTLM authentication needs a persistent connection, hence keep_alive
    my $mech = WWW::Mechanize->new( keep_alive => 1 );
    $mech->credentials('intranet:80', '', 'CORP\\williami', 'censored');

    while (my $url = shift @todo) {
        print "Visiting: $url\n";
        $mech->get($url);

        # Queue every link that has not been seen before
        for ($mech->links) {
            my $link = $_->url_abs;
            push @todo, $link unless exists $seen{$link};
            $seen{$link}++;
        }

        # Report every image on the page
        my $text = $mech->content;
        my $tp = HTML::TokeParser->new(\$text);
        while (my $tag = $tp->get_tag('img')) {
            print "Found image: ", $tag->[1]{src}, "\n";
        }
    }

I came back from lunch to find that I had won. I'm curious about how many other correct entries there were though.

--
I'm Not Just Another Perl Hacker

Replies are listed 'Best First'.
Re: Winning an Intranet competition with the aid of WWW::Mechanize
by merlyn (Sage) on May 13, 2005 at 16:10 UTC
    Why did you use:
        my $text = $mech->content;
        my $tp = HTML::TokeParser->new(\$text);
        while (my $tag = $tp->get_tag('img')) {
            print "Found image: ", $tag->[1]{src}, "\n";
        }
    and not simply:
    for ($mech->images) { print "Found image: ", $_->url_abs, "\n" }
    See, I would have won. Less typing. :)

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Re: Winning an Intranet competition with the aid of WWW::Mechanize
by mrborisguy (Hermit) on May 13, 2005 at 15:47 UTC
    CHEATER!
    just kidding... good work! at my current job, i probably would have done the same thing. ++
Re: Winning an Intranet competition with the aid of WWW::Mechanize
by muba (Priest) on May 17, 2005 at 22:57 UTC
      Because further down in the OP, inside the while loop, @todo is reused to recurse into the links on the page...
          while (my $url = shift @todo) {
              print "Visiting: $url\n";
              $mech->get( $url);
              for ($mech->links) {
                  my $link = $_->url_abs;
                  push @todo, $link unless exists $seen{$link};
                  $seen{$link}++;
              }
              ...
          }
      Lou
        I think muba was confused by the scalar appearing where he would expect to see an array. So was I, until I went off and tested it. Works like a charm, but I doubt I'll ever use it. (Because it causes reader confusion...)

        You learn something new every day here in the monastery. :-)
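
For anyone else puzzled by the scalar-into-array assignment discussed above, here is a minimal standalone sketch of what those two lines from the OP do. It needs no network access; the `@explicit` variable is only an illustrative name and is not part of the original script.

```perl
use strict;
use warnings;

# Assigning a single scalar to an array gives a one-element array.
my @todo = 'http://intranet';

# (@todo, 1) flattens to ('http://intranet', 1): one key/value pair.
my %seen = (@todo, 1);

# The more conventional spelling, with explicit parentheses
# (hypothetical variable name, for comparison only).
my @explicit = ('http://intranet');

print scalar(@todo), "\n";                                       # 1
print join(',', map { "$_=$seen{$_}" } sort keys %seen), "\n";   # http://intranet=1
print "same\n" if "@todo" eq "@explicit";                        # same
```

The parenthesised form behaves identically and is arguably kinder to the reader, which seems to be the point made in the reply above.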
