http://www.perlmonks.org?node_id=240972

Sean M. Burke has a nice little program in his book Perl & LWP that uses AltaVista to check the popularity of words. I modified it slightly to use Google instead. Here it is:
#!/usr/bin/perl
use warnings;
use strict;

# goolies: check popularity of words on Google. modified from:
# Example code from Chapter 2 of /Perl and LWP/ by Sean M. Burke
# http://www.oreilly.com/catalog/perllwp/
# sburke@cpan.org

use LWP;
use URI::Escape;

die "Usage: goolies word1 word2\n" unless @ARGV;

foreach my $word (@ARGV) {
    next unless length $word;
    my $url = 'http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q='
        . uri_escape($word) . '&btnG=Google+Search';
    my ($content, $status, $is_success) = do_GET($url);
    if (!$is_success) {
        print "Sorry, failed: $status\n";
    } elsif ($content =~ m#of about <b>(\d+,?\d*,?\d*)</b>#) {
        print "$word: $1 matches\n";
    } else {
        print "$word: couldn't extract count due to some weird shit at $url\n";
    }
    sleep 2; # be nice to Google's servers!!!
}

##### subs #####
sub do_GET {
    my $browser = LWP::UserAgent->new();
    $browser->agent('Mozilla/4.76 [en] (Win98; U)'); # tricked you!
    my $response = $browser->get(@_);
    return ($response->content, $response->status_line, $response->is_success, $response)
        if wantarray;
    return unless $response->is_success;
    return $response->content;
}
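The count-extraction step can be tried offline, without hitting Google at all. The following is only a minimal sketch: `extract_count` is a hypothetical helper that is not part of the script above, and the sample HTML fragment is made up to imitate the "of about <b>N</b>" banner the script's regex expects, so it is not real Google output.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper (not in the original script): pull the hit count out of
# a results page and return it as a plain integer string, or undef if the
# banner is not found.
sub extract_count {
    my ($html) = @_;
    return unless defined $html;
    # Same pattern as the script above: up to three comma-separated digit groups.
    if ($html =~ m#of about <b>(\d+,?\d*,?\d*)</b>#) {
        (my $count = $1) =~ tr/,//d;    # "136,000" -> "136000"
        return $count;
    }
    return;
}

# Made-up fragment imitating the results banner; an assumption, not real output.
my $sample = 'Results <b>1</b> - <b>10</b> of about <b>136,000</b>.';
my $count  = extract_count($sample);
print defined $count ? "count: $count\n" : "no count found\n";
# prints "count: 136000"
```

Note that the pattern only allows three digit groups, so a count written with four groups (a billion or more) would fall through to the "couldn't extract" branch.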

Re: check words popularity using Google
by jasonk (Parson) on Mar 06, 2003 at 22:44 UTC

    You could make this much friendlier to Google's servers by using the SOAP interface rather than loading the whole page just to get one number. There are even some Perl interfaces to their SOAP system: Net::Google and DBD::google. DBD::google is a DBI wrapper around Net::Google that makes Google act like a database, so you can run SQL queries against it. It is seriously cool.

      Interesting, thanks for the suggestion. I haven't yet checked out the Google modules on CPAN, although I really must get round to doing it rather than just screen-scraping, as I do use Google pretty extensively in my work (journalist). In fact O'Reilly has a new book called Google Hacks which is coming out soon, and I believe quite a bit of the code in it will be in Perl.

      OK, here's a quick rewrite using SOAP::Lite instead of screen-scraping. Now I'm off to install Net::Google and DBD::google to see what they can offer!

      #!/usr/bin/perl
      use warnings;
      use strict;
      use SOAP::Lite;

      my $key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';

      die "Usage: goolies2 word1 word2\n" unless @ARGV;

      foreach my $query (@ARGV) {
          my $googleSearch = SOAP::Lite->service("file:GoogleSearch.wsdl");
          my $result = $googleSearch->doGoogleSearch($key, $query, 0, 10, "false", "", "false", "", "latin1", "latin1");
          print "$query returned about $result->{'estimatedTotalResultsCount'} results.\n";
      }

      __END__
      sample output:
      heather nova returned about 136000 results.
      sheryl crow returned about 238000 results.
      avril lavigne returned about 463000 results.