PerlMonks  

Monitor queries for new finds added to the Google index yesterday

by Scott7477 (Chaplain)
on Nov 03, 2006 at 16:56 UTC ( [id://582122]=perlquestion )

Scott7477 has asked for the wisdom of the Perl Monks concerning the following question:

I grabbed the following code from this hack that is excerpted from the book "Google Hacks."
    # goonow.pl
    # Feeds queries specified in a text file to Google, querying
    # for recent additions to the Google index. The script appends
    # to CSV files, one per query, creating them if they don't exist.
    # usage: perl goonow.pl [query_filename]

    # My Google API developer's key.
    my $google_key='insert key here';

    # Location of the GoogleSearch WSDL file.
    my $google_wdsl = "GoogleSearch.wsdl";

    use strict;
    use SOAP::Lite;
    use Time::JulianDay;

    $ARGV[0] or die "usage: perl goonow.pl [query_filename]\n";

    my $julian_date = int local_julian_day(time) - 2;

    my $google_search = SOAP::Lite->service("file:$google_wdsl");

    open QUERIES, $ARGV[0] or die "Couldn't read $ARGV[0]: $!";

    while (my $query = <QUERIES>) {
      chomp $query;
      warn "Searching Google for $query\n";
      $query .= " daterange:$julian_date-$julian_date";
      (my $outfile = $query) =~ s/\W/_/g;
      open (OUT, ">> $outfile.csv")
        or die "Couldn't open $outfile.csv: $!\n";
      my $results = $google_search ->
        doGoogleSearch(
          $google_key, $query, 0, 10, "false", "", "false",
          "", "latin1", "latin1"
        );
      # A successful call returns a hash reference; anything else is a failure.
      # (The earlier test, ($results => ""), was always false and never died.)
      unless (ref $results) {die "The soap call failed! \n"}
      foreach (@{$results->{'resultElements'}}) {
        print OUT '"' . join('","', (
          map {
            s!\n!!g;    # drop spurious newlines
            s!<.+?>!!g; # drop all HTML tags
            s!"!""!g;   # double escape " marks
            $_;
          } @$_{'title','URL','snippet'}
        ) ) . "\"\n";
      }
    }
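
For anyone checking what the script actually sends to Google: the daterange: operator takes Julian day numbers, which goonow.pl gets from Time::JulianDay. Below is a minimal sketch that computes the same kind of number with the standard integer formula and core Perl only. The helper names julian_day_number and daterange_query are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: Julian Day Number for a Gregorian calendar date,
# using the standard integer formula (no non-core modules needed).
sub julian_day_number {
    my ($year, $month, $day) = @_;
    my $adj = int((14 - $month) / 12);   # 1 for January/February, else 0
    my $y   = $year + 4800 - $adj;
    my $m   = $month + 12 * $adj - 3;
    return $day + int((153 * $m + 2) / 5) + 365 * $y
         + int($y / 4) - int($y / 100) + int($y / 400) - 32045;
}

# Hypothetical helper: append a single-day daterange: clause to a query,
# in the same format goonow.pl builds.
sub daterange_query {
    my ($query, $jdn) = @_;
    return "$query daterange:$jdn-$jdn";
}

my $jdn = julian_day_number(2006, 11, 3);   # the date of the original post
print daterange_query('Windows Vista', $jdn), "\n";
# prints "Windows Vista daterange:2454043-2454043"
```

Printing the final query string like this before calling doGoogleSearch makes it easy to confirm that the date range is the one you expect.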

I am positive that my API key is correct, and I have the .pl file, the query file, and the WSDL file all in the same directory. When I run this code with the search terms "Windows Vista", for example, it runs as described and creates a results file, but the generated CSV file is empty.

I am stumped as to how to get this to work properly; I am running ActiveState Perl 5.8 on WinXP SP2. Any suggestions as to where I am going wrong here would be greatly appreciated.

Replies are listed 'Best First'.
Re: Monitor queries for new finds added to the Google index yesterday
by kwaping (Priest) on Nov 03, 2006 at 20:21 UTC
    After your line beginning with my $results = $google_search, I recommend adding the following:
    use Data::Dumper::Simple;
    print Dumper($results);
    That will be a big help in your debugging. (You may need to install Data::Dumper::Simple first.) Also, I recommend you consider using Text::CSV to create your CSV file.
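
If installing Data::Dumper::Simple is a hassle, the core Data::Dumper module gives much the same view. Here is a rough sketch of the idea. The $results structure is a hypothetical stand-in with only the keys the script reads (resultElements, title, URL, snippet), and result_count is a made-up helper, so treat it as an illustration rather than what Google actually returns.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the value returned by doGoogleSearch;
# only the keys that goonow.pl reads are included.
my $results = {
    resultElements => [
        {
            title   => 'Example <b>title</b>',
            URL     => 'http://example.com/',
            snippet => "An example\nsnippet",
        },
    ],
};

# Sort hash keys so the dump is stable from run to run.
$Data::Dumper::Sortkeys = 1;
print Dumper($results);

# Hypothetical helper: how many result elements came back?
# Returns 0 for undef, a non-hash, or a missing/empty resultElements list.
sub result_count {
    my ($r) = @_;
    return 0 unless ref($r) eq 'HASH';
    return 0 unless ref($r->{resultElements}) eq 'ARRAY';
    return scalar @{ $r->{resultElements} };
}

warn "No results came back\n" unless result_count($results);
```

An empty CSV file with no error message is what you would see if resultElements were missing or empty, so a count like this tells you whether the problem is in the search call or in the output loop.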

    ---
    It's all fine and dandy until someone has to look at the code.
Re: Monitor queries for new finds added to the Google index yesterday
by duckyd (Hermit) on Nov 03, 2006 at 22:05 UTC
    You aren't checking to see if your call worked. You should be doing something like:
    my $result = $google_search->doGoogleSearch(...);
    if( $result->fault ){
        die "Oops, our soap call failed: ".$result->faultstring;
    }
    # No fault, do stuff with your $result
      Good point. When I was working on this code before posting it to PM, I had changed the SOAP::Lite call to include its trace functionality as follows:

      use SOAP::Lite +trace;

      In order to get that to work, I had to comment out the "use strict;" line. Doing the above showed me that I had miskeyed my Google API key. I fixed that and then removed the "+trace" from the SOAP::Lite call. I've updated the code to include a line to flag failure in the SOAP::Lite call along the lines of your suggestion.
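
A note on why strict got in the way: a bare +trace is an unquoted bareword, which "strict subs" rejects at compile time, whereas a word followed by => is quoted automatically. The sketch below shows that difference with plain string evals and no SOAP::Lite at all; compiles_under_strict is a made-up helper. On that basis I would expect `use SOAP::Lite +trace => 'debug';` to work with "use strict;" left in place, but that part is an assumption I have not tested against this script.

```perl
use strict;
use warnings;

# Hypothetical helper: does this argument list compile under "use strict"?
# Returns 1 if it compiles, 0 if the eval fails.
sub compiles_under_strict {
    my ($args) = @_;
    my $ok = eval "use strict; my \@list = ($args); 1";
    return $ok ? 1 : 0;
}

# Bare word after unary plus: rejected by strict subs.
print "bare +trace: ", compiles_under_strict(q{+trace}), "\n";

# Same word in front of a fat comma: quoted automatically, so it compiles.
print "+trace => 'debug': ", compiles_under_strict(q{+trace => 'debug'}), "\n";
```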
