
Monitor queries for new finds added to the Google index yesterday

by Scott7477 (Chaplain)
on Nov 03, 2006 at 16:56 UTC (#582122=perlquestion)

Scott7477 has asked for the wisdom of the Perl Monks concerning the following question:

I grabbed the following code from this hack that is excerpted from the book "Google Hacks."
#
# Feeds queries specified in a text file to Google, querying
# for recent additions to the Google index. The script appends
# to CSV files, one per query, creating them if they don't exist.
# usage: perl [query_filename]

# My Google API developer's key.
my $google_key='insert key here';

# Location of the GoogleSearch WSDL file.
my $google_wdsl = "GoogleSearch.wsdl";

use strict;
use SOAP::Lite;
use Time::JulianDay;

$ARGV[0] or die "usage: perl [query_filename]\n";

# Julian day to search, two days before today (local time).
my $julian_date = int local_julian_day(time) - 2;

my $google_search = SOAP::Lite->service("file:$google_wdsl");

open QUERIES, $ARGV[0] or die "Couldn't read $ARGV[0]: $!";

while (my $query = <QUERIES>) {
  chomp $query;
  warn "Searching Google for $query\n";
  $query .= " daterange:$julian_date-$julian_date";

  # Output file name: the query with non-word characters replaced by "_".
  (my $outfile = $query) =~ s/\W/_/g;
  open (OUT, ">> $outfile.csv")
    or die "Couldn't open $outfile.csv: $!\n";

  my $results = $google_search ->
    doGoogleSearch(
      $google_key, $query, 0, 10, "false", "", "false",
      "", "latin1", "latin1"
    );

  # Stop if the call returned nothing (for example, on a SOAP fault).
  defined $results or die "The soap call failed!\n";

  foreach (@{$results->{'resultElements'}}) {
    print OUT '"' . join('","', (
      map {
        s!\n!!g;      # drop spurious newlines
        s!<.+?>!!g;   # drop all HTML tags
        s!"!""!g;     # double escape " marks
        $_;
      } @$_{'title','URL','snippet'}
    ) ) . "\"\n";
  }
  close OUT;
}

I am positive that my API key is correct, and the .pl file, the query file, and the WSDL file are all in the same directory. When I run this code with the search term "Windows Vista", for example, it runs as described and creates a results file, but the generated CSV file is empty.

I am stumped as to how to get this to work properly; I am running AS Perl 5.8 on WinXP SP2. Any suggestions as to where I am going wrong here would be greatly appreciated.
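When a daterange: search comes back empty, one thing worth ruling out is the Julian day number itself. The following is a minimal sketch using only core Perl; it assumes UTC rather than the local time zone that local_julian_day uses, and the helper names julian_day_number and daterange_term are made up for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Julian day number for the UTC calendar date containing $epoch.
# 1970-01-01 (epoch 0) is Julian day 2440588.
sub julian_day_number {
    my ($epoch) = @_;
    return int($epoch / 86400) + 2440588;
}

# Build the daterange: term for a single day, $days_back days before $epoch.
sub daterange_term {
    my ($epoch, $days_back) = @_;
    my $jd = julian_day_number($epoch) - $days_back;
    return "daterange:$jd-$jd";
}

# Show the term that would be appended to a query run right now.
print daterange_term(time, 2), "\n";
```

Printing the term next to each query makes it easy to confirm that the date being searched is the one intended.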

Replies are listed 'Best First'.
Re: Monitor queries for new finds added to the Google index yesterday
by kwaping (Priest) on Nov 03, 2006 at 20:21 UTC
    After your line beginning with my $results = $google_search, I recommend adding the following:
    use Data::Dumper::Simple; print Dumper($results);
    That will be a big help in your debugging. (You may need to install Data::Dumper::Simple first.) Also, I recommend you consider using Text::CSV to create your CSV file.

    It's all fine and dandy until someone has to look at the code.
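As a rough illustration of the dump-and-inspect suggestion, here is a sketch that uses the core Data::Dumper module in place of Data::Dumper::Simple, so nothing extra needs installing. The describe_results helper and the $fake structure are hypothetical stand-ins; a real response from doGoogleSearch may carry more fields.

```perl
use strict;
use warnings;
use Data::Dumper;
use Scalar::Util qw(reftype);

# Summarize what came back from the search call: whether anything was
# returned at all, how many result elements it holds, and a full dump.
sub describe_results {
    my ($results) = @_;
    return "no result returned\n" unless defined $results;

    # reftype sees through blessing, in case the response is an object.
    my $elements = (reftype($results) || '') eq 'HASH'
        ? $results->{resultElements}
        : undef;
    my $count = (reftype($elements) || '') eq 'ARRAY' ? scalar @$elements : 0;

    local $Data::Dumper::Sortkeys = 1;   # stable key order in the dump
    return "$count result element(s)\n" . Dumper($results);
}

# Hypothetical stand-in for a real response, for illustration only.
my $fake = {
    resultElements => [
        { title => 'Example', URL => 'http://example.com/', snippet => 'text' },
    ],
};
print describe_results($fake);
```

Calling describe_results($results) right after the doGoogleSearch line would show whether the CSV is empty because the call failed or because the search found nothing.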
Re: Monitor queries for new finds added to the Google index yesterday
by duckyd (Hermit) on Nov 03, 2006 at 22:05 UTC
    You aren't checking to see if your call worked. You should be doing something like:
    my $result = $google_search->doGoogleSearch(...);
    if( $result->fault ){
        die "Oops, our soap call failed: ".$result->faultstring;
    }
    # No fault, do stuff with your $result
      Good point. When I was working on this code before posting it to PM, I had changed the SOAP::Lite call to include its trace functionality as follows:

      use SOAP::Lite +trace;

      In order to get that to work, I had to comment out the "use strict;" line. Doing the above showed me that I had miskeyed my Google API key. I fixed that and then removed the "+trace" from the SOAP::Lite call. I've updated the code to include a line to flag failure in the SOAP::Lite call along the lines of your suggestion.
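One way to flag a failed call without testing every return value is SOAP::Lite's on_fault callback. The sketch below is only an illustration: soap_fault_handler is a made-up name, and whether the handler fires for every kind of failure with a WSDL-generated stub is an assumption worth checking against the SOAP::Lite documentation.

```perl
use strict;
use warnings;

# Fault handler in the shape SOAP::Lite's on_fault callback expects:
# it receives the client object and either a response object (SOAP
# fault) or a non-reference value (transport-level failure).
sub soap_fault_handler {
    my ($soap, $res) = @_;
    die "SOAP call failed: ",
        (ref $res ? $res->faultstring : $soap->transport->status), "\n";
}

# Registration, assuming $google_search is the SOAP::Lite object
# created in the script from the question:
#   $google_search->on_fault(\&soap_fault_handler);
```

With a handler like this registered, a rejected key would stop the script with the server's fault string instead of leaving an empty CSV file behind.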

Node Type: perlquestion [id://582122]
Approved by Old_Gray_Bear