PerlMonks  

Perl and Google API

by doonyakka (Beadle)
on May 18, 2002 at 00:36 UTC (#167447=perlquestion)
doonyakka has asked for the wisdom of the Perl Monks concerning the following question:

Hi all. I'm new to this and I've probably made some elementary error, so please bear with me :)

I want to read a wordlist from a text file and pass each word as a query to Google API. If the query returns more than 100 hits, then I want to write that word to a new file.

The relevant code snippet is:
    foreach (<FH>) {
        chomp;
        my $googleSearch = SOAP::Lite -> service("file:GoogleSearch.wsdl");
        my $result = $googleSearch -> doGoogleSearch($key, $_, 0, 10, "false", "", "false", "lang_es", "latin1", "latin1");
        my $hitCount = qq{$result->{'estimatedTotalResultsCount'}};
        print "$_ returns about $hitCount hits.\n";
        if ($hitCount > 200) {
            print OUTFILE "$_ returns about $hitCount hits.\n";
        }
    }
The problem I'm having is that the first word is queried and I get a $hitCount, but after that it's "500 Internal Server Error". Any idea what's causing this and how to solve it?

Thanks to all the Monks.

Cheers,
doonyakka

Re: Perl and Google API
by mrbbking (Hermit) on May 18, 2002 at 01:22 UTC
    Ideas:
    • Do they have any kind of a throttle -- that is, allowing only a certain number of requests per time period? I know there's 1,000 per day, but maybe there's something at a lower level. Some sort of delay to prevent accidental overloads, maybe.
    • Is there something strange about the second word in your list? Are you sure that it is what you think it is? Maybe throw in a print statement to verify.
    • What happens if you delete the first word from your list and start the process with the second word? Does the first connection still succeed and the second still fail?
    (The HTTP 500 is coming from Google, right, and not from your own program?)

    update: Clarified question about the HTTP 500.
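One way to act on the "throw in a print statement" idea is to print each word with anything outside printable ASCII made visible, so a stray carriage return or an accented character in the wordlist stands out. This is only a sketch: `show_bytes` is a hypothetical helper name, and the sample word is made up.

```perl
use strict;
use warnings;

# Render a string with every character outside printable ASCII shown as \xNN,
# so stray carriage returns or accented characters become visible.
sub show_bytes {
    my ($s) = @_;
    (my $out = $s) =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord $1)/ge;
    return $out;
}

# Made-up sample: a Latin-1 word with a trailing carriage return.
print show_bytes("ni\xF1o\r"), "\n";
```

Calling this on `$_` right after `chomp` in the loop would show exactly what is about to be sent as the query.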

Re: Perl and Google API
by Anonymous Monk on May 18, 2002 at 03:12 UTC
    Is it just me, or is $hitCount a funny name for a variable?
      Well, it is descriptive. Perhaps $matchCount would be better though.

      Who says that programmers can't work in the Marketing Department?
      Or is that who says that Marketing people can't program?
Re: Perl and Google API
by Mr. Muskrat (Abbot) on May 18, 2002 at 03:21 UTC
    Well, you haven't shown us your code that pertains to CGI. So we can't possibly tell you why you are getting a 500 error.
Re: Perl and Google API
by doonyakka (Beadle) on May 18, 2002 at 11:34 UTC
    Thanks for your replies so far.

    To mrbbking:

    > Do they have any kind of throttle?

    The Google API FAQ doesn't mention one, only the 1,000 max queries. I've tried sleeping for 5-10 secs, with no luck.

    > Is there something strange about the second word in your list? What happens if you delete the first word from your list and start the process with the second word? Does the first connection still succeed and the second still fail?

    It looks like this may be the problem. I took out the first word and immediately got the 500 Error. I've tried messing around with the input encoding parameter, as the second word has got an '' in it, but nothing works, even UTF-8.
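To rule encoding in or out, one option is to convert the word explicitly before sending it rather than relying on the encoding parameters alone. The sketch below assumes the wordlist file is Latin-1 and uses the `Encode` module (shipped with Perl 5.8 and later); `latin1_to_utf8` and the sample word are hypothetical, and nothing here is claimed to fix the 500 error.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert a Latin-1 byte string into UTF-8 bytes.
sub latin1_to_utf8 {
    my ($bytes) = @_;
    return encode('UTF-8', decode('ISO-8859-1', $bytes));
}

my $word = "ni\xF1o";    # hypothetical example word, Latin-1 encoded
my $utf8 = latin1_to_utf8($word);
printf "%d bytes in, %d bytes out\n", length $word, length $utf8;
```

If the converted word fails in the same way as the original, the encoding of the input is probably not the cause.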

    > The 500 is from Google, right?

    Yes.

    To Mr. Muskrat:

    > Well, you haven't shown us your code that pertains to CGI. So we can't possibly tell you why you're getting a 500 error.

    I don't have any CGI code. Here's the code in full (with the Google API key ($key) changed to 0's).
    #!/usr/bin/perl
    use SOAP::Lite;
    use strict;

    my $key='00000000000000000000000000';
    my $fh = "wordList.txt";
    open FH, "$fh" or die "Can't open: $!";
    open OUTFILE, ">goodWords.txt";

    foreach (<FH>) {
        chomp;
        print $_;
        my $googleSearch = SOAP::Lite -> service("file:GoogleSearch.wsdl");
        my $result = $googleSearch -> doGoogleSearch($key, $_, 0, 10, "false", "", "false", "lang_es", "latin1", "latin1");
        my $hitCount = qq{$result->{'estimatedTotalResultsCount'}};
        print " returns about $hitCount hits.\n";
        if ($hitCount > 200) {
            print OUTFILE "$_ returns about $hitCount hits.\n";
        }
    }
    close OUTFILE;
    close FH;
    That's all.
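For debugging, the filtering logic in the loop above can be separated from the network call, so a failing lookup skips one word instead of ending the run. This is a sketch, not a drop-in replacement: `words_above` is a hypothetical helper, the lookup is passed in as a code ref standing in for the `doGoogleSearch` call, and the counts in `%fake` are invented.

```perl
use strict;
use warnings;

# Keep the words whose hit count exceeds $threshold.
# $count_for is a code ref: word in, estimated hit count out. It is a
# hypothetical stand-in for the doGoogleSearch call in the post.
sub words_above {
    my ($words, $threshold, $count_for) = @_;
    my @kept;
    for my $word (@$words) {
        my $hits = eval { $count_for->($word) };
        if (!defined $hits) {
            warn "lookup failed for '$word': " . ($@ || "no count returned\n");
            next;    # skip this word instead of aborting the whole run
        }
        push @kept, [$word, $hits] if $hits > $threshold;
    }
    return @kept;
}

# Stub lookup with made-up counts, for illustration only.
my %fake = (casa => 5000, xyzzy => 3);
my @good = words_above([qw(casa xyzzy)], 200, sub { $fake{ $_[0] } });
print "$_->[0] returns about $_->[1] hits.\n" for @good;
```

In the real script the code ref would wrap the SOAP call and return `$result->{'estimatedTotalResultsCount'}`.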

    BTW, what's weird about $hitCount, and why is $matchCount better? I'm a total newbie so don't know about these things. :)

    As ever, thanks to all the Monks (especially those who've helped me out here).

    doonyakka


    Update: this particular problem is solved, with the very kind help of dree. The problem was not with the code, but with SOAP::Lite v0.52. dree had SOAP::Lite v0.46 and had no problems running the program, and when he upgraded to v0.52 it stopped working (i.e., he got the dreaded 500 error). So I installed 0.46 and hey presto! It's working fine now. Thanks dree!
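Since the behaviour here depended on which SOAP::Lite release was installed, it can help to check the installed version before debugging anything else. A minimal sketch follows; `module_version` is a hypothetical helper that works for any module name and returns undef when the module cannot be loaded.

```perl
use strict;
use warnings;

# Return the installed version of a module, or undef if it cannot be loaded.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # SOAP::Lite -> SOAP/Lite.pm
    return undef unless eval { require $file; 1 };
    return $module->VERSION;
}

my $v = module_version('SOAP::Lite');
print defined $v ? "SOAP::Lite $v\n" : "SOAP::Lite is not installed\n";
```

Comparing that number against the 0.46 and 0.52 releases mentioned above shows which side of the change a given machine is on.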

      > BTW, what's weird about $hitCount, and why is $matchCount better? I'm a total newbie so don't know about these things. :)
      The "$" looks like an "S". Do the switch, and see if it makes a difference in perception.... ;-)
        hehe...of course...that was part of the attraction, my first perl pun ;)
