PerlMonks  

Google local search regex

by danambroseUK (Beadle)
on Sep 30, 2005 at 13:06 UTC ( #496403=perlquestion )
danambroseUK has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I would like to query and use the data returned from a Google Local search. For those who don't know Google Local search, an example can be found here:

http://www.google.co.uk/local?hl=en&lr=&q=pizza&near=Colchester+CO3+9FA&sa=X&oi=localr

Using Lynx, I was able to get a text dump of the page:

[1]Google Local
_______________________ _________________________________ Search
What Where
[_] Remember this location
Local Search within: [2]1 mile - [3]5 miles - 15 miles - [4]45 miles
Show only: [5]Pizza Delivery & Takeaway - [6]Takeaway Food
A. [7]1st & Always Sam's Pizzeria 01206 575555 5 Crouch St Colchester, CO3 3EN 1.8 mi NE - [8]Directions References: [9]the-business-search.co.uk - [10]2 more
B. [11]Pizza Italia Plus 01206 545500 34a Mersea Rd Colchester, CO2 7ES 1.9 mi E - [12]Directions
C. [13]Pizza Express plc 01206 760680 1 St. Runwald Street Colchester, CO1 1AG 1.9 mi NE - [14]Directions References: [15]bbc.co.uk - [16]2 more
D. [17]Perfect Pizza 01206 546444 1 Middleborough Colchester, CO1 1QS 2.0 mi NE - [18]Directions References: [19]perfectpizza.co.uk
E. [20]Pizza Hut (UK) Ltd 01206 574478 48-50 High St Colchester, CO1 1DH 2.0 mi NE - [21]Directions
. . . etc . . .

What I would like is a regex to extract the name, telephone number, address and distance of each record returned.

Any help would be really appreciated.

Dan

Re: Google local search regex
by marto (Bishop) on Sep 30, 2005 at 13:20 UTC
    Hi danambroseUK,

    If you have not done so already, you may want to read the Google terms of service:
    "You may not take the results from a Google search and reformat and display them"

    Martin
      I didn't read that, no!

      Hypothetically speaking, how would one achieve the required result? :)

      Dan
        One would sign up for an API Key (after reading and agreeing to the Ts & Cs):

        "To access the Google Web APIs service, you must create a Google Account and obtain a license key. Your Google Account and license key entitle you to 1,000 automated queries per day."

        Then take a look at the CPAN results for Google, such as WWW::Search::Google, and perhaps start from there.

        Martin
Re: Google local search regex
by spatterson (Monk) on Sep 30, 2005 at 13:47 UTC
Re: Google local search regex
by puploki (Hermit) on Sep 30, 2005 at 13:48 UTC

    As marto has already said, you can't parse their web page directly. However, you can sign up for an API key, which allows you to make around 1000 queries a day (I think) for your own custom applications. It works with SOAP::Lite, apparently.

      I have investigated this route, but Google Local search is not part of the Google search API, which is why I am trying to get at the data this way...

      Dan
Re: Google local search regex
by Cody Pendant (Prior) on Sep 30, 2005 at 23:53 UTC
    All legal/moral questions aside, the quickest way would be to download the source, by getting the page with LWP::UserAgent and grabbing the right bits.

    A simple regex, m|<nobr>([\d ]+)</nobr>|, will do that for the phone numbers, for instance.
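A minimal sketch of that phone-number regex used in list context follows. The HTML fragment is made up for illustration, since the real page markup isn't shown in this thread; only the <nobr> wrapping around the digits is taken from the regex above.

```perl
use strict;
use warnings;

# Hypothetical fragment: an assumption, not the actual Google Local markup.
my $html = '<td><nobr>01206 575555</nobr></td>'
         . '<td><nobr>01206 545500</nobr></td>';

# With /g in list context, the match returns every captured group.
my @phones = $html =~ m|<nobr>([\d ]+)</nobr>|g;

print "$_\n" for @phones;
```

The character class [\d ] only allows digits and spaces, so a <nobr> element holding anything else is skipped.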

    But Google will send back a 403 unless you tell LWP::UserAgent to pretend to be a regular browser. That is an exercise for the reader and, of course, brings us back to the legal/moral thing.
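As for the Lynx text dump quoted in the question, something along these lines might pull out the four fields. This is only a sketch under assumptions: the sub name parse_records is made up, the phone pattern assumes the "01206 575555" shape (five digits, a space, six digits), and each record is assumed to end with a distance such as "1.8 mi NE" followed by a Directions link. Records that differ from the sample will need a looser pattern.

```perl
use strict;
use warnings;

# Trimmed copy of the Lynx dump from the question (records A to C),
# one record per line.
my $dump = <<'END';
A. [7]1st & Always Sam's Pizzeria 01206 575555 5 Crouch St Colchester, CO3 3EN 1.8 mi NE - [8]Directions References: [9]the-business-search.co.uk - [10]2 more
B. [11]Pizza Italia Plus 01206 545500 34a Mersea Rd Colchester, CO2 7ES 1.9 mi E - [12]Directions
C. [13]Pizza Express plc 01206 760680 1 St. Runwald Street Colchester, CO1 1AG 1.9 mi NE - [14]Directions References: [15]bbc.co.uk - [16]2 more
END

# Hypothetical helper: returns a list of hashrefs, one per record found.
sub parse_records {
    my ($text) = @_;
    my @records;
    while ($text =~ m{
        \b [A-Z] \. \s+ \[\d+\]             # record letter and Lynx link number
        (.+?) \s+                           # name (shortest possible)
        (0\d{4} \s \d{6}) \s+               # phone, assumed 5 digits + 6 digits
        (.+?) \s+                           # address, up to the distance
        (\d+\.\d+ \s+ mi \s+ [NSEW]{1,2})   # distance and compass bearing
        \s+ - \s+ \[\d+\] Directions        # trailing Directions link
    }gx) {
        push @records,
            { name => $1, phone => $2, address => $3, distance => $4 };
    }
    return @records;
}

my @records = parse_records($dump);
print join(' | ', @{$_}{qw(name phone address distance)}), "\n"
    for @records;
```

The phone number and the distance act as anchors, which is what lets the non-greedy name and address groups stop in the right place.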



    ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
    =~y~b-v~a-z~s; print
