PerlMonks  

how do I extract contact data from websites?

by Meisamhe
on Jul 13, 2002 at 18:22 UTC (#181511=perlquestion)

Meisamhe has asked for the wisdom of the Perl Monks concerning the following question:

This node falls below the community's threshold of quality. You may see it by logging in.

Replies are listed 'Best First'.
Re: how do I extract contact data from websites?
by DamnDirtyApe (Curate) on Jul 13, 2002 at 20:05 UTC

    It sounds like you want Net::Whois. Here's an example that retrieves the contacts listed with the WHOIS server.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Net::Whois;

    my $domain = shift @ARGV or die "I need a domain to check!\n";

    # Query the WHOIS server for the domain record.
    my $w = Net::Whois::Domain->new($domain)
        or die "Can't connect to Whois server\n";
    die "No match for $domain\n" unless $w->ok;

    # contacts() returns a hashref: contact type => arrayref of text lines.
    if ( my $contact_hashref = $w->contacts ) {
        foreach my $contact ( keys %$contact_hashref ) {
            print join "\n\t", $contact, @{ $contact_hashref->{$contact} };
            print "\n\n";
        }
    }

    And here's the program in action:

    $ perl test.pl example.com
    ADMINISTRATIVE
        Internet Assigned Numbers Authority (IANA) iana@IANA.ORG
        4676 Admiralty Way, Suite 330
        Marina del Rey, CA 90292
        US
        310-823-9358 Fax- 310-823-8649

    TECHNICAL
        Internet Assigned Numbers Authority (IANA) iana@IANA.ORG
        4676 Admiralty Way, Suite 330
        Marina del Rey, CA 90292
        US
        310-823-9358 Fax- 310-823-8649

    Hope that helps.


    _______________
    D a m n D i r t y A p e
Re: how do I extract contact data from websites?
by beretboy (Chaplain) on Jul 13, 2002 at 18:51 UTC
    Hello Meisamhe, and let me be the first to welcome you to the Monastery. Could you please tell us more about which websites you are trying to extract information from and what code, if any, you have so far?

    "Sanity is the playground of the unimaginative" -Unknown
Re: how do I extract contact data from websites?
by schumi (Hermit) on Jul 13, 2002 at 19:58 UTC
    <mumble>'S past half nine in the evening an' 'e talks about morning... argh</mumble>

    erm...

    Hi there, and welcome to the Monastery!
    You don't really say exactly what kind of data you are looking for. So while we are waiting for you to go into further details, I'll just quickly point out two directions in which you might go:

    • You have a website and want to know data about your visitors. While you could just look this up in your logfiles, you could also look at CGI which helps you to extract things like referer and username if they have to login etc. It seems to me that this is not what you want, though.
    • You want to visit a website and get the contact details of the website's admins. For this you might want to look at Net::Whois.

    Perhaps this isn't at all what you want. In that case forgive me for taking your time and pointing you into the wrong directions.
    :-)

    --cs

    There are nights when the wolves are silent and only the moon howls. - George Carlin

Re: how do I extract contact data from websites?
by michellem (Friar) on Jul 13, 2002 at 19:49 UTC
    Hi Meisamhe,

    Welcome!

    I'm going to be presumptuous, and assume that you mean one of two things: having a form on your web site, which has contact information that you want to then get locally, or having a spider-like thing that extracts specific kinds of data (addresses, phone numbers, email) from many websites. If you are talking about the latter, I can't help too much, because I haven't done that sort of thing (although I could pretty easily figure out how - but I'm sure many monks around here would be better).

    If it's the former:
    What you need in that case is a CGI script (there are a few out there, but they aren't hard to write either; that's how I started on the Perl journey) that takes the data from the form and either drops it into a file (maybe delimited) or emails it to you. Is that what you have in mind? With the CGI module it's a very easy sort of script to write, and I could certainly post mine if you are interested.
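
    For illustration only (this is not the script mentioned above), here is a minimal sketch of a form handler that appends each submission to a tab-delimited file. The field names (name, email, phone), the output path and the helper names format_record and save_record are all assumptions; change them to match your form.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use Fcntl qw(:flock);

# Assumed form fields and output file; adjust both to your own form.
my @FIELDS  = qw(name email phone);
my $OUTFILE = '/tmp/contacts.txt';

# Build one tab-delimited record from the submitted parameters.
sub format_record {
    my ($q) = @_;
    return join "\t", map {
        my $v = $q->param($_);
        $v = '' unless defined $v;
        $v =~ s/[\t\r\n]+/ /g;    # keep each record on a single line
        $v;
    } @FIELDS;
}

# Append the record, locking the file so concurrent requests don't interleave.
sub save_record {
    my ( $file, $record ) = @_;
    open my $fh, '>>', $file or die "Can't open $file: $!";
    flock $fh, LOCK_EX or die "Can't lock $file: $!";
    print {$fh} "$record\n";
    close $fh or die "Can't close $file: $!";
}

my $q = CGI->new;
save_record( $OUTFILE, format_record($q) );
print $q->header('text/plain'), "Thanks, your details were saved.\n";
```

    Stripping tabs and newlines from each value is what keeps the file parseable later with a plain split on tabs.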

      If you figure out how to extract addresses, phone numbers and emails from websites, then please don't tell people who ask. Unless you think that the world needs more spam.
Re: how do I extract contact data from websites?
by Beatnik (Parson) on Jul 14, 2002 at 09:45 UTC
    I'm going out on a limb here and assume you don't want that Whois info everybody is talking aboot. Best way to go about it then is to use any of the HTML::Parser based modules (HTML::TokeParser, HTML::SimpleParse, HTML::Tree, etc) together with LWP::Simple/LWP::UserAgent. If you provide us with more, solid details on what you wanna do, I'm sure some of us will cook up something nice for you.
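
    As a rough sketch of that approach, assuming HTML::TokeParser is installed: the snippet below pulls every link (href plus link text) out of a chunk of HTML. The helper name extract_links and the sample page are made up for the example. To work on a live page you would fetch the HTML first, for instance with get($url) from LWP::Simple, and pass the result in.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Return a list of [href, link text] pairs, one per <a href="..."> in $html.
sub extract_links {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new( \$html )
        or die "Can't parse HTML: $!";
    my @links;
    while ( my $tag = $parser->get_tag('a') ) {
        my $href = $tag->[1]{href};    # $tag->[1] is the attribute hashref
        next unless defined $href;     # skip anchors without an href
        my $text = $parser->get_trimmed_text('/a');
        push @links, [ $href, $text ];
    }
    return @links;
}

# Made-up sample page standing in for a fetched document.
my $html = <<'HTML';
<html><body>
<p>See the <a href="/about.html">About page</a> or
<a href="/staff.html">our  staff
list</a>.</p>
<a name="top">anchor without href</a>
</body></html>
HTML

for my $link ( extract_links($html) ) {
    printf "%-15s %s\n", @$link;
}
```

    The same loop works for other elements: change the tag name passed to get_tag and pick the attributes you care about.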

    Greetz
    Beatnik
    ...Perl is like sex: if you're doing it wrong, there's no fun to it.
    A reply falls below the community's threshold of quality. You may see it by logging in.
