Web scraping

by alexgrimmy (Initiate)
on May 08, 2015 at 01:58 UTC (#1126047)

alexgrimmy has asked for the wisdom of the Perl Monks concerning the following question:

I didn't find it easy to export my contact information from LinkedIn, so I started looking at Mojo::UserAgent for ways to "scrape" my contact info off LinkedIn. To put it mildly, I'm failing miserably. I can get the transactor to pull the default page, but when I try to post my login information I have no idea why what I see returned is different from the "view source" in a browser. Any advice is greatly appreciated.

Replies are listed 'Best First'.
Re: Web scraping
by Your Mother (Archbishop) on May 08, 2015 at 02:19 UTC

    (LinkedIn) User Agreement

    8.2. Don'ts. You agree that you will not:

    • …Use manual or automated software, devices, scripts robots, other means or processes to access, “scrape,” “crawl” or “spider” the Services or any related data or information;…

    I am guessing your actual problem lies with JavaScript stuff regarding their sessions, in which case you would need a JS-aware agent like WWW::Mechanize::Firefox or WWW::Selenium, but it could be as simple as changing the name of the UserAgent so it’s not a flagged bot/agent name. Still, you’re not supposed to do this here, and I personally wouldn’t help you because it can cast Perl and its fans in a bad light, as poor Netizens.

Re: Web scraping
by Albannach (Monsignor) on May 08, 2015 at 05:03 UTC
Re: Web scraping
by Gangabass (Vicar) on May 08, 2015 at 03:47 UTC

Node Type: perlquestion [id://1126047]
Approved by Albannach