Galdor has asked for the wisdom of the Perl Monks concerning the following question:

I've been given some web scraping tasks. I found Watir, but I would rather build on my existing Perl experience than learn Ruby from scratch. I would like the solution to integrate with Chrome for browser automation, and with a Perl framework such as Mojolicious. Any tips to get me started?

Replies are listed 'Best First'.
Re: Perl equivalent of Watir -- browser automation and web scraping
by Discipulus (Abbot) on Oct 15, 2021 at 11:03 UTC
    Hello Galdor,

    Browser automation can be done with WWW::Mechanize::Chrome; see recent posts about it and search for Corion's examples here at PerlMonks.

    Then you can use Mojo::DOM (it supports CSS selectors!) to parse a document.
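
    As a minimal sketch of the CSS selector support mentioned above: the HTML snippet, the selector and the helper name extract_hits are made up for illustration; in a real scraper the HTML would be the page source handed back by the browser automation layer.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical markup standing in for a fetched page.
my $html = <<'HTML';
<ul id="results">
  <li class="hit"><a href="/node/1">First post</a></li>
  <li class="hit"><a href="/node/2">Second post</a></li>
  <li class="ad"><a href="/promo">Sponsored</a></li>
</ul>
HTML

# Hypothetical helper: collect title and href of every link inside li.hit.
sub extract_hits {
    my ($dom) = @_;
    return $dom->find('li.hit > a')
               ->map(sub { return { title => $_->text, href => $_->attr('href') } })
               ->to_array;
}

my $hits = extract_hits(Mojo::DOM->new($html));
printf "%s -> %s\n", $_->{title}, $_->{href} for @$hits;
```

    The selector skips the li.ad entry, so only the two "hit" links are returned.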

    Also of interest: Web::Scraper
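
    A rough sketch of what Web::Scraper looks like on the same kind of task, assuming the HTML is already in a string (the markup, the key names and the variable names are invented for the example):

```perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical markup; absolute URLs are used so no base URI is needed.
my $html = <<'HTML';
<html><body>
<ul>
  <li class="post"><a href="http://example.com/a">Alpha</a></li>
  <li class="post"><a href="http://example.com/b">Beta</a></li>
</ul>
</body></html>
HTML

# Declare what to pull out: for each matching link, its text and its href.
my $post_scraper = scraper {
    process 'li.post a', 'posts[]' => { title => 'TEXT', url => '@href' };
};

# scrape() also accepts a URI object; here it is given the HTML string directly.
my $result = $post_scraper->scrape($html);

print "$_->{title}: $_->{url}\n" for @{ $result->{posts} };
```

    The result is a plain hash reference, so it can be handed to whatever the rest of the application uses (a Mojolicious controller, a database layer, and so on).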

    See some links of mine


    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

      For context you provided a reference to your links last time.

Re: Perl equivalent of Watir
by marto (Cardinal) on Oct 15, 2021 at 12:52 UTC
Re: Perl equivalent of Watir
by Takeshi Kovacs (Beadle) on Oct 15, 2021 at 13:03 UTC
    After reading the previous discussions listed in this thread it seems you like asking kind of the same question over and over again?

    Anyway: I don't know "Watir", never heard of it. But if I follow the non-link you gave, I immediately read "Powered by Selenium".

    That's straightforward. There are some Selenium modules on CPAN which should answer this question for good. The Selenium API should be "equivalent" regardless of the client language.

    See also Selenium (software) if you don't know what it's about.

Re: Perl equivalent of Watir
by perlfan (Vicar) on Oct 15, 2021 at 14:38 UTC
    The CPAN module for Playwright is created and maintained by the same CPAN author (TEODESIAN) who has maintained the Selenium driver for years; he vastly prefers Playwright these days. Firefox::Marionette is also a browser-specific module that talks directly to .. you guessed it :-). I've had great success using it with Web::Scraper when needing to access JavaScript-rendered pages.

    Playwright more or less generalizes talking to browsers via the "wire protocol" each of them maintains. Note: the Perl module uses the standard Node.js-based daemon as middleware, but gives it a nice Perl interface and other goodies. Another thing to note is that Playwright is from Microsoft and is generally considered very well done, with lots of great docs.
Re: Perl equivalent of Watir
by karlgoethebier (Abbot) on Oct 15, 2021 at 19:26 UTC