vpperl has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, this is my first post here, so I hope I'm following all the rules and not asking stupid questions :)

Guys, I'm trying to extract the content of a website which is implemented with DOM objects, so plain HTML parsing doesn't get me anywhere. For example, I'm trying to extract all player names from the following team:


Any suggestions, ideas, or examples of where to start? I have some Perl experience, but I haven't been able to find anything that helps with this. Thanks!

Replies are listed 'Best First'.
Re: Scrap shadow DOM website with Perl
by Corion (Pope) on Jan 20, 2021 at 13:45 UTC

    Instead of scraping the page, I would look at decoding the Websocket communication that this site has with the backend.

      That sounds interesting, especially since I would like to do this for more than one team. Is there a starting point for this idea?

        Personally, I like to look at the data first using the developer tools of Firefox or Chrome, and then implement the Websocket client using Mojo::UserAgent.
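        A minimal sketch of such a client, assuming Mojolicious is installed. The wss:// URL is not known from this thread and has to be copied from the browser's network tab; the sub name watch is made up for illustration. The site may also expect specific headers, cookies, or an initial subscribe message, which this sketch does not send.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Mojo::UserAgent;
use Mojo::IOLoop;

# Connect to a WebSocket endpoint and print every text frame, so that the
# message format can be studied before writing a real parser.
# "watch" is a hypothetical helper name, not part of Mojolicious.
sub watch {
    my ($url) = @_;
    my $ua = Mojo::UserAgent->new;

    $ua->websocket($url => sub {
        my ($ua, $tx) = @_;

        # The handshake can fail (wrong URL, missing headers or cookies).
        unless ($tx->is_websocket) {
            warn "WebSocket handshake failed\n";
            Mojo::IOLoop->stop;
            return;
        }

        # Each complete text message arrives as raw bytes.
        $tx->on(text => sub {
            my ($tx, $bytes) = @_;
            print "$bytes\n";
        });

        # Stop the event loop once the connection is closed.
        $tx->on(finish => sub {
            my ($tx, $code, $reason) = @_;
            print "closed: ", $code // 'unknown', "\n";
            Mojo::IOLoop->stop;
        });
    });

    # Block until one of the handlers above stops the loop.
    Mojo::IOLoop->start unless Mojo::IOLoop->is_running;
}

# Usage (the URL is a placeholder): perl watch.pl wss://example.invalid/socket
watch($ARGV[0]) if @ARGV;
```

        Once the frames are visible, the text handler can be swapped for the json event if the messages turn out to be JSON.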

Re: Scrap shadow DOM website with Perl
by marto (Cardinal) on Jan 20, 2021 at 13:40 UTC

    Do you mean player names as in the people playing the sport, or those playing the Fantasy game, listed in the 'Names' column on the bottom left-hand side?

      Hi, I mean the football players on the right hand side in the "football field".

        Using your browser's developer tools, refresh the page and search for one of the player surnames. This reveals how their API calls work and how the data is returned.
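        The same search can be repeated from Perl once the API URL is known. The sketch below assumes the endpoint returns JSON and needs no authentication; neither is confirmed in this thread. It uses only the core modules HTTP::Tiny and JSON::PP, and find_paths is a made-up helper that reports where in the decoded data a surname occurs, without assuming any particular field names.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(decode_json);

# Walk a decoded JSON structure and return the path of every scalar value
# that contains $needle (case-insensitive). Paths look like
# "{key}[index]{key}", mirroring Perl dereference syntax.
sub find_paths {
    my ($data, $needle, $path) = @_;
    $path //= '';
    my @hits;

    if (ref $data eq 'HASH') {
        push @hits, find_paths($data->{$_}, $needle, $path . "{$_}")
            for sort keys %$data;
    }
    elsif (ref $data eq 'ARRAY') {
        push @hits, find_paths($data->[$_], $needle, $path . "[$_]")
            for 0 .. $#$data;
    }
    elsif (!ref $data && defined $data) {
        # Plain string or number; JSON booleans are objects and are skipped.
        push @hits, $path if index(lc $data, lc $needle) >= 0;
    }
    return @hits;
}

# Usage (the URL is a placeholder): perl find_player.pl <api-url> <surname>
# HTTPS requests with HTTP::Tiny need IO::Socket::SSL to be installed.
# Non-ASCII surnames would need to be decoded from @ARGV first.
if (@ARGV == 2) {
    my ($url, $surname) = @ARGV;
    my $res = HTTP::Tiny->new->get($url);
    die "Request failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    print "$_\n" for find_paths(decode_json($res->{content}), $surname);
}
```

        The printed paths show which keys hold the player names, which is enough to write a direct lookup for other teams.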