Re: Script to scrap data

by marto (Cardinal)
on Dec 02, 2024 at 10:29 UTC


in reply to Script to scrap data

The site in question uses the DataTables JavaScript library to populate the table from a query to the back end. The query carries various parameters, including the columns you want to display, how many entries to display per page, etc. Using your browser's developer tools you can see this query being sent for processing (the entire payload and the URL it hits) and the results coming back as a JSON object. Personally I'd skip the HTML parsing method and automate the interaction with the back end query instead. I'd use Mojo::UserAgent, my go-to choice for web work/scraping: send a request with the parameters you want (copy/paste them from the developer tools once satisfied) and process the JSON result however you want. Super Search will show some interesting results.
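A minimal sketch of that approach, under a few assumptions. The URL is the one reported in the replies below. The fields `draw`, `start` and `length` are standard DataTables server-side parameters, but the real request will carry more (column definitions, search terms, possibly a nonce), so treat the payload here as a placeholder for what you copy from the developer tools. The `data` key in the response is the DataTables default and is also an assumption about this site. The helper names `build_payload` and `fetch_table` and the `--fetch` switch are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Endpoint seen in the browser's Network tab for this table.
my $URL = 'https://desmace.com/wp-admin/admin-ajax.php?action=get_wdtable&table_id=414';

# Build the form fields for the request. Only three standard DataTables
# fields are shown; paste the full list from Developer Tools here
# (assumption: the site needs more than these to answer).
sub build_payload {
    my (%opt) = @_;
    return {
        draw   => $opt{draw}   // 1,     # request counter echoed back by the server
        start  => $opt{start}  // 0,     # offset of the first row
        length => $opt{length} // 25,    # rows per page
    };
}

# POST the form and return the decoded JSON response.
sub fetch_table {
    my ($url, $payload) = @_;
    require Mojo::UserAgent;             # part of the Mojolicious distribution
    my $ua  = Mojo::UserAgent->new;
    my $res = $ua->post($url => form => $payload)->result;
    die 'Request failed: ' . $res->code . ' ' . $res->message . "\n"
        unless $res->is_success;
    return $res->json;                   # undef if the body is not valid JSON
}

# Only touch the network when asked to: perl fetch_table.pl --fetch
if (@ARGV && $ARGV[0] eq '--fetch') {
    my $json = fetch_table($URL, build_payload(length => 50));
    my $rows = $json->{data} // [];      # assumed DataTables-style "data" array
    printf "Got %d rows\n", scalar @$rows;
}
```

Sending the parameters as a form (`form => $payload`) is what matters here: the endpoint is being POSTed to, so the same URL without a body is unlikely to return anything useful.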

Replies are listed 'Best First'.
Re^2: Script to scrap data
by harangzsolt33 (Deacon) on Dec 02, 2024 at 13:25 UTC
    That was my first thought. However, when I went to that website and looked at the network traffic in Developer Tools, it showed two Ajax requests, both using this URL: https://desmace.com/wp-admin/admin-ajax.php?action=get_wdtable&table_id=414

    Interestingly, when I copied and pasted this address to see what was being downloaded, it showed nothing. Why is that?

      You're not posting anything along with that request, and it's unlikely to perform any action without parameters. You can see what's being sent by clicking the row in question and then selecting 'Request' in the right-hand pane, which lists all of the parameters and values. Alternatively, right click the row in question -> 'Copy Value' and look at all the options provided.

        Oh, I see! Thanks! I learned something.
Re^2: Script to scrap data
by Anonymous Monk on Dec 07, 2024 at 20:12 UTC
    Hi again! Thank you for your answers. Following them, I looked at the DevTools and I think I am close to my goal.

    More precisely, what I need are the lines of code that perform the Ajax request to admin-ajax.php?action=get_wdtable&table_id=414, including the payload data (the bunch of code starting with "draw..." that contains all the values I used as input in the table).

    I need to insert this request into my Perl script so that I get the JSON response in a human-readable format that I can store as txt, csv or whatever.

    Many thanks in advance!
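
For the "store as csv" step, here is a rough sketch using only core modules. It assumes the decoded response has a DataTables-style `data` key holding an array of rows, each row itself an array of cell values; check the actual response in the developer tools before relying on that. The sample JSON is invented for illustration and is not data from the site, and `csv_field` and `rows_to_csv` are hypothetical helper names.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Quote one field the usual CSV way: wrap it in double quotes if it
# contains a comma, quote or line break, doubling any embedded quotes.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    if ($value =~ /[",\r\n]/) {
        $value =~ s/"/""/g;
        return qq("$value");
    }
    return $value;
}

# Turn the decoded response into CSV text, one line per row.
# Assumes $response->{data} is an array of array references.
sub rows_to_csv {
    my ($response) = @_;
    my $rows = $response->{data} || [];
    return join '', map { join(',', map { csv_field($_) } @$_) . "\n" } @$rows;
}

# Invented sample shaped like a DataTables reply, not real site data.
my $sample = decode_json(
    '{"draw":1,"recordsTotal":2,"data":[["Alpha","1,5"],["Beta","say \"hi\""]]}'
);
print rows_to_csv($sample);
```

To save the result, redirect the output (`perl to_csv.pl > table.csv`) or open a file handle and print to that instead. If the CPAN module Text::CSV is available, it is a more robust choice than hand-rolled quoting.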
