Re: Streaming Market Quotes from Ally Invest

by talexb (Canon)
on Jan 31, 2019 at 21:27 UTC

in reply to Streaming Market Quotes from Ally Invest

I'm writing a script that logs on to a website, does some navigation, and clicks a button that generates and then downloads a CSV. I'm wondering whether the add_handler method will do what I need to capture this CSV. I had a look at the WWW::Mechanize documentation, but didn't see any information on this method.

Right now I'm calling the content method after the click, and that just gives me the web page and not the CSV that arrives a few seconds later. Is there more information about this method?

Alex / talexb / Toronto

Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

Replies are listed 'Best First'.
Re^2: Streaming Market Quotes from Ally Invest
by Your Mother (Bishop) on Jan 31, 2019 at 21:50 UTC

    I suspect there is some JS involved if doing a $mech->click on the download link is not getting the CSV file. That kind of thing should "just work", and if it doesn't, putting in handlers won't make any difference. You might check Corion's WWW::Mechanize::Chrome or WWW::Mechanize::Firefox to see whether they behave more like the browser.
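
    For reference, add_handler is inherited from LWP::UserAgent, which WWW::Mechanize subclasses, and that is where it is documented. A minimal sketch, assuming a hypothetical helper named attach_response_logger, of using it to record what the agent actually receives. A handler only sees responses to requests the agent itself sends, so a download started by page JavaScript would never appear in the log.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# attach_response_logger is a hypothetical helper name, not part of LWP.
# It records the URL and Content-Type of every response the agent receives.
# WWW::Mechanize inherits add_handler from LWP::UserAgent, so the same call
# should work on a $mech object.
sub attach_response_logger {
    my ($ua, $log) = @_;
    $ua->add_handler(
        response_done => sub {
            my ($response, $agent, $handler) = @_;
            push @$log, {
                url  => $response->request->uri->as_string,
                # content_type returns a list in list context; keep the type only
                type => scalar $response->content_type,
            };
            return;    # returning nothing leaves the response untouched
        },
    );
    return $ua;
}
```

    If the log shows only text/html entries after the click, the CSV request is not one the agent is making, which points at JavaScript rather than at a missing handler.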

      I just walked to the grocery store and back while thinking about it -- and I suspect you are correct. I'm able to log in and navigate using Mech, even though I see some JavaScript on the page, but I think this download is asynchronous, so it relies on JS, which means Mech can't catch it. Ugh.

      I'd love to be able to try something like headless Chrome, but I don't think I'll be able to persuade my client to take that route. The low-tech alternative would be to have someone manually download the file every couple of days, and then upload it to the local server (which I manage). Thanks for your reply, and your original post -- it's always good to read about success stories.

      Alex / talexb / Toronto

      Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

        It is also possible to capture the entire exchange (the browser's developer tools can do it) and see exactly what the JS is doing with headers, cookies and such. It should then be pretty easy to emulate the requests with plain HTTP in Perl code. It might be fragile, but it's probably not too hard. Write a test and put it on a nightly cron to alert you if or when it breaks.
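
        A rough sketch of that approach with LWP::UserAgent. The URLs, the form field names and the X-Requested-With header are all placeholders, not taken from any real site; the real values would come from the captured exchange.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Hypothetical endpoints; replace them with the values seen in the
# browser's network panel.
my %SITE = (
    login_url => 'https://example.com/login',
    csv_url   => 'https://example.com/reports/export.csv',
);

# An agent with an in-memory cookie jar, so the session cookie set at
# login is sent again with the download request.
sub new_agent {
    return LWP::UserAgent->new(
        cookie_jar => HTTP::Cookies->new,
        timeout    => 30,
    );
}

sub fetch_csv {
    my ($ua, $user, $pass) = @_;

    # Field names 'username' and 'password' are assumptions.
    my $login = $ua->post($SITE{login_url},
        { username => $user, password => $pass });
    die 'login failed: ' . $login->status_line . "\n"
        unless $login->is_success || $login->is_redirect;

    # Send whatever extra headers the JS was seen to send (assumed here).
    my $res = $ua->get($SITE{csv_url}, 'X-Requested-With' => 'XMLHttpRequest');
    die 'download failed: ' . $res->status_line . "\n"
        unless $res->is_success;

    # Tripwire for the nightly check: fail loudly if the site changes
    # and starts returning something other than CSV.
    my $type = $res->content_type;
    die "expected CSV, got $type\n" unless $type =~ m{^text/csv$}i;

    return $res->decoded_content;
}

# Usage (not run here): my $csv = fetch_csv(new_agent(), $user, $pass);
```

        Because fetch_csv dies on any unexpected response, a script that calls it exits non-zero when something breaks, which is enough for a nightly cron job to raise an alert.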
