http://www.perlmonks.org?node_id=11148398


in reply to WWW::Mechanize::Chrome VERY slow on xpath obtaining TDs of a TR

I had a quick look at the docs for ->xpath

and found these passages, in which I emphasized two parts

So, two insights into potential bottlenecks:

Of course this is all speculation as long as you can't provide an SSCCE ... :)

Cheers Rolf
(addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
Wikisyntax for the Monastery

Re^2: WWW::Mechanize::Chrome VERY slow on xpath obtaining TDs of a TR
by ait (Hermit) on Nov 27, 2022 at 10:27 UTC

    After adding HTML::Tree and parsing some stuff in pure Perl land I think that IS actually the right approach:

    1. Use W::M::Chrome for JS rendering, JS interactions and high-level XPath queries
    2. Slurp HTML chunks and process them on the Perl side as much as possible
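
    A minimal sketch of step 2, assuming HTML::TreeBuilder (from the HTML::Tree distribution) is installed. The $html string is a made-up stand-in for a chunk that has already been pulled out of the browser in a single round trip; how it is fetched from W::M::Chrome is deliberately left out, and the variable names are only illustrative.

    ```perl
    use strict;
    use warnings;
    use HTML::TreeBuilder;

    # Hypothetical chunk, standing in for table HTML slurped from the browser.
    my $html = q{<table>
      <tr><td>r1c1</td><td>r1c2</td></tr>
      <tr><td>r2c1</td><td>r2c2</td></tr>
    </table>};

    # Parse once on the Perl side; no further browser round trips are needed.
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Collect the text of every TD, grouped by its TR.
    my @rows;
    for my $tr ( $tree->look_down( _tag => 'tr' ) ) {
        push @rows, [ map { $_->as_text } $tr->look_down( _tag => 'td' ) ];
    }

    # HTML::Element trees should be freed explicitly when done.
    $tree->delete;

    print join( "\t", @$_ ), "\n" for @rows;
    ```

    All the per-cell work happens on plain Perl data here, so the cost no longer depends on how many proxy objects the browser bridge would have to create.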

      That's one approach.

      But as I said, I think putting the logic into a more elaborate XPath expression, so that the heavy lifting happens inside the browser, would fix your performance issue without needing HTML::Tree.

      IMHO your code forces the Perl part of W:M:C to do a lot of its own filtering and to create thousands of proxy objects. These Perl objects will also tunnel requests back and forth to the browser for most method calls.

      Hence many potential bottlenecks.

      update

      As an illustration, this XPath expression, run in Chrome's dev console on https://meta.wikimedia.org/wiki/Wikipedia_article_depth, returns 1016 strings at once:

      //table[3]//tr//td//text()

      Disclaimer: I don't have W:M:C installed and my XPath-fu is rusty, so I'm pretty sure there are even better ways to do it.

      Cheers Rolf

        True.