http://www.perlmonks.org?node_id=11157972


in reply to Re^2: Module to extract text from HTML
in thread Module to extract text from HTML

You've inspired a reverse golf challenge: ignoring all simple, portable solutions, what's the most convoluted way to achieve the goal? :)

Re^4: Module to extract text from HTML (Reverse Golf)
by eyepopslikeamosquito (Archbishop) on Feb 29, 2024 at 22:40 UTC
Re^4: Module to extract text from HTML
by Danny (Pilgrim) on Feb 29, 2024 at 15:45 UTC
    my $text = `lynx -nolist -dump 'https://www.perlmonks.org/?node_id=11157915'` :)

      That's far too efficient. The purpose of such a challenge is to make the solution deliberately convoluted; think Rube Goldberg machine. In real terms, not everyone has lynx, and not everyone can install it on their web host.

      Update: added link.

Re^4: Module to extract text from HTML
by bliako (Monsignor) on Feb 29, 2024 at 17:32 UTC

    Fair enough. But the problem of converting HTML to text can be solved with varying success, especially when heuristics are applied, so the more options the better. That's why I keep adding to the list, though the mech-to-pdf one was more a joke than a solution.

      Indeed, and my comment wasn't intended as a criticism, but rather as an idea for an inverse golf / Rube Goldberg approach to problems. Just as code golfing is an exercise, so is writing a needlessly convoluted solution that still produces a suitable result.

        Agreed, and I did not take it as criticism :)