Re^2: I want to save web pages as text rather than as HTML. -- oneliner

by anautismobserver (Sexton)
on Sep 10, 2019 at 21:18 UTC (#11105989)


in reply to Re: I want to save web pages as text rather than as HTML. -- oneliner
in thread I want to save web pages as text rather than as HTML.

Thanks for all that info. It's a lot to digest.

Despite the elegance of a one-liner, I prefer to take one step at a time.

When I try to run the following code:

    use strict;
    use warnings;
    use LWP::UserAgent;
    use LWP::Simple;
    use HTML::TreeBuilder;
    print HTML::TreeBuilder->new_from_url('http://perl.org')->as_text;

I get the error message << Can't locate object method "new_from_url" via package "HTML::TreeBuilder" >>

What else do I need to add to the code to make it work?


Replies are listed 'Best First'.
Re^3: I want to save web pages as text rather than as HTML. -- oneliner
by Your Mother (Bishop) on Sep 10, 2019 at 21:25 UTC

    Maybe you have a really old version and need an update. The method was added 2012-06-12 according to its change file. The example as you posted it works fine for me; relatively current Perl installation with HTML::TB version 5.03 on OS X.
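    One way to confirm whether an outdated install is the cause is to ask Perl which version of the module it loads and whether that version has the method. This is only a minimal sketch: module_status is a helper name made up for this example, not part of HTML::TreeBuilder, and it uses nothing beyond core Perl.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not part of any module): report whether a module
    # loads, which version is installed, and whether it provides a method.
    sub module_status {
        my ($module, $method) = @_;
        my $installed = eval("require $module; 1") ? 1 : 0;
        return {
            installed  => $installed,
            version    => $installed ? scalar($module->VERSION) : undef,
            has_method => ($installed && $module->can($method)) ? 1 : 0,
        };
    }

    my $status = module_status('HTML::TreeBuilder', 'new_from_url');
    if ($status->{installed}) {
        printf "HTML::TreeBuilder %s, new_from_url available: %s\n",
            $status->{version} // 'unknown',
            $status->{has_method} ? 'yes' : 'no';
    }
    else {
        print "HTML::TreeBuilder is not installed\n";
    }
    ```

    If the method is reported as missing, upgrading the module from CPAN (for example with "cpan HTML::TreeBuilder") is the usual route.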

      I was using Padre and DWIM Perl, following Gabor Szabo's instructions at https://perlmaven.com/installing-perl-and-getting-started

      I have just uninstalled DWIM Perl and installed Strawberry Perl 5.30.0.1 (64bit) for Windows into C:\Strawberry on my hard drive.

      The README.txt file tells me to run the following commands to manually set some environment variables:

      c:\myperl\relocation.pl.bat ... this is REQUIRED!

      c:\myperl\update_env.pl.bat ... this is OPTIONAL

      When I tried to run c:\Strawberry\relocation.pl.bat I got the error message "The system cannot find the path specified." However, there was a file "relocation.txt" in C:\Strawberry which appeared to run successfully. I can't find any files similar to update_env.pl.bat, though.

      I liked the convenience of Padre, but I don't want to run into more problems because it isn't kept up to date. I also have Notepad++. What do you recommend?
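      A rough way to check whether the new installation is the one actually being used, regardless of what the README scripts did, is to have Perl report on itself. The sketch below assumes only that some perl is on the PATH; perl_environment is a name invented for this example.

      ```perl
      use strict;
      use warnings;
      use Config;

      # Hypothetical helper: collect basic facts about the perl that is running.
      sub perl_environment {
          return {
              version     => sprintf('%vd', $^V),   # dotted version string
              interpreter => $^X,                   # path of the running perl
              archname    => $Config{archname},     # build architecture
              inc         => [@INC],                # module search path
          };
      }

      my $env = perl_environment();
      print "Perl version: $env->{version}\n";
      print "Interpreter:  $env->{interpreter}\n";
      print "Architecture: $env->{archname}\n";
      print "Module search path:\n";
      print "  $_\n" for @{ $env->{inc} };
      ```

      If the interpreter path and the module search path point into C:\Strawberry, the new installation is the one being picked up.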

      Thanks for all your help. I'll probably have more questions later.

        I'm really glad people work on Perl on Windows but I don't have the patience for it. It makes everything harder. I tried Padre way back when it was in a very early version and liked what I saw but as far as IDEs go… I don't care for them. Forced to choose, maybe Atom, or TextMate on Mac, but I am emacs and I would personally only recommend emacs or vim because they are a baseline you can rely on.

        Try the free Komodo Edit. It's a really excellent Perl IDE with realtime syntax checking, fast and rich interface, hundreds of options, very pleasant colors, snippets, templates, completion, tidy, reliable backups, perfect restorations, support for about a hundred other languages, and lots more.

        One of my favorite features: a few seconds after you make a typo or other mistake, the lines that will be broken by it are underlined in red, and the warning text pops up on mouseover.

        https://www.activestate.com/products/komodo-ide/downloads/edit/
        
