http://www.perlmonks.org?node_id=11106030


in reply to Re: I want to save web pages as text rather than as HTML. -- oneliner
in thread I want to save web pages as text rather than as HTML.

I now have Strawberry Perl up and running, and the previous TreeBuilder code example works (using 'http://perl.org' as input).

When I change the input to 'https://wordpress.com/read/feeds/94271045' using the following code:

use strict;
use warnings;
use LWP::UserAgent;
use LWP::Simple;
use HTML::TreeBuilder;
print HTML::TreeBuilder->new_from_url('https://wordpress.com/read/feeds/94271045')->as_text;

The output is << WordPress.comPlease enable JavaScript in your browser to enjoy WordPress.com. >>
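For reference, the same kind of output can apparently be reproduced offline. This is only a sketch: the HTML below is a made-up stand-in (a title, a noscript fallback, and an empty container that a script would normally fill in), not the markup WordPress.com actually returns. It suggests that as_text is simply joining the title with the noscript message, because nothing else in the static HTML carries text.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical stand-in for what a JavaScript-driven page might send to a
# non-browser client. This is NOT the real WordPress.com markup.
my $html = <<'HTML';
<html>
<head><title>WordPress.com</title></head>
<body>
<noscript>Please enable JavaScript in your browser to enjoy WordPress.com.</noscript>
<div id="app"></div>
<script src="app.js"></script>
</body>
</html>
HTML

# Parse the string directly instead of fetching a URL.
my $tree = HTML::TreeBuilder->new_from_content($html);

# as_text collects the text nodes of the tree (script and style content
# is skipped), so only the title and the noscript fallback contribute.
my $text = $tree->as_text;
$tree->delete;    # release the parse tree

print "$text\n";
```

If that reading is right, the real content is filled in by JavaScript after the page loads, so it is never present in the HTML that TreeBuilder receives.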

Do you know how to fix this? One complicating factor is that pages like https://wordpress.com/read/feeds/94271045 won't display properly in my browser unless I'm logged into a WordPress account.

Thanks.
