in reply to Re: I want to save web pages as text rather than as HTML. -- oneliner
in thread I want to save web pages as text rather than as HTML.
I now have Strawberry Perl up and running, and the previous TreeBuilder code example works (using 'http://perl.org' as input).
When I change the input to 'https://wordpress.com/read/feeds/94271045' using the following code:
    use strict;
    use warnings;
    use LWP::UserAgent;
    use LWP::Simple;
    use HTML::TreeBuilder;

    print HTML::TreeBuilder->new_from_url('https://wordpress.com/read/feeds/94271045')->as_text;
The output is << WordPress.comPlease enable JavaScript in your browser to enjoy WordPress.com. >>
Do you know how to fix this? One complicating factor is that pages like https://wordpress.com/read/feeds/94271045 won't display properly in my browser unless I'm logged into a WordPress account.
Thanks.