http://www.perlmonks.org?node_id=450244


in reply to save a page as text

No need to involve a browser at all. Here's one way, using the excellent HTML::TokeParser::Simple by the monastery's own Ovid.

#!/usr/bin/perl
use warnings;
use strict;
use LWP::Simple;
use HTML::TokeParser::Simple;

my $url  = 'http://www.page.you.want.com/some/path';
my $page = get($url);

# LWP::Simple's get() returns undef on failure, so check before parsing
defined $page or die "Could not fetch $url\n";

my $p = HTML::TokeParser::Simple->new( \$page );

while ( my $token = $p->get_token ) {
    # This prints all text in an HTML doc (i.e., it strips the HTML)
    next unless $token->is_text;
    print $token->as_is;
}

Stuffing it into a file is left as an exercise for the poster.
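
For anyone who would rather not do the exercise, here is a minimal sketch of one way it might go. The helper names html_to_text and save_text are made up for illustration, as is the convention of passing the URL and output file name on the command line. It also assumes UTF-8 is an acceptable encoding for the output file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(get);
use HTML::TokeParser::Simple;

# Hypothetical helper: return only the text tokens of an HTML string.
sub html_to_text {
    my ($html) = @_;
    my $p    = HTML::TokeParser::Simple->new( \$html );
    my $text = '';
    while ( my $token = $p->get_token ) {
        next unless $token->is_text;
        $text .= $token->as_is;
    }
    return $text;
}

# Hypothetical helper: write a string to a file, dying on I/O errors.
# Assumption: UTF-8 output is what you want.
sub save_text {
    my ( $text, $file ) = @_;
    open( my $fh, '>:encoding(UTF-8)', $file )
        or die "Cannot open $file: $!";
    print {$fh} $text;
    close($fh) or die "Cannot close $file: $!";
    return;
}

# Only fetch when both a URL and an output file name are given.
if ( @ARGV == 2 ) {
    my ( $url, $file ) = @ARGV;
    my $page = get($url);
    defined $page or die "Could not fetch $url\n";
    save_text( html_to_text($page), $file );
}
```

Saved as, say, page2text.pl, it would be run as: perl page2text.pl http://www.page.you.want.com/some/path page.txt. Note that as_is returns the text exactly as it appears in the source, so entities such as &amp; are not decoded.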

-Any sufficiently advanced technology is
indistinguishable from doubletalk.
