PerlMonks  

Read text from HTML file and display it

by ced4dad (Initiate)
on Oct 28, 2012 at 17:15 UTC (#1001279=perlquestion)
ced4dad has asked for the wisdom of the Perl Monks concerning the following question:

Gentlemen... I am attempting to read formatted HTML from a .txt file, as an entire file, or as a line by line read. I then wish to display the content as simple text, exactly as it was in the file. In attempting to bring the data in as a string or entire page, PRINTING it to the screen yields the formatted HTML output and NOT only the ASCII content of the file. Is there an alternative to PRINT or PRINTF that would yield only the ASCII text being shown on the screen? The content of the file is less than 2000 characters.

Re: Read text from HTML file and display it
by Anonymous Monk on Oct 28, 2012 at 19:45 UTC
Re: Read text from HTML file and display it
by Kenosis (Priest) on Oct 28, 2012 at 23:21 UTC

    Depending upon the complexity of the HTML, you may find that HTML::Scrubber will do what you need:

    use strict;
    use warnings;
    use File::Slurp qw/read_file/;
    use HTML::Scrubber;

    my $html     = read_file 'text.txt';
    my $scrubber = HTML::Scrubber->new();
    print $scrubber->scrub($html);

    File::Slurp was only used to read the file's contents into a scalar.

    Hope this helps!
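
    The file-reading step above does not strictly need File::Slurp. A minimal core-only sketch follows; slurp_file is a hypothetical helper name chosen for illustration, and 'text.txt' is simply the file name used in the reply:

```perl
use strict;
use warnings;

# Read a whole file into one scalar using only core Perl.
# slurp_file is a hypothetical helper name, not part of any module.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    local $/;                 # no record separator: one read returns the whole file
    my $content = <$fh>;
    close $fh;
    return defined $content ? $content : '';
}

# Usage, assuming the same file name as in the reply above:
# my $html = slurp_file('text.txt');
```

    The returned scalar can be handed to $scrubber->scrub() exactly as before.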

Re: Read text from HTML file and display it
by Khen1950fx (Canon) on Oct 29, 2012 at 02:47 UTC
    Another way to do it:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use autodie;
    use HTML::FormatText;

    $| = 1;    # unbuffered output

    # HTML file to convert, taken from the command line
    my $html = shift @ARGV or die "Usage: $0 file.html\n";

    # format_file parses the file itself and returns the plain-text rendering,
    # so no separate HTML::TreeBuilder step is needed
    my $text = HTML::FormatText->format_file(
        $html,
        leftmargin  => 0,
        rightmargin => 50,
    );
    print $text;
Re: Read text from HTML file and display it
by thomas895 (Hermit) on Oct 29, 2012 at 04:33 UTC

    So, you mean like this?

    use CGI;

    print CGI::header( "text/html" ),
          CGI::escapeHTML( $data_from_somewhere );
    ~Thomas~ 
    "Excuse me for butting in, but I'm interrupt-driven..."
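
    To make the effect of escaping concrete, here is a rough sketch of the idea. It is a hand-rolled stand-in for illustration, not the actual implementation of CGI::escapeHTML, and escape_html is a hypothetical name:

```perl
use strict;
use warnings;

# Hand-rolled stand-in for CGI::escapeHTML, for illustration only.
# escape_html is a hypothetical name.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so the entities added below are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# A browser given this string shows the tags literally instead of rendering them.
print escape_html('<b>Hello</b> & "goodbye"'), "\n";
# prints: &lt;b&gt;Hello&lt;/b&gt; &amp; &quot;goodbye&quot;
```

    If the script is not a CGI at all and prints to a terminal, no escaping is needed: print already shows the file's bytes unchanged.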
Re: Read text from HTML file and display it
by 2teez (Priest) on Oct 29, 2012 at 05:51 UTC

    Or, if you have lynx on your system,
    you can do it like so:

    use warnings;
    use strict;

    my $filename = $ARGV[0];
    my $ascii    = system("lynx -dump $filename");
    print $ascii;

    If you tell me, I'll forget.
    If you show me, I'll remember.
    If you involve me, I'll understand.
    --- Author unknown to me
      use warnings;
      use strict;

      my $filename = $ARGV[0];
      my $ascii    = system("lynx -dump $filename");
      print $ascii;

      Your $ascii variable will contain the return value, a number, from system which is not what you seem to be expecting. Backticks or qx() will return what normally goes to STDOUT.
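
      A possible fix along those lines is sketched below. The direct form is simply my $ascii = `lynx -dump $filename`; the sketch swaps in a list-form pipe open instead of backticks, which also captures STDOUT but does not pass the file name through a shell. capture_stdout is a hypothetical helper name, and lynx being installed is an assumption:

```perl
use strict;
use warnings;

# Run a command and return what it wrote to STDOUT.
# capture_stdout is a hypothetical helper name.
sub capture_stdout {
    my @cmd = @_;
    # List-form pipe open: the arguments are not interpreted by a shell.
    open my $pipe, '-|', @cmd or die "Cannot run '$cmd[0]': $!";
    local $/;                     # slurp all of the command's output at once
    my $output = <$pipe>;
    close $pipe;
    return defined $output ? $output : '';
}

# Usage, assuming lynx is installed and a file name is given on the command line:
# print capture_stdout('lynx', '-dump', $ARGV[0]);
```

      system() remains the right call when only the exit status matters.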
