
You can also use HTML::TokeParser::Simple. I'll leave the regex for matching the text up to you. :-)

P.S. HTML::Strip may also be useful to you; see Stripping HTML tags efficiently.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(GET);
use HTML::TokeParser::Simple;

my $ua = LWP::UserAgent->new;

# Define user agent type (the trailing space makes LWP append its own
# "libwww-perl/x.y" identifier to the string)
$ua->agent('MyApp/0.1 ');

# Request object
my $req = GET 'http://finance.yahoo.com/actives?e=us';

# Make the request
my $res = $ua->request($req);

# Stop early if the fetch failed, rather than parsing an error page
die 'Request failed: ', $res->status_line, "\n" unless $res->is_success;

my $con = $res->content;

#print "$con\n";

my $p = HTML::TokeParser::Simple->new( \$con );

while ( my $token = $p->get_token ) {
    # This prints all text in an HTML doc (i.e., it strips the HTML)
    next unless $token->is_text;
    print $token->as_is, "\n";
}

exit 0;
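
As a starting point for the regex step, here is a minimal sketch that keeps only the text tokens matching a pattern. The helper name matching_text, the sample markup, and the decimal-number pattern are all made up for illustration; the real page will need its own pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper: return the text chunks of $html that match $pattern.
sub matching_text {
    my ( $html, $pattern ) = @_;
    my $p = HTML::TokeParser::Simple->new( \$html );
    my @hits;
    while ( my $token = $p->get_token ) {
        next unless $token->is_text;    # skip tags, comments, etc.
        my $text = $token->as_is;
        push @hits, $text if $text =~ $pattern;
    }
    return @hits;
}

# Made-up sample markup standing in for the fetched page
my $sample = '<table><tr><td>ACME</td><td>12.34</td></tr>'
           . '<tr><td>FOO</td><td>n/a</td></tr></table>';

# Assumed pattern: text that looks like a plain decimal number
print "$_\n" for matching_text( $sample, qr/^\d+\.\d+$/ );
```

To try it on the live page, pass $con from the script above in place of $sample.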

I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh

In reply to Re: regexp solutions by zentara
in thread regexp solutions by programmer.perl
