
Extracting text between HTML comments tags

by Anonymous Monk
on May 14, 2004 at 15:44 UTC (#353401=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have the following HTML:
<!--BEGIN -->blah blah blooh blooh<!--END -->
How can I extract the text between the comments and put it into a variable $var?

Replies are listed 'Best First'.
Re: Extracting text between HTML comments tags
by gryphon (Abbot) on May 14, 2004 at 16:02 UTC

    Greetings Anonymous Monk,

    Take a look at Template::Extract. It lets you scrape content from an HTML source without having to deal with regular expressions or parsing through tokens. (Not that I'm against HTML::TokeParser or anything. It's just that sometimes it's easier to use something else.) You copy and paste sections of HTML from which you want to extract data, put them into a template file, call out the data you want, and let the module do the work.

    use Template::Extract;

    my $extract  = Template::Extract->new();
    my $template = '<!--BEGIN -->[% mydata %]<!--END -->';
    my $content  = '<!--BEGIN -->blah blah blooh blooh<!--END -->';
    my $data     = $extract->extract($template, $content);

    use Data::Dumper;
    print Dumper $data;

    This is definitely not going to be the fastest and probably not even the "best" way to parse your content. However, it's really, really easy. And I always say that the computer's time is cheaper than mine; make it do the work.

    code('Perl') || die;

Re: Extracting text between HTML comments tags
by tinita (Parson) on May 14, 2004 at 15:53 UTC
    my @matches = $html =~ m/<!--BEGIN -->(.*?)<!--END -->/gs;
    this allows you to have several such tags in one html string.
    if you have just one you can speed this up by dropping the /g and the ? from .*?
    for details please read perlre, perlretut and similar perldocs.
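
    To tie this back to the original question, here is a minimal sketch of the single-match case that puts the text into $var. It assumes the HTML is already in a string called $html (that name and the "no match" message are illustrative, not from the question). The parentheses around $var put the match in list context, so the captured group is assigned rather than a true/false result.

```perl
use strict;
use warnings;

# Sample input taken from the question; $html is an assumed variable name.
my $html = '<!--BEGIN -->blah blah blooh blooh<!--END -->';

# List-context assignment: ($var) receives capture group 1.
# /s lets . match newlines in case the text spans several lines.
my ($var) = $html =~ m/<!--BEGIN -->(.*?)<!--END -->/s;

if (defined $var) {
    print "$var\n";    # prints: blah blah blooh blooh
} else {
    print "no match\n";
}
```

    If the markers are missing, $var stays undefined, so it is worth checking it before use.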
Re: Extracting text between HTML comments tags
by theorbtwo (Prior) on May 14, 2004 at 16:40 UTC
