Extracting text between HTML comments tags

by Anonymous Monk
on May 14, 2004 at 15:44 UTC (#353401=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have the following HTML
<!--BEGIN -->blah blah blooh blooh<!--END -->
How can I extract the text between the comments and put it into a variable $var?

Re: Extracting text between HTML comments tags
by tinita (Parson) on May 14, 2004 at 15:53 UTC
    my @matches = $html =~ m/<!--BEGIN -->(.*?)<!--END -->/gs;
    This allows you to have several such comment pairs in one HTML string.
    If you have just one pair, you can speed this up by dropping the /g and the ? from .*? (with more than one pair, a greedy .* would run through to the last <!--END -->).
    For details, please read perlre, perlretut, and similar perldocs.
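As a minimal sketch of the approach above, applied to the original question of getting the text into $var: the list-context assignment below is one common idiom, not necessarily what tinita had in mind. The sample strings and the names $multi and $greedy are made up for illustration.

```perl
use strict;
use warnings;

# Sample input taken from the question.
my $html = '<!--BEGIN -->blah blah blooh blooh<!--END -->';

# Assigning the match to a list puts the first capture group into $var.
# /s lets . match newlines, so the text may span several lines.
my ($var) = $html =~ m/<!--BEGIN -->(.*?)<!--END -->/s;

print defined $var ? "$var\n" : "no match\n";

# Hypothetical input with two marker pairs.
my $multi = '<!--BEGIN -->one<!--END --> filler <!--BEGIN -->two<!--END -->';

# With /g in list context, every capture is returned.
my @matches = $multi =~ m/<!--BEGIN -->(.*?)<!--END -->/gs;

# A greedy .* runs to the last <!--END -->, swallowing the markers in between.
my ($greedy) = $multi =~ m/<!--BEGIN -->(.*)<!--END -->/s;

print "$_\n" for @matches;
print "$greedy\n";
```

If the markers are missing, $var is left undefined, so it is worth checking with defined before using it.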
Re: Extracting text between HTML comments tags
by gryphon (Abbot) on May 14, 2004 at 16:02 UTC

    Greetings Anonymous Monk,

    Take a look at Template::Extract. It lets you scrape content from an HTML source without having to deal with regular expressions or parsing through tokens. (Not that I'm against HTML::TokeParser or anything. It's just that sometimes it's easier to use something else.) You copy and paste the sections of HTML from which you want to extract data, put them into a template file, mark the data you want, and let the module do the work.

    use Template::Extract;
    my $extract = Template::Extract->new();
    my $template = '<!--BEGIN -->[% mydata %]<!--END -->';
    my $content = '<!--BEGIN -->blah blah blooh blooh<!--END -->';
    my $data = $extract->extract($template, $content);
    use Data::Dumper;
    print Dumper $data;

    This is definitely not going to be the fastest and probably not even the "best" way to parse your content. However, it's really, really easy. And I always say that the computer's time is cheaper than mine; make it do the work.

    code('Perl') || die;

Re: Extracting text between HTML comments tags
by theorbtwo (Prior) on May 14, 2004 at 16:40 UTC
