PerlMonks  

From a given text Extract the root HTML element inner text

by DEIVEEGARAJA (Novice)
on Dec 06, 2012 at 10:39 UTC (#1007527)
DEIVEEGARAJA has asked for the wisdom of the Perl Monks concerning the following question:

Hi folks, I just want to know the possible ways to do this task.

From a given text Extract the root HTML element inner text

For example:

<div id="header" title=""> <!-- Header Placement #1 --> <div id="headerPre1"></div> <div id="headerWrap1"> <div id="ctl00_header_logo_pnlWrapper" class="logo"> <a class="image1" href="/default.aspx">AAAAAA</a> </div> </div>

Here my search text is AAAAAA, and I need output like this:

<div id="ctl00_header_logo_pnlWrapper" class="logo"> <a class="image1" href="/default.aspx">AAAAAA</a> </div>

Re: From a given text Extract the root HTML element inner text
by choroba (Abbot) on Dec 06, 2012 at 11:00 UTC
    Using XML::XSH2:
    open :F html 1.html ; cd //text()[.="AAAAAA"] ; ls ancestor::div[1] ;
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
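For readers who would rather stay in plain Perl, the same idea (find the matching text node, then take its nearest `div` ancestor) can be sketched with XML::LibXML, the library XML::XSH2 is built on. This is only a minimal sketch: the helper name `nearest_divs_for_text` and the variable `$sample` are made up for illustration, and the text comparison is done in Perl rather than inside the XPath expression so that quotes in the search text cannot break the query.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not from the thread): for every text node that is
# exactly equal to $text, return its nearest enclosing <div>, serialized.
sub nearest_divs_for_text {
    my ($html, $text) = @_;

    # recover => 2 asks libxml2 to repair broken markup without warnings;
    # the sample in the question has an unclosed outer <div>.
    my $doc = XML::LibXML->load_html(string => $html, recover => 2);

    my @hits;
    for my $node ($doc->findnodes('//text()')) {
        next unless $node->nodeValue eq $text;

        # ancestor:: is a reverse axis, so [1] selects the closest <div>.
        my ($div) = $node->findnodes('ancestor::div[1]');
        push @hits, $div->toString if $div;
    }
    return @hits;
}

# The markup from the question, unclosed header <div> included.
my $sample = <<'HTML';
<div id="header" title=""> <!-- Header Placement #1 --> <div id="headerPre1"></div> <div id="headerWrap1"> <div id="ctl00_header_logo_pnlWrapper" class="logo"> <a class="image1" href="/default.aspx">AAAAAA</a> </div> </div>
HTML

print "$_\n" for nearest_divs_for_text($sample, 'AAAAAA');
```

Because the document is re-serialized by libxml2, whitespace in the printed element may differ slightly from the input.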
Re: From a given text Extract the root HTML element inner text
by kcott (Abbot) on Dec 06, 2012 at 11:29 UTC

    G'day DEIVEEGARAJA,

    This is PerlMonks! It is a forum for discussing and asking questions about Perl. It is not a free code writing service.

    You have posted nothing that relates to Perl. The only thing I see that relates to any programming language is default.aspx (in two places). Perhaps you'd have better luck with an Active Server Page Extended forum.

    If you genuinely have a Perl question, but perhaps just worded your posting poorly, you'll find guidelines for crafting a better question here: How do I post a question effectively?

    -- Ken

Re: From a given text Extract the root HTML element inner text
by Anonymous Monk on Dec 06, 2012 at 14:56 UTC
Re: From a given text Extract the root HTML element inner text
by marquezc329 (Scribe) on Dec 06, 2012 at 18:02 UTC
    Hello DEIVEEGARAJA,

    Several modules on CPAN parse HTML; you may find HTML::TokeParser useful. Here is an example that prints the links found in a page's source:

        use strict;
        use warnings;
        use feature 'say';    # say() is not enabled by default
        use HTML::TokeParser;

        # "webpage" is the name of a local file holding the HTML
        my $p = HTML::TokeParser->new("webpage")
            or die "Can't open webpage: $!\n";

        # Visit each <a> start tag and report its href and link text
        while (my $token = $p->get_tag('a')) {
            my $link = $token->[1]{href} // '';
            my $text = $p->get_trimmed_text('/a');
            say "Link: $link";
            say "Text: $text\n";
        }

    What have you already tried? Questions are better answered when supplemented with code.
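HTML::TokeParser can also be pointed at the original question. The sketch below is one possible approach, not something posted in the thread: it keeps a source buffer for every currently open `div`, and when a `div` closes it returns that buffer if the search text was seen directly inside it. The helper name `enclosing_div_source` and the variable `$page` are hypothetical.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper (not from the thread): return the source of the
# innermost <div> whose text contains $needle, or undef if none is found.
sub enclosing_div_source {
    my ($html, $needle) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Can't parse HTML string\n";

    my @open;    # one record per currently open <div>
    while (my $token = $p->get_token) {
        my $type = $token->[0];

        # The original source text sits at a different index per token type.
        my $raw = $type eq 'S'  ? $token->[4]
                : $type eq 'E'  ? $token->[2]
                : $type eq 'PI' ? $token->[2]
                :                 $token->[1];    # T, C and D tokens

        push @open, { src => '', hit => 0 }
            if $type eq 'S' && $token->[1] eq 'div';

        # Every open <div> contains this token, so append it to each buffer.
        $_->{src} .= $raw for @open;

        if ($type eq 'T' && @open && index($token->[1], $needle) >= 0) {
            $open[-1]{hit} = 1;    # mark the innermost open <div>
        }
        elsif ($type eq 'E' && $token->[1] eq 'div' && @open) {
            my $closed = pop @open;
            return $closed->{src} if $closed->{hit};
        }
    }
    return;    # no closed <div> contained the text
}

# The markup from the question, unclosed header <div> included.
my $page = '<div id="header" title=""> <!-- Header Placement #1 --> <div id="headerPre1"></div> <div id="headerWrap1"> <div id="ctl00_header_logo_pnlWrapper" class="logo"> <a class="image1" href="/default.aspx">AAAAAA</a> </div> </div>';

my $found = enclosing_div_source($page, 'AAAAAA');
print defined $found ? "$found\n" : "not found\n";
```

Unlike a tree-based parser, this returns the markup exactly as it appeared in the input, but it only reports a `div` that is explicitly closed; text inside an unclosed `div` yields undef.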
