PerlMonks  

using TreeBuilder in perl

by anirban0328 (Initiate)
on Jul 29, 2013 at 09:23 UTC (#1046805=perlquestion)
anirban0328 has asked for the wisdom of the Perl Monks concerning the following question:

Need help with TreeBuilder::XPath in perl

use strict;
use warnings;
use LWP::Simple;
use HTML::TreeBuilder::XPath;

my $url  = 'file:///C:/Users/Rockstar/workspace/abc/globals_func.html';
# LWP::Simple's get() returns undef on failure and does not set $!
my $page = get($url) or die "Could not fetch $url\n";
my $p    = HTML::TreeBuilder::XPath->new_from_content($page);

# Select the div that holds the list
my @trips = $p->findnodes('//div[@class="contents"]');
foreach my $trip (@trips) {
    print $trip->as_text . "\n";
}

After running it on an HTML file, I get the output all on one line: ChainCtrlBuildChain() : ChainController.cChainCtrlDumpChain() : ChainController.cChainCtrlExit() : ChainController.cChainCtrlGetBitStreamChan() : ChainController.cChainCtrlInit() : ChainController.c.

But I want them shown as below (one row per value).

ChainCtrlBuildChain() : ChainController.c

ChainCtrlDumpChain() : ChainController.c

ChainCtrlExit() : ChainController.c

ChainCtrlGetBitStreamChan() : ChainController.c

ChainCtrlInit() : ChainController.c.

Kindly help me see what I am missing. My HTML file (showing only the HTML code of "contents"):

<div class="contents">
&#160;<ul>
<li>ChainCtrlBuildChain() : <a class="el" href="_chain_controller_8c.html#acb2c56087a2072b6445a54c17662d118">ChainController.c</a>
</li>
<li>ChainCtrlDumpChain() : <a class="el" href="_chain_controller_8c.html#a13ed5a02bf232b115b9a58cdd13dadd7">ChainController.c</a>
</li>
<li>ChainCtrlExit() : <a class="el" href="_chain_controller_8c.html#a9e30e46ebc5411537efe95a286e27cb4">ChainController.c</a>
</li>
<li>ChainCtrlGetBitStreamChan() : <a class="el" href="_chain_controller_8c.html#a00faa6e64ea466d4ec57339017e57e71">ChainController.c</a>
</li>
<li>ChainCtrlInit() : <a class="el" href="_chain_controller_8c.html#aed300a388eff2fa9c7565025982faab1">ChainController.c</a>
</li>
</ul>
</div><!-- contents -->

Re: using TreeBuilder in perl (\newlines)
by Anonymous Monk on Jul 29, 2013 at 09:27 UTC

      Oh sorry, I forgot to say that I tried newlines ("\n") earlier, but it didn't help. Basically it copies everything onto a single line. While debugging, I saw that the entire content ends up in one element of the array @trips.
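
The symptom described above is consistent with the XPath expression matching a single div, so @trips holds one node and as_text returns all of its text joined together. A minimal sketch of one possible fix, assuming the list items are the rows wanted, is to select the li elements instead and print each one. The shortened href values and the names $html and @lines are placeholders for illustration, not part of the original post.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Cut-down copy of the "contents" div; the href values are placeholders.
my $html = <<'HTML';
<div class="contents">
<ul>
<li>ChainCtrlBuildChain() : <a class="el" href="#a">ChainController.c</a></li>
<li>ChainCtrlDumpChain() : <a class="el" href="#b">ChainController.c</a></li>
</ul>
</div>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# One node per <li>, so each row becomes its own array element.
my @lines;
for my $li ( $tree->findnodes('//div[@class="contents"]//li') ) {
    my $text = $li->as_text;
    $text =~ s/^\s+|\s+$//g;    # trim stray leading/trailing whitespace
    push @lines, $text;
}

print "$_\n" for @lines;

$tree->delete;                  # free the parse tree
```

Appending "\n" to the as_text of the div cannot help, because the newline is added once after the whole div rather than after each item.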

Re: using TreeBuilder in perl (checklist Dumper)
by Anonymous Monk on Jul 29, 2013 at 09:32 UTC
Re: using TreeBuilder in perl
by choroba (Abbot) on Jul 29, 2013 at 10:10 UTC
    Crossposted on StackOverflow. Note that it is considered polite to inform about crossposting, so people not attending both sites do not waste their effort solving a problem already answered at the other end of the Internets.
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Oh, sorry! I didn't know crossposting was not allowed. Thanks for the tips and for your time. Problem solved!

        Crossposting is allowed. Crossposting without announcement is frowned upon.
        لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: using TreeBuilder in perl
by Khen1950fx (Canon) on Jul 29, 2013 at 11:55 UTC
    HTML::FormatText does exactly what you want.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8::all;
    use HTML::FormatText;
    use HTML::TreeBuilder::XPath;

    my $page = q{
    <div class="contents">
    &#160;<ul>
    <li>ChainCtrlBuildChain() : <a class="el" href="_chain_controller_8c.html#acb2c56087a2072b6445a54c17662d118">ChainController.c</a>
    </li>
    <li>ChainCtrlDumpChain() : <a class="el" href="_chain_controller_8c.html#a13ed5a02bf232b115b9a58cdd13dadd7">ChainController.c</a>
    </li>
    <li>ChainCtrlExit() : <a class="el" href="_chain_controller_8c.html#a9e30e46ebc5411537efe95a286e27cb4">ChainController.c</a>
    </li>
    <li>ChainCtrlGetBitStreamChan() : <a class="el" href="_chain_controller_8c.html#a00faa6e64ea466d4ec57339017e57e71">ChainController.c</a>
    </li>
    <li>ChainCtrlInit() : <a class="el" href="_chain_controller_8c.html#aed300a388eff2fa9c7565025982faab1">ChainController.c</a>
    </li>
    </ul>
    </div><!-- contents -->
    };

    my $tree  = HTML::TreeBuilder::XPath->new_from_content($page);
    my @trees = $tree->findnodes('//div[@class="contents"]');

    # Render each matched div as plain text, one list item per line
    foreach my $div (@trees) {
        my $formatter = HTML::FormatText->new(
            leftmargin  => 0,
            rightmargin => 50,
        );
        print $formatter->format($div);
    }
