Re: XPath to XML (xpath2html)

Yup, see xsh. Search for it with site:perlmonks.org "by choroba" xsh, or with ?node_id=3989;BIT=xml%3A%3Axsh, for example:

Re: XML::Simple usage question (steroids), Re^2: Extracting data-structure from HTML using Web::Scraper, Re: How to read onclick properties on row of a table using HTML::Table::Extractor (DOM approach using xsh)

Found 50 nodes roughly between 2013-05-17 and 2011-09-27 (searched 10.24% of DB).

where any text contains "xml::xsh"

2013-04-04 choroba Re: LibXML: Change a node into a comment Re:SoPW

2013-04-02 Anonymous Monk Re^3: Some issues with WWW::Mechanize::Firefox->xpath() method (xpath 1.0) Re:SoPW

2013-04-02 choroba Re^2: Some issues with WWW::Mechanize::Firefox->xpath() method (xpath 1.0) Re:SoPW

2013-03-21 choroba Re: LibXML, XPath and Namespaces Re:SoPW

2013-01-11 choroba Re: Perl - Modify the nested XML tags Re:SoPW

2013-01-07 choroba Re: to get next line of pattern matched Re:SoPW

2013-01-07 choroba Re: regular expression Re:SoPW

2012-12-14 choroba Re: Creating nested elements in XML::Smart Re:SoPW

2012-12-14 choroba Re: Fetch field values from API output Re:SoPW

2012-12-06 choroba Re: From a given text Extract the root HTML element inner text Re:SoPW

2012-11-22 choroba Re: Finding max value from a unique tag from XML Re:SoPW

2012-11-21 choroba Re: Adding Elements to XML Re:SoPW

2012-11-20 choroba Re: hi i want to retrieve the element and values from xml document Re:SoPW

2012-11-16 choroba Re: XML Newbie Re:SoPW

2012-10-16 sundialsvc4 Re: Search Entire Excel Workbook For Text Re:SoPW

2012-10-16 choroba Re: Some questions from beginning user of XML::LibXML and XPath Re:SoPW

2012-10-12 choroba Re^2: XML::Simple XML / XMLin / XMLout? or something else? Re:SoPW

2012-10-05 choroba Re: Struggling with XML Re:SoPW

2012-05-28 choroba Re: XML - Escaping characters from database for XML Re:SoPW

2012-05-25 choroba Re: Need help for Xpath patterns Re:SoPW

2012-04-23 choroba Re: finding each and every node of a xml document Re:SoPW

2012-04-19 choroba Re: searching a empty XML tag or self enclosing tags Re:SoPW

2012-03-01 choroba Re: Remove level of elements (preserving their children) in XML::Twig? Re:SoPW

2012-02-14 choroba Re: libxml - insert node Re:SoPW

2012-02-10 choroba Re: conditional input field separator? Re:SoPW

2012-01-26 choroba Re^2: parsing multi level XML with XML::Simple Re:SoPW

2012-01-15 choroba Re: Is there any XML reader like this? Re:SoPW

2012-01-15 trwww Re^2: RFC iEngine Re:Med

2012-01-13 repellent Re: RFC iEngine Re:Med

2012-01-12 choroba Re: Building a tree 1 leaf at a time Re:SoPW

2011-12-19 choroba Re: XML::Twig - Using xpath with twig roots Re:SoPW

2011-12-09 choroba Re: how to get attribute values and store in a hash. Re:SoPW

2011-11-30 choroba Re: compacting XML? Re:SoPW

2011-11-18 choroba Re: Modify XML tags Re:SoPW

2011-11-18 choroba Re: Multiple XML files from Directory to One XML file using perl. Re:SoPW

2011-11-14 choroba Re: Match on line, read backwards to opening xml tag then forward to closing tag Re:SoPW

2011-11-04 choroba Re: help me with perl script that add xml attritutes Re:SoPW

2011-11-01 vagabonding electron Re^6: How to get paired values from the nested XML structure? Re:SoPW

2011-11-01 marto Re^5: How to get paired values from the nested XML structure? Re:SoPW

2011-11-01 marto Re^3: How to get paired values from the nested XML structure? Re:SoPW

2011-11-01 vagabonding electron Re^2: How to get paired values from the nested XML structure? Re:SoPW

2011-11-01 choroba Re: How to get paired values from the nested XML structure? Re:SoPW

2011-10-12 choroba Re: How to get empty tag value in XML::XPath Re:SoPW

2011-10-07 choroba Re: Problem in String Replacement Re:SoPW

2011-10-04 veerubiji Re^2: perl script to print xml data like this Re:SoPW

2011-10-04 choroba Re: perl script to print xml data like this Re:SoPW

2011-09-30 choroba Re: Hash Table genaration using perl Re:SoPW

2011-09-29 choroba Re: How can I replace a line (tag) in an XML file? Re:SoPW

2011-09-27 choroba Re: XML::Twig removing tags from content Re:SoPW

2011-09-27 choroba Re: Replacing XML Tag name with another Re:SoPW

Found 20 nodes roughly between 2011-09-27 and 2008-12-04 (searched 19.35% of DB).

2011-08-26 Anonymous Monk Re: Search and replace query Re:SoPW

2011-08-19 choroba Re: Question about XML::DOM::Lite Re:SoPW

2011-07-15 Anonymous Monk Re: XML::LibXML - WHAR HASH TREES WHAR?! Re:SoPW

2011-06-25 choroba Re: How do I ignore comments in an xml file when using win32::ole? Re:SoPW

2011-06-07 choroba Re: Group XML Re:SoPW

2011-05-10 choroba Re: Replace MathML content using Twig Re:SoPW

2011-05-03 choroba Re: Missing values in XML::Twig Output Re:SoPW

2011-04-21 choroba Re: How do I get LibXML to replace attribute values? Re:SoPW

2011-04-12 choroba Re: Changing XML Tag value in Perl Script Re:SoPW

2011-04-12 choroba Re: perl xpath extraction Re:SoPW

2011-04-05 choroba Re: Modify XML Re:SoPW

2011-03-25 choroba Re: regex replace using position loop Re:SoPW

2010-12-03 choroba Re: XPathing Up level Re:SoPW

2010-11-11 choroba Re: XML::XPath - node-to-xpath reverse lookup Re:SoPW

2010-10-29 choroba Re: XPath command line utility... Re:SoPW

2010-10-26 choroba Re: Perl and Lib::XML usage Re:SoPW

2010-10-22 choroba Re: Having problems accessing individual attributes in xml Re:SoPW

2010-09-14 choroba Re: Deleting XML element using XML::LibXML Re:SoPW

2010-08-12 choroba Re: Sort xml based on attribute Re:SoPW

2010-08-05 choroba Re: How I can change value in XML file ? Re:SoPW

Found 2 nodes roughly between 2008-12-04 and 2006-03-20 (searched 18.38% of DB).

2007-03-05 merlyn Re: XML gurus unite!! Re:SoPW

2007-02-13 merlyn Get most recently refreshed CPAN mirror in your country Snippet

Found 7 nodes roughly between 2006-03-20 and 2005-03-09 (searched 9.67% of DB).

2005-10-13 saintmike Re: html analysis tool via regex Re:SoPW

2005-08-22 merlyn Re: Looking for a XPATH-like tool for HTML documents Re:SoPW

2005-05-20 rg0now Re^3: xQuery functionality in Perl? Re:SoPW

2005-05-11 rg0now Re: XML Parsing Suggestions? Re:SoPW

2005-05-08 merlyn Re: Replacing everything in between using s///; Re:SoPW

2005-04-08 rg0now Re: Parsing XML/HTML Re:SoPW

2005-03-21 merlyn Re: XML::Simple "transforming data" Re:SoPW

Found 6 nodes roughly between 2005-03-09 and 2003-06-22 (searched 16.44% of DB).

2004-08-26 merlyn Re: Delete from string through s/// Re:SoPW

2004-06-01 ambrus ambrus's scratchpad SPad

2004-04-12 merlyn Re: Just use an XSLT stylesheet Re:Med

2003-10-27 princepawn HTML Templating as Tree Rewriting: Part I: "If Statements" Med

2003-10-22 merlyn Screen-scraping using XSH - O'Reilly Animal lister Code

2003-08-18 merlyn Re: XML::XPath Re:SoPW

Found 1 node roughly between 2003-06-22 and 2001-06-13 (searched 17.41% of DB).

2002-12-01 larsen Re: tgrep - A grep for XML/HTML tags Re:Code

And see xpath2html below. I toyed with it a few years ago; I couldn't make a go of it with XML::XPathEngine, but this worked about as well as I needed. You can ditch HTML::Element for a proper DOM API like XML::LibXML::Element.

xpath2html

#!/usr/bin/perl --
use strict;
use warnings;
use HTML::Element;
{
    my $root;       # top of the tree being built
    my $current;    # element the next step gets appended to
    for my $step (
        grep length,
        split '/',
q!/html/body/div[@id='wrapper']/div[@id='outer']/div[@id='inner']/div[@id='center']/div[@id='main']/div[2]/table[@id='wrappedcontent']/tbody/tr/td/table/tbody/tr[2]/td[2]!,
      )
    {
        # split "tag[predicate]" into the tag name and the predicate text
        my ( $tag, $att ) = $step =~ /^([^\[]+)\[?(.*?)\]?$/;
        warn "step($step)tag($tag)att($att) \n";

        if ( $current and $root ) {
            if ( $att =~ /^\d+$/ ) {
                # positional predicate [N]: create N siblings, descend into the last
                my $new;
                for my $n ( 1 .. $att ) {
                    $new = HTML::Element->new($tag,
                    ncount => $n
                    );
                    $current->push_content($new);
                }
                $current = $new;
            } elsif( $att =~/\@(\w+)(?:[^=]*=['"]*([^'"]+)['"]*)?$/ ) {
                # attribute predicate [@name='value']: copy it onto the new element
                my $new = HTML::Element->new($tag, $1 => $2 );
                $current->push_content($new);
                $current = $new;
            } else {
                # plain step: a single child element
                my $new = HTML::Element->new($tag);
                $current->push_content($new);
                $current = $new;
            }
        }
        else {
            # first step becomes the root element
            $root    = HTML::Element->new( $tag );
            $current = $root;
        }
    }
    undef $current;
    # escape only > < &, indent with four spaces
    print $root->as_HTML( '><&' => "    " );
    $root->delete;
    undef $root;
}

__END__
step(html)tag(html)att()
step(body)tag(body)att()
step(div[@id='wrapper'])tag(div)att(@id='wrapper')
step(div[@id='outer'])tag(div)att(@id='outer')
step(div[@id='inner'])tag(div)att(@id='inner')
step(div[@id='center'])tag(div)att(@id='center')
step(div[@id='main'])tag(div)att(@id='main')
step(div[2])tag(div)att(2)
step(table[@id='wrappedcontent'])tag(table)att(@id='wrappedcontent')
step(tbody)tag(tbody)att()
step(tr)tag(tr)att()
step(td)tag(td)att()
step(table)tag(table)att()
step(tbody)tag(tbody)att()
step(tr[2])tag(tr)att(2)
step(td[2])tag(td)att(2)
<html>
    <body>
        <div id="wrapper">
            <div id="outer">
                <div id="inner">
                    <div id="center">
                        <div id="main">
                            <div ncount="1">
                            </div>
                            <div ncount="2">
                                <table id="wrappedcontent">
                                    <tbody>
                                        <tr>
                                            <td>
                                                <table>
                                                    <tbody>
                                                        <tr ncount="1">
                                                        </tr>
                                                        <tr ncount="2">
                                                            <td ncount="1">
                                                            </td>
                                                            <td ncount="2">
                                                            </td>
                                                        </tr>
                                                    </tbody>
                                                </table>
                                            </td>
                                        </tr>
                                    </tbody>
                                </table>
                            </div>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </body>
</html>
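
For the XML::LibXML route mentioned above, here is a minimal sketch of the same idea. It is not a drop-in replacement for xpath2html: the sub name xpath2dom and the shortened path are made up for illustration, it handles only child steps with an optional [N] or [@name='value'] predicate, it does not add the ncount attribute, and it will break on predicate values that contain a slash. The round trip through findnodes at the end is just a sanity check that the generated skeleton really satisfies the expression.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Build a skeleton document from a simple XPath.
# Hypothetical helper: child steps only, with [N] or [@name='value'].
sub xpath2dom {
    my ($xpath) = @_;
    my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );
    my $current;
    for my $step ( grep length, split m{/}, $xpath ) {
        my ( $tag, $pred ) = $step =~ /^([^\[]+)(?:\[(.*)\])?$/
            or die "cannot parse step '$step'";
        $pred = '' unless defined $pred;

        # [N] needs N siblings so that position N exists; otherwise one element
        my $count = $pred =~ /^(\d+)$/ ? $1 : 1;
        die "position must be at least 1 in '$step'" if $count < 1;

        my $new;
        for ( 1 .. $count ) {
            $new = $doc->createElement($tag);
            if   ($current) { $current->appendChild($new) }
            else            { $doc->setDocumentElement($new) }
        }

        # [@name='value'] becomes an attribute on the new element
        if ( $pred =~ /^\@([\w:-]+)\s*=\s*(['"])(.*?)\2$/ ) {
            $new->setAttribute( $1, $3 );
        }
        $current = $new;
    }
    return $doc;
}

my $xpath =
  q!/html/body/div[@id='main']/div[2]/table[@id='wrappedcontent']/tbody/tr[2]/td[2]!;
my $doc = xpath2dom($xpath);
print $doc->toString(1);

# Round trip: the expression should now select a node in the skeleton.
my @hits = $doc->findnodes($xpath);
print scalar(@hits), " node(s) matched\n";
```

Because the result is a real DOM, you can keep working on it with findnodes, appendChild and friends instead of printing and reparsing.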
