Some of you might have experienced this, too: you work with an XML document whose structure is complicated, and constructing XPath queries eventually becomes tedious, even in an interactive XML processing tool like XML::XSH2. A diagram would help.
Here is a short xsh script that generates dot source code. Just pipe its output to dot to get a nice graphical representation of the XML structure!
# xml2dot.xsh
# Author: E. Choroba 2013
open {$ARGV[0]} ;

# Per-name counters, used to give every graph node a unique identifier.
my $count ;

# Emit a leaf node: text, comment, or processing instruction.
# The "*" prefix keeps these counters apart from element and attribute names.
def processNonRec $type $parent $node {
    my $name = $type ;
    perl { $name .= $count->{"*$type"}++ } ;
    echo :s '"' $parent '"' " -> " '"' $name '"' ;
    echo :s '"' $name '" [label="' $type '()"]' ;
}

# Emit an element or attribute, then recurse into everything it contains.
# Elements with same-named siblings get a positional suffix such as [2];
# attributes are labelled with a leading "@". In the loop, the test
# count(.|../@*)=count(../@*) holds only for attribute nodes.
def processNode $parent $node {
    my $label = name($node);
    my $name ;
    perl { $name = $label."=".$count->{$label}++ } ;
    echo :s '"' $parent '"' " -> " '"' $name '"' ;
    if $node/self::* {
        if (count($node/../*[name()=$label]) > 1) {
            my $num = count($node/preceding-sibling::*[name()=$label]) ;
            $label = concat($label,"[",$num+1,"]") ;
        }
    } else {
        $label = concat("@",$label) ;
    }
    echo :s '"' $name '" [label="' $label '"]' ;
    for ($node/*
         | $node/@*
         | $node/text()
         | $node/comment()
         | $node/processing-instruction()) {
        if self::* processNode $name (.) ;
        if count(.|../@*)=count(../@*) processNode $name (.) ;
        if self::text() processNonRec 'text' $name (.) ;
        if self::comment() processNonRec 'comment' $name (.) ;
        if self::processing-instruction() processNonRec 'pi' $name (.) ;
    }
}

echo 'strict digraph' name() '{' ;
echo 'node [shape=box]' ;
processNode "document" /* ;
echo '}' ;
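To get a feel for what dot receives, here is a hand-written fragment with the shape that the echo statements above produce: one edge line and one label line per node. It is not captured output from the script; the node names and the file name sample.dot are made up for illustration.

```shell
# Hand-written sample, NOT captured script output. It only mimics the
# two kinds of lines the script echoes: edges and node labels.
cat > sample.dot <<'EOF'
strict digraph {
node [shape=box]
"document" -> "a=0"
"a=0" [label="a"]
"a=0" -> "x=0"
"x=0" [label="@x"]
"a=0" -> "text0"
"text0" [label="text()"]
}
EOF

# Render the sample only if Graphviz is installed.
if command -v dot >/dev/null 2>&1; then
    dot -Tpng sample.dot > sample.png
fi
```

The quoted identifiers such as "a=0" only need to be unique; what appears in the picture is the label attribute.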
Example usage:
xsh -al xml2dot.xsh data.xml | dot -Tpng > data.png
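If you want to keep the intermediate dot file or prefer a scalable image, something along these lines should work. It is only a sketch: render_svg is a hypothetical helper name, and it assumes that xsh, dot, and xml2dot.xsh are available in the current directory or on the PATH. The -Tsvg and -o options are standard Graphviz options.

```shell
# Hypothetical helper: writes NAME.dot and NAME.svg next to NAME.xml.
# Returns non-zero without doing anything if xsh or dot is missing.
render_svg() {
    xml=$1
    base=${xml%.xml}
    command -v xsh >/dev/null 2>&1 || { echo "xsh not found" >&2; return 1; }
    command -v dot >/dev/null 2>&1 || { echo "dot not found" >&2; return 1; }
    # Keep the dot source so it can be inspected or edited by hand.
    xsh -al xml2dot.xsh "$xml" > "$base.dot" &&
    dot -Tsvg "$base.dot" -o "$base.svg"
}
```

Calling render_svg data.xml would then leave data.dot and data.svg next to the input file.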