Don't use XML::Simple!
perl -e "use Data::Dumper; use XML::Rules; print Dumper(XML::Rules::inferRulesFromExample('c:\temp\inventors.xml'))"
prints:
$VAR1 = {
'inventors' => 'no content',
'number' => 'as is',
'inventor' => 'as array no content',
'city,country,name,upper-name' => 'content'
};
With rules like these, XML::Rules would produce a data structure like this:
{
'inventors' => {
'inventor' => [
{
'country' => 'GB',
'city' => 'Aston Clinton',
'upper-name' => 'ANDY BARTH',
'number' => {
'_content' => '1',
'type' => 'integer'
},
'name' => 'Andy Barth'
},
{
'country' => 'GB',
'city' => 'Aylesbury',
'upper-name' => 'DANIELE DALL\'ACQUA',
'number' => {
'_content' => '2',
'type' => 'integer'
},
'name' => 'Daniele Dall\'Acqua'
},
{
'country' => 'GB',
'city' => 'Calne',
'upper-name' => 'NIGEL DREW',
'number' => {
'_content' => '3',
'type' => 'integer'
},
'name' => 'Nigel Drew'
}
],
'type' => 'array'
}
};
Now, I do not care about the 'type' => 'integer'; I'd rather get just the content of <number> as well, so let's change the rule for the <number> tag to 'content'. This changes the structure to
{
'inventors' => {
'inventor' => [
{
'country' => 'GB',
'city' => 'Aston Clinton',
'upper-name' => 'ANDY BARTH',
'number' => '1',
'name' => 'Andy Barth'
},
{
'country' => 'GB',
'city' => 'Aylesbury',
'upper-name' => 'DANIELE DALL\'ACQUA',
'number' => '2',
'name' => 'Daniele Dall\'Acqua'
},
{
'country' => 'GB',
'city' => 'Calne',
'upper-name' => 'NIGEL DREW',
'number' => '3',
'name' => 'Nigel Drew'
}
],
'type' => 'array'
}
};
Better, but I can do even better. If I know I want to get the inventors by number I can change the rule for the <inventor> tag to 'by number' and get a hash instead of an array:
{
'inventors' => {
'1' => {
'country' => 'GB',
'city' => 'Aston Clinton',
'upper-name' => 'ANDY BARTH',
'name' => 'Andy Barth'
},
'3' => {
'country' => 'GB',
'city' => 'Calne',
'upper-name' => 'NIGEL DREW',
'name' => 'Nigel Drew'
},
'type' => 'array',
'2' => {
'country' => 'GB',
'city' => 'Aylesbury',
'upper-name' => 'DANIELE DALL\'ACQUA',
'name' => 'Daniele Dall\'Acqua'
}
}
};
In which case getting the name of inventor #1 would be just $data->{inventors}{1}{name}. If the XML contains just the inventors, I can get rid of the 'inventors' level by changing its rule to 'pass', and it'd be just $data->{1}{name}. I also do not want the 'type' => 'array', so let's add " remove(type)" to the rule for <inventors>:
use strict;
use warnings;
use Data::Dumper;
use XML::Rules;

my $parser = XML::Rules->new(
    stripspaces => 7,
    rules => {
        # ignore the wrapper's own text and drop its type="array" attribute
        'inventors' => 'no content remove(type)',
        # key each <inventor> by the value of its <number>
        'inventor' => 'by number',
        'number,city,country,name,upper-name' => 'content',
    },
);

my $data = $parser->parsefile('c:\temp\inventors.xml');
#print Dumper($data);
print "The 1st inventor was $data->{inventors}{1}{name}\n";
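
If it helps to see what the 'by number' rule buys you, here is a rough sketch in plain Perl (no XML::Rules involved) that re-keys the array-shaped data from the earlier dump by hand. This is not how XML::Rules does it internally, just an illustration of the resulting shape; the variable names $inventors and %by_number are made up for the example, and the records are copied from the dump above with 'upper-name' left out for brevity.

```perl
use strict;
use warnings;

# Records as they looked with the 'as array' rule (copied from the dump above).
my $inventors = [
    { number => '1', name => 'Andy Barth',          city => 'Aston Clinton', country => 'GB' },
    { number => '2', name => q{Daniele Dall'Acqua}, city => 'Aylesbury',     country => 'GB' },
    { number => '3', name => 'Nigel Drew',          city => 'Calne',         country => 'GB' },
];

# Re-key the list by 'number' and drop that field from each record,
# which gives the same shape as the 'by number' output.
my %by_number;
for my $inv (@$inventors) {
    my %copy = %$inv;                  # shallow copy, leave the original alone
    my $key  = delete $copy{number};   # the key no longer needs to live in the record
    $by_number{$key} = \%copy;
}

# Hashes are unordered (note where '2' ended up in the dump above),
# so sort the keys numerically whenever the order matters.
for my $n (sort { $a <=> $b } keys %by_number) {
    print "$n: $by_number{$n}{name}\n";
}
```

With the rule in place you get this structure straight from the parser and skip the extra loop.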
Jenda
Enoch was right!
Enjoy the last years of Rome.