perltutorial
jeffa
<p>[kobe://XML::Generator::DBI] is the replacement for
[kobe://DBIx::XML_RDB], which allowed you to convert a
database query into 'on the fly' XML. XML::Generator::DBI
does a lot more than just that, it generates SAX events
and relies upon a SAX handler to handle them in some manner. This
adds more complexity to the process, but it also adds more
flexibility, allowing you to do more than simply create a
string or XML file. This tutorial aims to showcase the
features of XML::Generator::DBI while providing concrete
examples of sample usage.</p>
<readmore>
<p>You can download each example individually, or download
them all via
[http://jeffa.perlmonk.org/xmlgen_tut/examples.tar.gz|tar ball]</p>
<hr />
<h2>Table of Contents</h2>
<ul>
<li><a href="#ex1">Example 1: DBI toXML</a></li>
<li><a href="#ex2">Example 2: Working with XPath</a></li>
<li><a href="#ex3">Example 3: DBI to XML to HTML</a></li>
<li><a href="#ex4">Example 4: DBI to XML to HTML (revisited)</a></li>
<li><a href="#ex5">Example 5: A Dynamic CGI Script</a></li>
<li><a href="#credits">Credits</a></li>
</ul>
<hr />
<h2><a name="ex1">Example 1: DBI to XML</a></h2>
<p>[download [http://jeffa.perlmonk.org/xmlgen_tut/example1|example 1]]</p>
<p>Say you want to simply convert a database query into an
XML document and your database table, <i>foo</i>, looks like this:</p>
<pre>
+-------+------+
| bar | baz |
+-------+------+
| one | 1 |
| two | 2 |
| three | 3 |
+-------+------+
</pre>
<p>The following code will convert the results of the SQL statement
'select bar,baz from foo' into an XML document,
<i>foo.xml</i>.</p>
<p>First we use the appropriate modules. XML::Generator::DBI
needs a SAX handler to create the file, a handler such as
[kobe://XML::Handler::YAWriter]:</p>
<code>
use strict;
use DBI;
use XML::Generator::DBI;
use XML::Handler::YAWriter;
</code>
<p>Next we connect to the database, you will need to replace
<i>vendor</i>, <i>database</i>, <i>host</i>, <i>user</i>, and
<i>pass</i>; with your credentials:</p>
<code>
my $dbh = DBI->connect(
qw(DBI:vendor:database:host user pass),
{RaiseError=>1},
);
</code>
<p>Next we instantiate the SAX handler with the <b>AsFile</b>
argument set the name of the file we wish to create:</p>
<code>
my $handler = XML::Handler::YAWriter->new(AsFile => 'foo.xml');
</code>
<p>Next we instantiate the generator. We pass our database
handle and our SAX handler to the constructor, and specify
that we want indentation turned on via the <b>Indent</b>
argument (if you know that humans will not need to see the
XML, then save some bytes and don't indent):</p>
<code>
my $generator = XML::Generator::DBI->new(
Handler => $handler,
dbh => $dbh,
Indent => 1,
);
</code>
<p>Finally, we execute our SQL query with the generator's
execute() method:</p>
<code>
$generator->execute('select bar,baz from foo');
</code>
<p>And that's it. Providing that we properly connected to the
database, had no errors in our SQL statement, and results
were actually returned from the query, the file
<i>foo.xml</i> will be created and it will contain
the query results wrapped in XML elements. Here is what my
results looked like:</p>
<pre>
<?xml version="1.0" encoding="UTF-8"?><database>
<select query="select bar,baz from foo">
<row>
<bar>one</bar>
<baz>1</baz>
</row>
<row>
<bar>two</bar>
<baz>2</baz>
</row>
<row>
<bar>three</bar>
<baz>3</baz>
</row>
</select>
</database>
</pre>
<p>There is a lot of magic going on behind the scenes. Here
is an illustration of the pipeline through which the data
flows and is transformed:</p>
<pre>
XML::Generator::DBI
-> XML::Handler::YAWriter
</pre>
<p>XML::Generator::DBI fetches the query results from the
database and generates SAX events. These SAX events are
handled by XML::Handler::YAWriter, which writes them out to
the appropriate destination as XML elements.</p>
<hr />
<h2><a name="ex2">Example 2: Working with XPath</a></h2>
<p>[download [http://jeffa.perlmonk.org/xmlgen_tut/example2|example 2]]</p>
<p>A powerful feature of DBI::Generator::XML is it's ability
to work with any SAX Level 1 handler. Instead of writing to
file, you can instead pass the XML to a handler such as
[kobe://XML::XPath]'s XML::XPath::Builder for further
processing. Let's give it a try, first we import the
necessary modules:</p>
<code>
use strict;
use DBI;
use XML::Generator::DBI;
use XML::XPath;
use XML::XPath::Builder;
</code>
<p>Next, we connect to the database and fetch a handle:</p>
<code>
my $dbh = DBI->connect(
qw(DBI:vendor:database:host user pass),
{RaiseError=>1},
);
</code>
<p>Next, we instantiate our SAX handler:</p>
<code>
my $handler = XML::XPath::Builder->new();
</code>
<p>Next, we instatiate the generator and pass it the handler
(and notice that Indenting is not turned on -
XML::XPath::Builder does not need it):</p>
<code>
my $generator = XML::Generator::DBI->new(
Handler => $handler,
dbh => $dbh,
);
</code>
<p>Next, we call the <i>execute()</i> method from the
generator - just like Example 1, but this time we actually
receive an l-value from the <i>generator()</i> method,
an XML::XPath::Node::Element object:</p>
<code>
my $xp = $generator->execute('select bar,baz from foo');
</code>
<p>Finally, we use this object to obtain a list of the nodes
we want (consult the docs for XML::XPath and
XML::XPath::Builder for more info on those modules):</p>
<code>
my $nodeset = $xp->find('/database/select/row/bar');
print $_->string_value, "\n" for $nodeset->get_nodelist;
</code>
<p>This will yield the following output (to STDOUT):</p>
<pre>
one
two
three
</pre>
<p>Our pipeline for this example looks like:</p>
<pre>
XML::Generator::DBI
-> XML::XPath::Builder
</pre>
<p>Now, of course we could have achieved roughly the same
results with a simple database query, but the point of this
example is to show that XML::Generator::DBI will work with
any SAX1 handler. It even works with SAX2 handlers via
[kobe://XML::Filter::SAX1toSAX2], as we shall see in the next
example.</p>
<hr />
<h2><a name="ex3">Example 3: DBI to XML to HTML</a></h2>
<p>[download [http://jeffa.perlmonk.org/xmlgen_tut/example3|example 3]]</p>
<p>Let's try a SAX2 handler such as
[kobe://XML::Handler::HTMLWriter], which transforms SAX
events into HTML elements. In order for XML::Generator::DBI's
SAX1 events to be processed by XML::Handler::HTMLWriter, they
will have to first be converted to SAX2 events. This is
accomplished with [kobe://XML::Filter::SAX1toSAX2]. Let's
take a look at the pipeline first:</p>
<pre>
XML::Generator::DBI
-> XML::Filter::SAX1toSAX2
-> XML::Handler::HTMLWriter
</pre>
<p>In a moment, you will see that we instantiate the
XML::Handler::HTMLWriter object, attach it to an
XML::Filter::SAX1toSAX2 object, and pass the filter to
XML::Generator::DBI. This chain of events might give the
incorrect impression that XML::Handler::HTMLWriter is SAX1
and XML::Generator::DBI is SAX2 (it had me confused at
first). Visualize the flow of events like the pipeline
illustrates, not in the order that the objects are
instantiated.</p>
<p>First, we import the necessary modules, and connect to the
database:</p>
<code>
use strict;
use DBI;
use XML::Generator::DBI;
use XML::Filter::SAX1toSAX2;
use XML::Handler::HTMLWriter;
my $dbh = DBI->connect(
qw(DBI:vendor:database:host user pass),
{RaiseError=>1},
);
</code>
<p>Next, we instatiate an XML::Handler::HTMLWriter handler,
which outputs to STDOUT by default:</p>
<code>
my $handler = XML::Handler::HTMLWriter->new();
</code>
<p>Next, we will need to filter this handler:</p>
<code>
my $filter = XML::Filter::SAX1toSAX2->new(Handler => $handler);
</code>
<p>Next, a trick - XML::Handler::HTMLWriter expects the first
element it encounters to be <html>, but
XML::Generator::DBI's first element is <database>. No
problem - we can override the name of the first element with
the <strong>RootElement</strong> argument. The next pitfall
is that instead of our tabular XML data being wrapped in
<table> elements, they are wrapped in <select>
elements - we can override this with the
<strong>QueryElement</strong> argument. Yet another pitfall
is that rows are wrapped in <row> elements, not
<tr> elements - this is remedied by overriing the
<strong>RowElement</strong> argument. And finally, the last
pitfall is that the elements that make up each column name
are the column names, not <td>. XML::Generator::DBI
does have a <strong>ColumnElement</strong>, but we can't
touch it. We can, however, change the name of each column
name within SQL, via the AS keyword. Yes, this really
works:</p>
<code>
my $generator = XML::Generator::DBI->new(
Handler => $filter,
dbh => $dbh,
RootElement => 'html',
QueryElement => 'table',
RowElement => 'tr',
Indent => 1,
);
$generator->execute('select foo as td,bar as td from baz');
</code>
<p>And the results are:</p>
<pre>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<table query="select foo as td,bar as td from baz">
<tr>
<td>foo</td>
<td>1</td>
</tr>
<tr>
<td>bar</td>
<td>2</td>
</tr>
<tr>
<td>baz</td>
<td>3</td>
</tr>
</table>
</html>
</pre>
<p>While this works, it is no doubt not the correct procedure
- there is no 'query' attribute for a <table> element.
Also, it would be nice to show the names of each column as
table headings (<th> tags). There has to be a better
way ...</p>
<hr />
<h2><a name="ex4">Example 4: DBI to XML to HTML (revisited)</a></h2>
<p>[download [http://jeffa.perlmonk.org/xmlgen_tut/example4|example 4], download [http://jeffa.perlmonk.org/xmlgen_tut/foo.xsl|foo.xsl]]</p>
<p>This time, we utilize the power of XSLT to transform an
XML document to an HTML document. Discussing XSLT is beyond
the scope of this review, but in a nutshell, XSL is a
language for expressing style sheets (XSLT stands for XSL
Transformations), and just so happens that it does a good job
of generating HTML from XML. You can read more about XSL and
XSLT at [http://www.w3.org/Style/XSL].</p>
<p>The SAX Level 2 [kobe://XML::Filter::XSLT] module can be
used with [kobe://XML::SAX::Writer] to apply a stylesheet to
the XML content. Our pipeline looks like this:</p>
<pre>
XML::Generator::DBI
-> XML::Filter::SAX1toSAX2
-> XML::Filter::XSLT
-> XML::SAX::Writer
</pre>
<p>Here we go. First, import the modules and connect to the database:</p>
<code>
use strict;
use DBI;
use XML::Generator::DBI;
use XML::Filter::SAX1toSAX2;
use XML::Filter::XSLT;
use XML::SAX::Writer;
my $dbh = DBI->connect(
qw(DBI:vendor:database:host user pass),
{RaiseError=>1},
);
</code>
<p>Next, create the pipeline:</p>
<code>
my $writer = XML::SAX::Writer->new();
my $xsl_filt = XML::Filter::XSLT->new(Handler => $writer);
my $sax_filt = XML::Filter::SAX1toSAX2->new(Handler => $xsl_filt);
my $generator = XML::Generator::DBI->new(
Handler => $sax_filt,
dbh => $dbh,
);
</code>
<p>Because XML::SAX::Writer is at the end of the chain, it
will override XML::Generator::DBI's indenting, so there is no
need to turn it on.</p>
<p>Finally, supply the stylesheet (given below) to the
XML::Filter::XSLT object and generate the output:</p>
<code>
$xsl_filt->set_stylesheet_uri('foo.xsl');
$generator->execute('select bar,baz from foo');
</code>
<p>XML::SAX::Writer currently does not indent it's output, so
i piped it through a Perl one-liner:</p>
<pre>
./example4.pl | perl -pe "s/></>\n</g"
<?xml version='1.0'?>
<html>
<body>
<table>
<tr>
<th>Bar</th>
<th>Baz</th>
</tr>
<tr>
<td>one</td>
<td>1</td>
</tr>
<tr>
<td>two</td>
<td>2</td>
</tr>
<tr>
<td>three</td>
<td>3</td>
</tr>
</table>
</body>
</html>
</pre>
<p>And finally, here is the XSL stylesheet - save this as the
file <strong>foo.xsl</strong>:</p>
<pre>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<table>
<tr>
<th>Bar</th>
<th>Baz</th>
</tr>
<xsl:for-each select="database/select/row">
<tr>
<td><xsl:value-of select="bar"/></td>
<td><xsl:value-of select="baz"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
</pre>
<hr />
<h2><a name="ex5">Example 5: A Dynamic CGI Script</a></h2>
<p>[download [http://jeffa.perlmonk.org/xmlgen_tut/example5|example 5], download [http://jeffa.perlmonk.org/xmlgen_tut/dynamic.xsl|dynamic.xsl]]</p>
<p>Example 4 presented XSLT, but did so in fairly inflexible
manner. The stylesheet used 'hard-coded' values: not only did
we hard code the names of each row, we also hard coded the
table headings. What if we don't know ahead of time what the
names of the database columns will be? Is there a way
abstract this information? You bet - it's called XPath, which
was used in the stylesheet from example 4. But this time we
use the power of XPath's * wildcard and <i>text()</i>
function to accept any database column name. In order to
provide dynamic column headers, we will utilize the
XML::Generator::DBI constructor's
<strong>ShowColumns</strong> argument.</p>
<p>Let's first take a peak at what the XML looks like for our
sample database when we specify the
<strong>ShowColumns</strong> argument:</p>
<pre>
<?xml version="1.0" encoding="UTF-8"?><database>
<select query="select bar,baz from foo">
<columns>
<column>
<name>bar</name>
<type>varchar</type>
<size>255</size>
<precision>32</precision>
<scale>0</scale>
<nullable>NULL</nullable>
</column>
<column>
<name>baz</name>
<type>integer</type>
<size>10</size>
<precision>8</precision>
<scale>0</scale>
<nullable>NULL</nullable>
</column>
</columns>
<row>
<bar>one</bar>
<baz>1</baz>
</row>
..... etc.
</pre>
<p>With this information we can dynamically build our column
headings for our table. And here is the new stylesheet,
<i>dynamic.xsl</i>. It very similar to the last
one, but will work with any SQL select query:</p>
<pre>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<table>
<tr>
</pre>
<p>This time, instead of hard coding the field names we loop
through the <name> elements provided by
XML::Generator::DBI. We obtain the contents of those elements
with the <i>text()</i> function:</p>
<pre>
<xsl:for-each select="database/select/columns/column/name">
<th><xsl:value-of select="text()"/></th>
</xsl:for-each>
</tr>
<xsl:for-each select="database/select/row">
<tr>
</pre>
<p>Here, instead of relying upon hard-coded field names, we
utiltize the * wildcard to match any element, and obtain it's
content again with <i>text()</i> function:</p>
<pre>
<xsl:for-each select="*">
<td><xsl:value-of select="text()"/></td>
</xsl:for-each>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
</pre>
<p>For our CGI script we will offer the end user a checkbox
group of columns from a database table we designate. We could
let them chose the table and the columns, but this way is
more secure. We will still have to make sure that column
names the end user supplies provides are valid, as a web bot
does not need to use our interface to interact with the
script. I chose to use my MP3 collection database, feel free
to use whatever database you desire. First we import the
necessary modules:</p>
<code>
use strict;
use DBI;
use CGI qw(:standard);
use CGI::Carp qw(fatalsToBrowser);
use XML::SAX::Writer;
use XML::Generator::DBI;
use XML::Filter::SAX1toSAX2;
use XML::Filter::XSLT;
</code>
<p>Next, we specify the name of our stylesheet and the
columns we will allow the end user to select:</p>
<code>
my $style = 'dynamic.xsl';
my %field = (
name => 'artist.name as Artist',
album => 'album.title as Album',
title => 'song.title as Song',
year => 'album.year as Year',
);
</code>
<p>Next, we print the header and our form:</p>
<code>
print header, start_html, start_form,
checkbox_group(
-name => 'fields',
-values => [ keys %field ],
),
submit('go'), end_form,
;
</code>
<p>Next, we check to see if the user has submitted the
form:</p>
<code>
if (param('go')) {
</code>
<p>The next line only accepts those column names that we are
allowing the user to select. If the user doesn't select any
or provides invalid column names, then the array @ok will be
empty:</p>
<code>
my @ok = map { $field{$_} } grep $field{$_}, param('fields');
</code>
<p>If @ok is empty, we issue a <strong>die</strong>, which
will be intercepted by CGI::Carp. Otherwise we continue by
joining the valid column names with commas:</p>
<code>
die 'no valid fields selected' unless @ok;
my $select = join(',', @ok);
</code>
<p>Connect to the database. This time i specify the database
<i>mp3</i>. Your milleage will vary:</p>
<code>
my $dbh = DBI->connect(
qw(DBI:vendor:mp3:host user pass),
{RaiseError=>1},
);
</code>
<p>Next, create the pipeline - notice that
<strong>ShowColumns</strong> is turned on:</p>
<code>
my $writer = XML::SAX::Writer->new();
my $xsl_filt = XML::Filter::XSLT->new(Handler => $writer);
my $sax_filt = XML::Filter::SAX1toSAX2->new(Handler => $xsl_filt);
my $generator = XML::Generator::DBI->new(
Handler => $sax_filt,
dbh => $dbh,
ShowColumns => 1,
);
</code>
<p>Next, check that the stylesheet exists and is readable by
the web server and die if it is not. If all is well, process
the stylesheet:</p>
<code>
die "could not open file $style" unless -r $style;
$xsl_filt->set_stylesheet_uri($style);
</code>
<p>Finally, execute our SQL statement and send the results to
the browser:</p>
<code>
$generator->execute("
select $select
from song
inner join album on song.album_id=album.id
inner join artist on album.artist_id=artist.id
order by artist.name,album.year,album.title
");
}
</code>
<hr />
<h2><a id="credits" name="credits">Credits</a></h2>
<p>Big thanks to author of XML::Generator::DBI, Matt Sergeant
([Matts]). Without his help, guidance, corrections and
patches this tutorial could not have been written. Thanks
once again Matt. :)</p>
<hr />
<h2><a id="bugs" name="bugs">Bugs</a></h2>
<p>While working with example 5, i discovered that CGI::Carp
did not play well with XML::SAX::Writer. I found that i had
older versions of both modules, so i installed the latest
versions for both. This fixed the CGI::Carp issue, but it
also broke example 3 - XML::Handler::HTMLWriter inherits from
XML::SAX::Writer and tries to call XML::SAX::Writer's
<i>start_document()</i> and <i>end_document()</i>
methods, which no longer exist in v0.41. These methods do
exist in v0.39, however, so i installed XML::SAX::Writer
v0.39 and both examples 3 and 5 worked for me. This leads me to suspect that you will need to install the latest CGI::Carp
in order for example 5 to work with it.</p>
<p>Also, example 5 has a couple of known bugs: an XML header
is printed when one is not needed. Also, opening and close
<html> and <body> tags are generated when they
too are not needed. Either i am missing an API call or two,
or the funtionality to omit these has not yet been
implemented. While most (if not all) browsers will accept
this, it surely is not valid XHTML. Also, i do not know of a
way to attach a Cascading Style Sheet to the final example -
this would be nice. Please /msg [jeffa] or post a reply if
you have any information about these issues.</p>
<p>(Hopefully, this entire 'Bugs' secion will disappear
soon.)</p>
<hr />
<h2><a id="todo" name="todo">TODO</a></h2>
<p>Do you have a favorite SAX handler that you would like to
include? Then this is your chance to add to this tutorial.
Feel free to post additional examples with
XML::Generator::DBI.</p>