Re: XML for databases?!?! Is it just me or is the rest of the world nutz?
by mattr (Curate) on May 26, 2002 at 11:26 UTC
to separate design from code from style in a big web project.
to let you (maybe) easily add a mobile interface in the future, even for a smaller project.
to handle changing hierarchical data structures
to quickly search tree-based documents with a node-aware search paradigm (XPath), which does rock.
to maximize interoperability if you are sending lots of data to another party, i.e. data glue. E-commerce transactions made this kind of interchange format a holy grail some years ago.
to process markup-language-based data handed to you, including programming with a strong tree metaphor.
to drop tabular data into an XML db you're stuck with.
to work with cognitive science relational/hierarchical semantic data like grammar trees (thinking of hypernym tree in Lingua::WordNet)
ditto, to work with data from cognitive science that can only be meaningfully represented or accessed in a tree-based paradigm, for example statements in predicate calculus in the OpenCyc AI project. The huge knowledge base is a morass of interrelated assertions, which are themselves nested logical statements. Horrible, wonderfully neat stuff. See the Java XML API for it.
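To make the node-aware search point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The little hypernym tree, the element name synset, and the helper names hyponyms and leaves are all made up for illustration; this is not real WordNet data and not the Lingua::WordNet API.

```python
import xml.etree.ElementTree as ET

# Hypothetical hypernym tree, invented for illustration (not real WordNet data).
DOC = """
<wordnet>
  <synset word="entity">
    <synset word="animal">
      <synset word="dog">
        <synset word="puppy"/>
      </synset>
      <synset word="cat"/>
    </synset>
    <synset word="plant">
      <synset word="tree"/>
    </synset>
  </synset>
</wordnet>
"""

root = ET.fromstring(DOC)


def hyponyms(tree_root, word):
    """Return the words sitting directly beneath `word` in the tree."""
    # Find any descendant <synset> whose word attribute matches,
    # then step down to its immediate <synset> children.
    path = f".//synset[@word='{word}']/synset"
    return [node.get("word") for node in tree_root.findall(path)]


def leaves(tree_root):
    """Return the words of every <synset> that has no children."""
    return [n.get("word") for n in tree_root.iter("synset") if len(n) == 0]


print(hyponyms(root, "animal"))
print(leaves(root))
```

The search is expressed in terms of the tree ("children of the node whose word is animal") rather than string matching, which is the part that is hard to get from a flat table without extra joins.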
Yggdrasil, for example, is a neat-looking XML-based database; that is, it is supposed to represent data internally as a tree, which would make it very good for certain applications and bad for others. I wish I had a good problem that needed it. Actually, I do have some hierarchical data, but it is shallow enough to keep as serialized objects in an ordinary object store.
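For the shallow case, something like the following rough Python sketch is all it takes. The dict called store is just a stand-in for whatever object store you actually use, and the catalog data and the save/load helpers are hypothetical names for illustration.

```python
import pickle

# Stand-in for an ordinary object store: key -> serialized bytes.
store = {}


def save(key, obj):
    """Serialize the whole (shallow) hierarchy as one blob under `key`."""
    store[key] = pickle.dumps(obj)


def load(key):
    """Fetch the blob stored under `key` and rebuild the nested structure."""
    return pickle.loads(store[key])


# Hypothetical two-level hierarchy, invented for illustration.
catalog = {
    "name": "books",
    "children": [
        {"name": "perl", "children": []},
        {"name": "xml", "children": []},
    ],
}

save("catalog", catalog)
restored = load("catalog")
print([child["name"] for child in restored["children"]])
```

The tradeoff is that the store sees only an opaque blob, so you cannot query inside the hierarchy without loading it. That is fine when the tree is small and shallow, and it is exactly where a tree-structured database would start to earn its keep.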
As for data interoperability, consider genome processing, which seems to be the new benchmark for large projects whose changing data definitions would otherwise drive you insane. A poster above mentioned the use of XML in that case, at least for medium-sized projects. A different paradigm (BoulderIO, see bio.perl.org) also seems to be popular; it allows differently defined structured data sets to be processed in a pipeline.
It would seem that pushing XML too deep into your system could be real bad unless everything is XML-based. But used as a way to share schemas, it could be fantastic.
One thing I can say for sure is that I have seen some very slow XML processing systems, so display speed is a big issue for me. In particular, I know of one server that uses XML to reformat HTML files for different browsers; the developers are considering rewriting it in C++ because the Java version was too slow (or maybe incompetently developed, I haven't seen the code myself). So you may need to make a tradeoff. My guess is that initiatives like Sleepycat's will make those kinds of products easier to develop.
The other thing is that you may have to spend a lot of time on the interface and manuals if you are going to hand XML tools to end-users, since what they get out of the tools will be directly proportional to how well they understand them and how usable they are. I've written an introduction to XPath for end-users, which was not easy to do, and I have also seen the user interface and XPath search capabilities turn out to be major competitive points in the software.