XML Databases and Related Comfort Food
By Sean McGrath
This might be good or bad depending on how you look at it: Human decision-making, even in the rarefied atmosphere of corporate-level IT strategy, can be extremely unscientific. Supposedly, coldly analytical thinking and mathematically precise decision-making heavily populate the IT field. And yet, when you look closely, the technical decision-making process evidences plenty of tentative, diffident, even romantic reasoning.
The concept of a "database" exemplifies an area where decision-making can be less than rational. Just think: for the last 30 years, billions of dollars have been spent cutting complex information structures into shavings suitable for storing in stultifyingly restrictive systems known as relational databases. The cold mathematical logic of the relational database model is unassailable. It is TRUTH. We must OBEY....
Over the last thirty years, slavery to the relational data model has resulted in some bizarre contortions of business systems to fit the needs of the model. As the engineers were pulling their hair out to fit their businesses to the model, the marketers got cracking on the CTOs, CIOs, and CEOs. The word "database" became synonymous with words like "safe", "powerful", "managed", and so on. Databases became comfort food and brand-name databases became a capital expenditure nobody got fired for making.
How many databases, I wonder, were purchased over the last thirty years because somebody close enough to the chequebook liked the warm, fuzzy feeling the concept of a database provided? Quite a few, I suspect.
On my travels, I come across many database systems that are unnecessarily complex and expensive because various brand names were purchased as comfort food and foisted upon a long-suffering IT department. I also come across many database systems created by IT folk who substitute a prayer to the SQL muse and the power of the dollar for good design.
After thirty years, the relational database world is seeing its biggest-ever threat on the top shelf of the comfort food aisle. Relational data is yesterday's news. Today, every self-respecting CEO, CIO, and CTO is reading marketing propaganda from XML vendors. If you have XML and you are doing serious commercial stuff with it, then you need to put it in a database, right?
Maybe, maybe not.
Firstly, a lot of the on-the-wire XML that results from e-commerce systems is transient data, and the need for a full-blown repository of the stuff is questionable. For non-transient XML, such as that used in document repositories, there are numerous ways to store the XML, only one of which is an "XML database" in the true sense of the phrase.
Storing XML in a database and storing XML in an XML database are very different things. The former often means packing the XML into relational tables. The latter typically means that the database engine provides some sort of built-in support for the hierarchical structures that XML allows you to model. The former technique of packing the XML into tables sounds like a hack (and indeed it is), but it is a very useful hack. Although the XML database vendors will tell you otherwise, sometimes it is all you need in an XML database.
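To make the "useful hack" concrete, here is a minimal sketch of one way to pack XML into relational tables, using Python's standard sqlite3 and xml.etree.ElementTree modules. The table names (docs, nodes), the helper names (store_xml, find_docs) and the sample purchase orders are all invented for illustration; this is one possible layout under those assumptions, not a recommended schema.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_xml(conn, doc_id, xml_text):
    """Keep the whole document as text and also shred its leaf values into rows."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is not well-formed
    conn.execute("INSERT INTO docs (id, body) VALUES (?, ?)", (doc_id, xml_text))

    rows = []

    def walk(elem, path):
        here = path + "/" + elem.tag
        text = (elem.text or "").strip()
        if text:
            rows.append((doc_id, here, text))
        for child in elem:
            walk(child, here)

    walk(root, "")
    conn.executemany(
        "INSERT INTO nodes (doc_id, path, value) VALUES (?, ?, ?)", rows
    )


def find_docs(conn, path, value):
    """Return ids of documents with the given text value at the given element path."""
    cur = conn.execute(
        "SELECT DISTINCT doc_id FROM nodes WHERE path = ? AND value = ? ORDER BY doc_id",
        (path, value),
    )
    return [row[0] for row in cur]


# Hypothetical schema: one table for whole documents, one for shredded values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE nodes (doc_id TEXT, path TEXT, value TEXT)")

store_xml(conn, "po-1", "<order><customer>Acme</customer><item>Widget</item></order>")
store_xml(conn, "po-2", "<order><customer>Globex</customer><item>Widget</item></order>")
```

With those two sample documents loaded, find_docs(conn, "/order/customer", "Acme") looks up the matching document with plain SQL. The hierarchy has been flattened into path strings, which is exactly the hack, and for many repositories it may be all that is needed.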
So, resist the glossy allure of the XML database marketing sound bites and ask yourself: Do you really need an XML database or will a database to store XML do instead? For that matter, why not store the stuff on a file system? The last time I looked, both Windows and Unix had very powerful hierarchical storage systems built in. They are called "file systems".
Ask yourself what benefits any database will bring apart from a warm fuzzy feeling. There are use cases where the both-barrels, high-dollar-burn treatment of a native XML database is required, but such applications are not as frequent as you might think and certainly not as frequent as some XML database vendors would like you to believe.
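For what it is worth, the file system option takes very little code. The sketch below uses only Python's standard library; the directory layout, the helper names (save_doc, find_docs) and the sample documents are assumptions made up for the example, and a real repository would also have to think about locking, backups and scale.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def save_doc(root_dir, relative_path, xml_text):
    """Write one XML document into the directory hierarchy, creating folders as needed."""
    target = Path(root_dir) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(xml_text, encoding="utf-8")
    return target


def find_docs(root_dir, element_path, value):
    """Scan every .xml file under root_dir for an element whose text equals value."""
    matches = []
    for xml_file in sorted(Path(root_dir).rglob("*.xml")):
        root = ET.parse(xml_file).getroot()
        # element_path is an ElementTree path relative to the document's root element
        if any((e.text or "").strip() == value for e in root.findall(element_path)):
            matches.append(xml_file.relative_to(root_dir).as_posix())
    return matches


# A throwaway directory stands in for the repository; the folder names are hypothetical.
repo = tempfile.mkdtemp()
save_doc(repo, "orders/east/po-1.xml", "<order><customer>Acme</customer></order>")
save_doc(repo, "orders/west/po-2.xml", "<order><customer>Globex</customer></order>")
```

Calling find_docs(repo, "customer", "Acme") walks the tree and parses each file in turn. That brute-force scan is the obvious cost of this approach, and it is also a fair test of whether your document volumes actually justify anything more elaborate.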
Sean is co-founder and Chief Technology Officer of Propylon and is an industry-recognised XML expert.