Published in IT World
Deleterious guesswork in application integration
By Sean Mc Grath
You create a web page with a mixture of English and Cyrillic characters. You fire it up in your browser and it displays correctly. You do not know exactly why it works and you sure do not have time to figure it out. It works. Move on.
You create a blog entry in a rush. You mess up some of the markup. However, it displays correctly and the aggregators seem to manage to display it okay. You do not know exactly why it works and you sure do not have time to figure it out. It works. Move on.
You create a spreadsheet that contains a graph. One of the cells that should be a dollar amount contains a meaningless string. The graph seems to work regardless. You do not know exactly why it works and you sure do not have time to figure it out. It works. Move on.
Here is the not-so-big news folks:
1) Everybody is waaaay too busy to get everything right all of the time.
2) Software bends over backwards to be helpful, up to and including fixing your mistakes for you. Software consumers, by and large, like software that can do that.
"Fixing your mistakes for you." Let us chew over the gristly bits of that statement for a moment.
On one hand, it sounds great that software can pick up data in less-than-perfect form and still work with it. Software companies often differentiate their offerings based on how well they work in adverse data conditions. As consumers, we love this kind of thing. We love it when software "just works", probably because of all the times we have had to cope with it not working at all. We delight in it, and software vendors feature that delight heavily in their sales strategies.
On the other hand, things get rather complex when you are joining together multiple applications, each of which is silently massaging incorrect data into correct data. The classic scenario where this manifests itself is Enterprise Application Integration (EAI). Given a data problem in an EAI workflow, it becomes very difficult to diagnose its origin when all the disparate applications involved are "helping" you by fixing up bad data.
A mantra often cited in application development is Postel's Law, which states:
"Be liberal in what you accept and conservative in what you send."
Debate about the universal applicability of this law is ongoing. Both the negatives and the positives of the law can be seen in the detritus of the browser wars.
Browser developers bent over backwards to make sense of badly mangled HTML. Browsers would never beep, never admonish the user, never throw a document to the floor, no matter how scatological. As end users, we liked that.
A downside was the guess-work wars that broke out, with browser developers creating bigger and bigger browsers to cater for all the mangled data they would have to interpret. A bigger downside of smart applications was that web data ceased to stand on its own merits. Web data only made sense when viewed through the digital eyes of a particular browser. The errors in the data would then manifest themselves when the source data found its way into a different application in an EAI scenario: for example, a spidering application, a scraping application or a different browser.
Today the debate surrounds XML more than HTML. Is it okay for an application to "help" you by interpreting mangled XML for you rather than throwing the document to the floor? With more and more XML sloshing around EAI data flows, this is obviously a crucial question in E-Business applications.
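To make the contrast concrete, here is a small Python sketch (my illustration, not from the original column) using two standard-library parsers. A conformant XML parser throws the document to the floor at the first sign of mangled markup, while an HTML parser quietly makes what sense of it it can:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Overlapping tags: well-formed HTML-era sloppiness, but illegal XML.
mangled = "<doc><item>overlapping tags</doc></item>"

# The XML parser refuses the document outright.
try:
    ET.fromstring(mangled)
except ET.ParseError as err:
    print("XML parser refused it:", err)

# The HTML parser never beeps, never admonishes: it just reports
# whatever tags it can recognise and moves on.
class TagCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

collector = TagCollector()
collector.feed(mangled)  # no exception raised
print("HTML parser happily saw tags:", collector.tags)
```

The same bytes produce a hard error in one application and a silent best guess in another, which is exactly the diagnostic headache described above when those applications sit in the same EAI data flow.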
Let me paint two scenarios for you.
First scenario: Imagine XML that is part of a web document to be interpreted by a browser. If the XML gets fixed up on the way does it really matter? How is it different from HTML getting fixed up on the way?
Second scenario: The XML is a message format coming from your heart monitor. It is fed into the machine that regulates your blood pressure medication. Is it okay for the XML coming out of the heart monitor to be silently fixed up by the blood pressure medicator?
I think you will agree that scenarios one and two are very different. My belief is that Postel's Law cannot be applied without understanding the context of its application. I am all in favour of smart software but I am also all in favour of knowing exactly what is doing guesswork and where it is doing it.
I would suggest a powerful middle ground: all applications that "help" you by interpreting mangled data should provide an off-switch, so that you can disable the functionality in situations where it would prove deleterious to the health of your application.
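Such an off-switch might look like the following Python sketch. The function name, the `strict` flag and the "repair" step are all illustrative assumptions of mine, not an API from any real product:

```python
import xml.etree.ElementTree as ET

def parse_message(xml_text, strict=True):
    """Parse an XML message.

    strict=True  -> reject malformed input outright (heart-monitor flows).
    strict=False -> attempt a best-effort repair (browser-style leniency).
    """
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        if strict:
            raise  # throw the document to the floor; let the operator see it
        # Illustrative guesswork: escape stray ampersands and retry once.
        repaired = xml_text.replace("&", "&amp;")
        return ET.fromstring(repaired)

# A heart-monitor feed runs with the default strict=True; a casual
# web-page viewer might deliberately flip the switch to strict=False.
reading = parse_message("<reading bpm='72'/>")
print(reading.get("bpm"))

lenient = parse_message("<note>salt & vinegar</note>", strict=False)
print(lenient.text)
```

The point is not the repair heuristic, which is deliberately crude here, but that the guesswork is opt-in and visible at the call site rather than buried silently inside the parser.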