CTO Articles, IT World

Code is data, and data is code

By Sean Mc Grath

I like to give names to the concepts I deal with in IT. As soon as a concept enters my head, the race is on for the correct word or phrase to express it. I don't like it when I cannot give things names. I get grumpy.

Sometimes, I have visions of the ghost of Ludwig Wittgenstein, feeding the birds outside his cottage in Renvyle[1], just south of here. In these visions, he is laughing at my naming problems. He is singing: "It's all just code and data." Yes, it is true. At the moment I'm having trouble deciding what is code and what is data. It happens to me about once a year. It passes without medication if I get some rest and light exercise.

Having words for things in IT is a two-edged sword. It is in the nature of human language that most words have multiple meanings and that the ultimate meaning is determined inside a brain, in private. One person's code can be another person's data. The ghost of Wittgenstein rings in my ears: "It's all just code and data."

Sometimes I think it is a wonder we manage to converse at all in IT. Our field is one with more than its fair share of people I call 'dyadic generalists'. I'm one of them. We dyadic generalists like to do two things. We like to generalize and we like to split things into two opposing camps: RAM/ROM, software/hardware, documents/data, programming language/scripting language, client/server and so on. High on that list of juxtapositions is code/data. At least, I used to think it was a juxtaposition. These days, I'm not so sure.

The evolution of the Web is a nice microcosm of the tension between code and data and will hopefully serve to illustrate what I'm talking about. In the beginning there was HTML: a form of data. Then CGI programs flourished that generated HTML. These programs took the form of code with embedded data. Then HTML browsers developed the ability to execute code in the form of JavaScript.
As a result, HTML became data with embedded code. Then CGI went out of fashion, supplanted by technologies like ASP, JSP and PHP - all examples of data with embedded code. The latest twist in this merry dance for preeminence between code and data is the recent focus on techniques such as Apache Struts[2], Zope's TAL[3] and content management frameworks such as JPublish[4], all striving to cleanly separate the code from the data.

We have been here before, of course, in the war for supremacy between code and data. Ever since user interfaces advanced beyond the patch-panel, we have struggled with how best to separate the melange of code and data that together conspire to construct applications that have user interfaces. Do you remember the heated debates about the merits/demerits of embedding SQL in COBOL? The same debate rages about embedding HTML/XML in Java. Remember the Model, View, Controller techniques pioneered when Smalltalk[5] roamed the land? Echoes of Apache Struts?

It would appear that when it comes to organizing the relationship between code and data, the right answer does not live at the extremities of the spectrum. Code in data has problems - an entire shopping basket application in a single JSP page. Ugh! Data in code has problems - an entire shopping basket application constructed with print statements in Perl. Ugh!

Now, as any self-respecting dyadic generalist will point out, one of two things can happen here. Either we find a middle ground between code and data that everyone can live with, or we fundamentally revisit the problem. I think we need to revisit the problem.

In my mind's eye, I see the ghost of Wittgenstein again, well-fed birds on each shoulder, smiling. He says: "Code and data? It's all just *text*." Ah. Interesting. Perhaps I should have paid more attention to Pólya's intuition[6] that the more general problem may be easier to solve. Let's generalize code and data to be merely specific examples of text. Does that help?
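As an aside, the "data in code" extreme and the separated alternative described above can be sketched in a few lines. This is only an illustration, written in Python rather than Perl or JSP; the basket contents and function names are invented for the example, and the standard library's `string.Template` stands in for a real templating system such as the ones named earlier.

```python
from string import Template

# Hypothetical basket data, invented for this illustration.
BASKET = [("apples", 3), ("pears", 2)]


def render_data_in_code(items):
    """Data in code: the HTML lives inside the program as string literals."""
    out = ["<ul>"]
    for name, qty in items:
        out.append("<li>" + name + ": " + str(qty) + "</li>")
    out.append("</ul>")
    return "\n".join(out)


# Separated: the markup is plain data with named placeholders ...
ROW = Template("<li>$name: $qty</li>")
PAGE = Template("<ul>\n$rows\n</ul>")


def render_separated(items):
    """... and the code only supplies values for those placeholders."""
    rows = "\n".join(ROW.substitute(name=n, qty=q) for n, q in items)
    return PAGE.substitute(rows=rows)


if __name__ == "__main__":
    print(render_separated(BASKET))
```

Both functions produce the same markup; the difference is only in where the HTML lives, which is the whole argument.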
Well, the main reason for separating code and data is to better *manage* each. We live in a world in which, from a development perspective, code is code and data is data. Code lives in source code control systems; data lives in databases or XML documents. East is East and West is West. The twain ideally would not meet until deployment time.

Perhaps this is how we should revisit the problem - by revisiting the very notion of *text* in our computer systems. What if every text editor on the planet were a folding text editor[7] that could seamlessly transclude[8] text from one location into another? With such a capability we could manage code and data separately but, simply by opening up a different 'view' on them, see them as a merged entity consisting of both code and data. Best of both worlds? Maybe I'm for the birds, but I think that just might be worth pursuing.
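To give a rough feel for the merged 'view' idea, here is a minimal sketch in Python. It assumes a lot: the `{{include:NAME}}` marker syntax, the `merged_view` function and the in-memory `STORE` are all hypothetical, invented for this illustration, and do not come from any real editor. Code and data are stored separately, and the merged text exists only when the view is computed.

```python
import re

# Hypothetical marker syntax, invented for this sketch: {{include:NAME}}
MARKER = re.compile(r"\{\{include:([\w.-]+)\}\}")


def merged_view(name, store, _seen=()):
    """Return the text stored under `name` with every marker replaced by
    the (recursively expanded) text it points at. `store` itself is never
    modified, so code and data stay separately managed."""
    if name in _seen:
        raise ValueError("circular transclusion: " + name)

    def expand(match):
        # Expand the referenced text, remembering the path taken so far.
        return merged_view(match.group(1), store, _seen + (name,))

    return MARKER.sub(expand, store[name])


# Code and data kept apart, as they might be in separate systems.
STORE = {
    "greeting.txt": "Hello, world",
    "page.py": 'print("{{include:greeting.txt}}")',
}

if __name__ == "__main__":
    print(merged_view("page.py", STORE))
```

A real folding editor would also have to map edits made in the merged view back to the right source, which this sketch does not attempt.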
[1] http://www.connemara-tourism.org/regions/renvyle.html