Simple Complexity and Complex Simplicity

The Role of SVG and Declarative Presentation
in a Post-Imperative World

By Kurt Cagle

Metaphorical Web Publishing

http://www.metaphoricalWeb.com

 

 

The Move Toward Simple Complexity

Innovation seldom comes at the top of an economic cycle. This may seem paradoxical at first – we’ve just been through one of the most expansive growths in computer technology, and it would seem ludicrous to say that there was no “innovating” going on at the time. However, in general what was happening during the boom years of 1996-2000 was not in fact the innovation of new concepts, but the culmination of efforts and initiatives that actually began when the economy was in the doldrums.

In the late 1980s, the initial network layer of personal computers was created, resulting in a temporary boom in applications as the concept of locally networked nodes began to permeate business. In addition to making it possible to share information (and to a certain extent resources) between applications, the LAN also engendered the rise of desktop publishing, which made it possible to disseminate “processed” information much more efficiently. This market essentially pushed these technologies into business until about 1990, at which point the paradigm had reached a point where other technologies needed to exist for it to grow.

The Internet was of course that technology. The creation of a simple protocol for connecting networks across heterogeneous boundaries (TCP/IP), coupled with a simple protocol for communicating across this network (HTTP) and a simple protocol for creating a common presentation layer across that network (HTML), completely redefined the landscape of society. TCP/IP is simpler than the myriad other network protocols (NetWare IPX, Banyan VINES, etc.) and at the time of its invention was considered too simple to have much use within the business space, which is typically seen by business programmers and consultants as being far more complex than it really is.

Instead, the use of HTTP over TCP/IP exploded. HTML, from being a cobbled-together bit of pseudo-SGML, a language that perhaps a few hundred people were aware of, emerged to become the dominant programming language on the planet. Many traditional programmers derided HTML because it didn’t really do much; rather, it described what was to be displayed, and as such was a descendant of the typesetter’s art more than it was the proper language for building an operating system.

Much of the “innovation” that fueled the rise of the Internet was based more upon business processes that in fact “complexified” (to coin a word, or at least to highly abuse the English language) much of the innate operating system that was hinted at when college students were playing with Mosaic in 1992-3 and heralding a new “community”. The CGI interface proposed by Berners-Lee and others provided a set of standard protocols for creating a simple RPC architecture that could be used in conjunction with the URL addressing scheme associated with HTTP to make computational requests and retrieve content in the form of HTML as well as binary content (images, sounds, etc.).

The importance of this standard was that it essentially abstracted away the characteristics of the other side of the transaction between client and server. The client would define a location on the web, then would either append a query string to this address to pass parameters for manipulation or would transmit a message of text as a series of name/value pairs. This abstraction is not insignificant in its importance – it means that what exists on the other end of that pipe is irrelevant, so long as you know the parameters to use in the invocation and the kind of information you expect to get back.
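
The two halves of that exchange can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name QueryStringSketch, the example.org address, and the parameter names are hypothetical, and the code uses nothing beyond the JDK's own URL encoding helpers (Java 10 or later for the Charset overloads).

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not part of any CGI library.
final class QueryStringSketch {

    // Client side: append name/value pairs to a web address as a query string.
    static String buildUrl(String base, Map<String, String> params) {
        if (params.isEmpty()) {
            return base;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            query.add(encode(p.getKey()) + "=" + encode(p.getValue()));
        }
        return base + "?" + query;
    }

    // Server side: recover the name/value pairs, whatever program sits behind the URL.
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("file", "my file.xml");   // hypothetical parameter names
        params.put("mode", "read");
        String url = buildUrl("http://example.org/cgi-bin/open", params);
        System.out.println(url);
        System.out.println(parseQuery(url.substring(url.indexOf('?') + 1)));
    }
}
```

Neither method knows or cares what language the other end is written in; the contract is only the address, the parameter names, and the kind of content that comes back.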

TCP/IP forced Banyan out of business and Novell to specialize in a new vertical. HTTP quickly became such a force that my three-year-old daughter knows that any place on the web is found by invoking those four magic letters, and my ten-year-old daughter knows that all of the cool games and graphics can be accessed across this network rather than dealing with awkward CD-ROMs that become tiresome after a couple of runs. HTML, as simple a protocol as it is, halted the burgeoning multimedia world in its tracks. Simplicity, commonality, openness: these three factors seem to be decisive in determining whether or not a technology will gain a foothold in the often nasty commercial politics of this industry.

HTML is not as simple now as it was back in 1994. Indeed, a solid reference on HTML can run to more than 1000 pages, especially when customized DOM, proprietary extensions, and “alternative markup” get incorporated into the mix. However, much of this complexity emerged as efforts were made on the part of imperative programmers (and vendors) to find a way of moving beyond “be” and into “do”.

This conflict lies at the crux of programming, and will likely be the foundation upon which struggles for the direction of programming will be made. Declarative programming is, in essence, a description of state. An HTML document, for instance, provides the initial state of a web page. This state is in turn provided to a specialized application (a user agent, a.k.a. a browser), which interprets that document in a standard, familiar form.

I’m going to avoid the word browser from here on out for several reasons, mostly having to do with the connotations of the word. A browser implies static content – we browse newspapers or TV channels, yet our ability to interact with these usually is limited to navigating through the content by changing pages or flipping channels, or by turning the devices on or off. We have relatively little ability to affect the content of the articles or TV shows directly. However, as this paper discusses, we are moving to a point where declarative architectures can in fact create dynamic, controllable environments. Such declarative interactivity means that our user agents become much more active representations, and the passive implications of “browsers” can actually limit for us a proper discussion about the range of possibilities that such technologies may enable.

If declarative programming is associated with the verb “be” then imperative programming is associated with the verb “do”. The word “imperative” of course shares its origins with “emperor” and “imperial”, implying “command”. In imperative programming, actions are accomplished via a set of commands: create this window, display it, and, when this item on the window is clicked, close it. The behavior is explicitly spelled out through these sequences of commands as actions to be undertaken one after another, until the process completes.

Imperative models are consequently task oriented: a particular application is designed such that you define an initial document state, then apply a series of operations to that state that effect a transformation. This type of architecture can be seen within sophisticated applications such as Photoshop. In general, the application itself maintains a base state, a set of actions upon that state (each stroke of a paint brush, for instance), and a current manifestation of that state as a rendered buffer. Most contemporary applications work in this manner; they are non-destructive because at any point within their operation they have an audit trail of all actions accomplished to that point.
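
A minimal sketch of that base-state-plus-actions arrangement, assuming a string stands in for the document and each action is simply a function from one state to the next (the class ActionLogSketch and its method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical illustration of a non-destructive editing model.
final class ActionLogSketch {
    private final String baseState;  // initial document state, never modified
    private final List<UnaryOperator<String>> actions = new ArrayList<>();  // audit trail

    ActionLogSketch(String baseState) {
        this.baseState = baseState;
    }

    // Record an action; the base state itself is left untouched.
    void apply(UnaryOperator<String> action) {
        actions.add(action);
    }

    // Drop the most recent action; returns false if there is nothing to undo.
    boolean undo() {
        if (actions.isEmpty()) {
            return false;
        }
        actions.remove(actions.size() - 1);
        return true;
    }

    // Current manifestation: replay every recorded action over the base state.
    String render() {
        String state = baseState;
        for (UnaryOperator<String> action : actions) {
            state = action.apply(state);
        }
        return state;
    }

    public static void main(String[] args) {
        ActionLogSketch doc = new ActionLogSketch("canvas");
        doc.apply(s -> s + "+stroke1");
        doc.apply(s -> s + "+stroke2");
        System.out.println(doc.render());
        doc.undo();
        System.out.println(doc.render());
    }
}
```

Because the rendered state is always derived from the base state and the trail, undo is just the removal of the last entry. A real application would cache the rendered buffer rather than replaying everything on each call.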

Model/View/Controller Architectures

Consequently, for an application like Photoshop, the actions can readily break down into a model (the initial state of the graphic and a library of Photoshop’s set of actions upon that initial graphic), a rendered view that illustrates the most recent transformation acting upon that graphic, and the controller mechanism that translates the model into the view. This Model/View/Controller architecture is well known in programming circles, though in practice it is not always that easy to implement in imperative languages.

Part of this difficulty arises from the need within an imperative model to build a complex library of procedure calls, either through a linear API or (especially of late) through a framework of classes. These classes often do not have an implicit relationship between them, and so have to be related explicitly. For instance, consider a case where you wish to create a simple windowed application with a menu. Typically, in pseudo-code, this would look something like:

    public class Form {

        public Form() {
            Initialize();
        }

        // Builds the File menu and wires each item to its click handler.
        public void Initialize() {
            this.setTitle("App Processor");
            Menu fileMenu = new Menu("File");
            fileMenu.Add(new MenuItem("Open", MenuFileOpen_Click));
            fileMenu.Add(new MenuItem("Save", MenuFileSave_Click));
            fileMenu.Add(new MenuItem("Print", MenuFilePrint_Click));
            fileMenu.Add(new MenuItem("Exit", MenuFileExit_Click));
            this.pane.Add(fileMenu);
            this.pane.Show();
        }

        public void Update() {
            // Here's the update code
        }

        public void Terminate() {
            // Here's the termination code
        }

        // Each handler below matches the MenuItemHandler signature:
        // it receives the element that raised the event plus any arguments.
        public void MenuFileOpen_Click(Element source, Object[] args) {
            // Here are the actions for opening the file
        }

        public void MenuFileSave_Click(Element source, Object[] args) {
            // Here are the actions for saving the file
        }

        public void MenuFilePrint_Click(Element source, Object[] args) {
            // Here are the actions for printing the file
        }

        public void MenuFileExit_Click(Element source, Object[] args) {
            // Here are the actions for exiting the application
        }

        // Entry point: run the update loop until the form signals the end.
        public static void Main(String[] args) {
            Form app = new Form();
            while (!app.endSignalReceived) {
                app.Update();
            }
            app.Terminate();
        }
    }

 

The structure of the model becomes intertwined with that of the view and the controller fairly early on, even in the simple case where the model is the form application itself. It is possible with some diligence to keep the model distinct from the other aspects of the application, but it is always a forced construction. Looking at the code, there is no obvious sense of containment or organization, though it can be ferreted out if you spend enough time with the structure.

 

There is one last point to consider in examining this code. The binding between the menu and the form here is very strong. In general, most applications create a set of a dozen or so such menus, including pop-up menus and subordinate pieces, that are largely fixed at design time. Additions may exist (for instance, to enumerate plug-ins), but this code is usually handled as another set of methods that are likewise fixed, just to handle the enumeration of one special case.

Here’s an analogous application, this time in XML format:

    <application id="app1">
      <form id="form1">
        <title>App Processor</title>
        <menus>
          <menu id="menuFile" title="_File">
            <menuItem id="menuFileOpen" title="_Open" accessKey="ctl_O"/>
            <menuItem id="menuFileSave" title="_Save" accessKey="ctl_S"/>
            <menuItem id="menuFilePrint" title="_Print" accessKey="ctl_P"/>
            <menuItem id="menuFileExit" title="E_xit" accessKey="ctl_Q"/>
          </menu>
        </menus>
        <openDialog id="openDlg1" filter="*.xml" title="Open File" prompt="Please open a file."/>
        <saveDialog id="saveDlg1" filter="*.xml" title="Save File" prompt="Save your file."/>
        <printDialog id="printDlg1" title="Print" prompt="Print your file."/>
        <dataStore id="dataFile" type="Document"/>
        <messageBox id="messageBox1" position="center"/>
      </form>
      <bindings>
        <binding event="menuFileOpen[@event='click']"
                 action="openDlg1.show()"/>
        <binding event="menuFileSave[@event='click']"
                 action="saveDlg1.show()"/>
        <binding event="menuFilePrint[@event='click']"
                 action="printDlg1.print()"/>
        <binding event="menuFileExit[@event='click'] and dataFile[@dirty='yes']"
                 action="dataFile.save()"/>
        <binding event="menuFileExit[@event='click'] and dataFile[@dirty='no']"
                 action="app1.exit()"/>
        <binding event="openDlg1[@event='fileValid']"
                 action="dataFile.load(openDlg1/@filename)"/>
        <binding event="openDlg1[@error='cancelled']"/>
        <binding event="openDlg1[@error='fileLocked']"
                 action="messageBox1.show('File is locked')"/>
      </bindings>
    </application>

 


This “application” is mostly declarative, though it does have a certain amount of imperative code in it (a point which I’ll address in much greater detail in “Whither Imperative Code?”). There are a number of noteworthy features in this “program”.

First off, the structure of the application is immediately obvious. An application consists of at least one form, a form has a collection of menus, and each menu has a set of contained menu items. Additionally, a form can have a message box, several potential kinds of dialog boxes, and a dataStore (which violates the purely declarative nature of the code somewhat, to be discussed later). Each of these items is named, through its id, and each has attributes (including some attributes that can be defaulted) appropriate for the given component.

In general the elements within the form are existential: they contain no obvious actions. Instead, the application has a separate bindings section, in which defined events on given objects produce demonstrable actions. The convention here may seem a little odd, but it is in fact quite powerful. In this application, I am making an assumption that every object’s current state can be represented as XML. When the controller passes an event on, the object raising the event will include (implicit) event and error attributes on the element itself that can be queried. Thus, the open dialog may be passed as:

    <openDialog href="openDlg1"
                event="fileValid"
                error=""
                filename="myFile.xml"/>

 

The reason for this somewhat inefficient operation is that it increases flexibility. The components can be abstracted as XML entities, queried using a common (perhaps universal) scheme, XPath, for addressing specific entities, and can still maintain a high degree of encapsulation: the XML presentation of the components does not have to represent the true internal state of such components, but only a view of that state suitable for external consumption. Other addressing schemes can be used of course, but XPath is particularly appropriate for XML manipulation.
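
To make the querying step concrete, here is a hedged Java sketch of how a controller might test a binding's event expression with the JDK's standard XPath API. It rests on an assumption the listing above does not spell out: that the controller keeps a small state document in which each component appears as an element named after its id. The class BindingSketch, the state wrapper element, and the method names are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical controller helper; the <state> layout is an assumption.
final class BindingSketch {

    // Parse the externally visible state of the components.
    static Element parseState(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("state is not well-formed XML", e);
        }
    }

    // A binding fires when its event expression is true against the current state.
    static boolean fires(Element state, String eventExpr) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath()
                    .evaluate(eventExpr, state, XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad event expression: " + eventExpr, e);
        }
    }

    // Resolve an argument such as openDlg1/@filename to its string value.
    static String argument(Element state, String expr) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expr, state);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad argument expression: " + expr, e);
        }
    }

    public static void main(String[] args) {
        Element state = parseState(
              "<state>"
            + "<openDlg1 event='fileValid' error='' filename='myFile.xml'/>"
            + "<dataFile dirty='no'/>"
            + "</state>");
        System.out.println(fires(state, "openDlg1[@event='fileValid']"));
        System.out.println(argument(state, "openDlg1/@filename"));
    }
}
```

The controller never touches the dialog object itself, only the XML view of it, which is what keeps the component's real internal state encapsulated.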

Significantly, there’s no explicit implementation here beyond the obvious connotations of the objects. With the XPath abstraction mechanism, the same XML could easily describe a console based application, a Java GUI implementation, a C# implementation, or a VB app, as most languages have the facility for implementing both component interfaces and underlying logic dynamically.

None of this is new, of course, but the implications for languages such as SVG are nonetheless profound, and to a great extent not fully appreciated by the vast majority of developers. For instance, consider the following:

The Explorer paradigm – a tree-view control that provides the ability to navigate through hierarchical data is attached to a separate pane which provides “context” for a node selected in the view, and may even include form content that needs to be saved as part of the model’s state. While XML is hierarchical, simply mapping an XML structure node by node to the tree usually ends up exposing either too much information or the wrong type of information at each node.

However, a specialized XML format, a TreeML presentation language, could be defined that maps fairly closely to the component, in essence providing a way to model the internal state of the control. Once such a language exists, it is a fairly simple process to write an XSLT transformation that will map a particular schema (with an associated namespace) to the TreeML format. This means that, rather than “hard-wiring” the tree view within the application so that it only deals with a certain core type of information, the tree view can be used to represent ANY kind of information, so long as the Source XML -> TreeML XSLT transformation exists.
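
As a rough illustration of such a mapping, the following Java sketch runs an XSLT 1.0 stylesheet through the JDK's built-in transformer. Everything about the vocabularies is assumed for the example: TreeML is imagined here as nothing more than nested node elements with a label attribute, and the library/shelf/book source schema is invented.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch; neither the source schema nor this TreeML shape is a real standard.
final class TreeMlSketch {

    // Maps an invented library schema onto an invented TreeML vocabulary.
    static final String LIBRARY_TO_TREEML =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/library'>"
        + "<tree><xsl:apply-templates select='shelf'/></tree>"
        + "</xsl:template>"
        + "<xsl:template match='shelf'>"
        + "<node label='{@name}'><xsl:apply-templates select='book'/></node>"
        + "</xsl:template>"
        + "<xsl:template match='book'>"
        + "<node label='{@title}'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply any source-to-TreeML stylesheet; the tree control only ever sees TreeML.
    static String toTreeMl(String sourceXml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(sourceXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String source = "<library><shelf name='Fiction'><book title='Dune'/></shelf></library>";
        System.out.println(toTreeMl(source, LIBRARY_TO_TREEML));
    }
}
```

Supporting a new kind of data then means writing another stylesheet, not another tree control.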

Each node in the source tree can consequently be associated with a specific action (or perhaps more appropriately, a specific URI) that contains a link to the relevant application form XML. This form could of course be an HTML page or an SVG document, but it could just as readily be a more sophisticated application document. Regardless, the forms so presented are visible in the right-hand pane. Menus may be added or removed based upon the contents of this right-hand pane, or upon whether the context has shifted back to the left-hand pane.

Put another way, by using XML as your application substrate, you move a significant portion of the core binary development into the creation of a highly flexible agent, rather than as a tightly constrained application. The imperative code still exists, but it gets pushed down the stack, serving more for implementing the controller or providing a generalized framework for the viewer that can then be invoked dynamically from XML.

XML is a means of encapsulating a document object model, and to a significant degree can also incorporate the programmatic rules that define the controller. However, there still needs to be some means of turning component descriptors into live components that can be adapted to appropriate needs.

Namespaces, SVG and Other Low Level Languages

We are in version 2.8 of the Internet Operating System. Version 1.0 was almost completely text based, and was dominated by a whole host of primordial protocols, many of which had emerged from library science and similar domains: Archie, Veronica, Gopher, and so forth. HTTP was in there too, mainly as part of an odd little application called Lynx, which made hypertext applications possible. Such applications, however, were just that: text.

Version 2.0 was the first attempt to turn that core operating system, operating largely via a command prompt, into a graphical system. It is probably not accidental that the graphical web emerged shortly after the emergence of Windows and other GUIs and the rise of multimedia; HTTP/HTML/CGI was effectively the GUI for this emergent web, made of ad hoc pieces and hotly contested standards.

The Version 3.0 web is clearly coming. As sketched out here, this kind of application is ultimately about agents: user agents, web service agents, trust agents, and so forth. A typical browser is actually a host for a number of such agents, most of them core XML (or XML-like) technologies. By moving closer to an XML-based MVC architecture, the core logic of the applications is distributed, while the agents themselves act as the interfaces, using XML languages that they understand.

Namespaces play a central part in this whole process. In the desktop era, namespaces were denoted by dot suffixes (file extensions) and type labels, establishing to which particular applications a given file was conformant. This wasn’t hard to do; there weren’t that many applications at first, and the builders of the operating systems could essentially grab the prime ones themselves.

The HTML era, now drawing to a close, is demarcated by the MIME type. This information, contained within the headers of files sent back to the client from the server, provides a handle on both the intent of the file (application, graphics, sound, etc.) and the specific designation, as applied by a given standards body. It is this last caveat, the standards body, that has complicated things. As more and more information moves into XML, knowing that a document has an XML MIME type is actually pretty worthless. You can indicate that it has a different MIME type corresponding to a specific application, but because MIME types are very closely held and often require major agreement among different vendors for even the simplest designations, MIME designators are becoming increasingly useless in a world where there are potentially millions of distinct XML schemas.

The namespace is the third-generation attempt to solve this problem of document identification. The concept behind a namespace is simple: it is a unique label to be used in identifying a resource. It does not have to point to anything in particular in URL space; its primacy comes from the fact that it is a unique set of characters that can be associated with a specific schema.

Namespaces have a number of advantages over MIME-types. They are not specifically regulated through a single central authority. This means that an individual or company can establish namespaces specifically for local schemas without having to go through the standards bodies to get the namespace approved. It also means that increasingly, namespaces will serve as the interstitial links that identify not only the composition of an object from a specific schema, but its intrinsic functionality as well.

This principle is very definitely at play with Scalable Vector Graphics. Currently, most user agents still rely upon MIME types to determine the functionality of a given XML language, but this can prove problematic. For instance, the MIME type for SVG shifted from image/svg-xml to image/svg+xml during its development process, but certain viewers (such as the Adobe SVG Viewer) still use the older MIME type to designate that a document is an SVG document.

On the other hand, the namespace has been very much standardized since the SVG Recommendation was declared. Of course this leads to an interesting conundrum – the namespace is an attribute on the XML itself, which means that you need to parse the XML first in order to know what to do with it. By the time you know what the XML can do, you have it in a form to manipulate. This is actually quite useful – the DOM manipulation routines are essentially non-executing; they treat the XML generically, and so are not as prone to the kind of virus manipulation that pure executable files are.
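
A hedged sketch of that parse-first, dispatch-second sequence in Java, using the JDK's namespace-aware DOM parser. The SVG, XHTML and XSLT namespace names are the real ones; the agent labels and the class NamespaceDispatchSketch are hypothetical stand-ins for whatever components a user agent would actually host.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical dispatcher; the agent labels are placeholders, not real components.
final class NamespaceDispatchSketch {

    static final Map<String, String> AGENTS = Map.of(
        "http://www.w3.org/2000/svg", "svg-renderer",
        "http://www.w3.org/1999/xhtml", "xhtml-layout",
        "http://www.w3.org/1999/XSL/Transform", "xslt-processor");

    // Parse generically first, then choose an agent from the root element's namespace.
    static String agentFor(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);  // off by default in the JDK parser
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String ns = doc.getDocumentElement().getNamespaceURI();
            // Map.of rejects null keys, so handle namespace-less documents explicitly.
            return ns == null ? "generic-xml" : AGENTS.getOrDefault(ns, "generic-xml");
        } catch (Exception e) {
            throw new IllegalArgumentException("document is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(agentFor("<svg xmlns='http://www.w3.org/2000/svg'/>"));
        System.out.println(agentFor("<note>plain XML</note>"));
    }
}
```

Nothing in the document is executed on the way to this decision; by the time an agent is chosen, the content is already an inert DOM tree.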

SVG is a foundational language for the rendering of interfaces, though it needs to be beefed up a little to do the job properly. It serves the same purpose in the XML world that GDI does in the Windows world or the Java2D API does with Java. In general, while it can be used to create interfaces, such interfaces would be fairly primitive by themselves. However, SVG can be used to render the interface layers, while a layer higher up on the stack such as XForms or our putative application XML above can handle the creation of interfaces for binding, possibly using XSLT or DOM to perform the requisite controller layer.
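
One way to picture that division of labor is a small Java sketch in which the DOM plays the controller role, reading menu items from the application XML and emitting plain SVG rectangles and text. The layout constants, the class SvgMenuSketch, and the treatment of the underscore as an accelerator marker to strip are all assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical renderer: higher-level menu XML in, low-level SVG primitives out.
final class SvgMenuSketch {
    private static final int ROW_HEIGHT = 24;  // arbitrary layout constants
    private static final int WIDTH = 120;

    static String render(String menuXml) {
        NodeList items = parse(menuXml).getElementsByTagName("menuItem");
        StringBuilder svg = new StringBuilder();
        svg.append("<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"").append(WIDTH)
           .append("\" height=\"").append(items.getLength() * ROW_HEIGHT).append("\">");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            // Assumption: "_" only marks the accelerator key, so drop it from the label.
            String label = escape(item.getAttribute("title").replace("_", ""));
            int top = i * ROW_HEIGHT;
            svg.append("<rect x=\"0\" y=\"").append(top)
               .append("\" width=\"").append(WIDTH)
               .append("\" height=\"").append(ROW_HEIGHT)
               .append("\" fill=\"none\" stroke=\"black\"/>");
            svg.append("<text x=\"8\" y=\"").append(top + 16).append("\">")
               .append(label).append("</text>");
        }
        return svg.append("</svg>").toString();
    }

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("menu is not well-formed XML", e);
        }
    }

    // Minimal escaping for text content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(render(
              "<menu id='menuFile' title='_File'>"
            + "<menuItem id='menuFileOpen' title='_Open'/>"
            + "<menuItem id='menuFileExit' title='E_xit'/>"
            + "</menu>"));
    }
}
```

The SVG that comes out knows nothing about menus; all of the interface semantics stay in the higher-level XML, which is the layering the paragraph above describes.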

Other low-level languages perform similar tasks, integrating other pieces of this API together. SMIL acts as a controlling mechanism for multimedia and interactivity, and is of course integrated into SVG. The XML Events protocol defines a standard methodology for passing events. XSLT plays a major role in mapping between data description namespaces and customized device control XML languages, and can also serve as a mechanism for performing routing and similar operations. XLink and correlative technologies such as RDF and RSS provide means of mapping these interfaces together.

Finally, XHTML is rapidly shifting from being a document description language to becoming a matrix substrate, a language in which other namespaces are embedded directly or indirectly. In this view XHTML loses most of its early abstract basis (i.e., losing the heading tags, the cite tag, even the emphasis tag) and instead becomes focused on integrating other namespaces in a modular fashion. For this reason, XHTML will not be going away any time soon.

Whither Imperative Code?

Given the current emphasis on web services, Java and .NET, it seems rather foolhardy to say that these technologies are heading toward a less important position than they hold now, but this is precisely what I see happening. XML is simpler than Java or C#. This may sound like it’s comparing apples and oranges, but that’s not really true.

XML has only one document object model, whereas Java and C# both have hundreds if not thousands; this means that someone working with the XML DOM can spend less time building object models and more time establishing functionality. XML can also spoof a binary set of objects, through Post-Schema-Validation Infosets, whereas it takes a lot of work for a set of Java classes to spoof XML.

XML can describe process, can in fact be compiled to produce the same kind of output that you see in imperative code in most cases – sometimes this is simpler, sometimes not, but because you are abstracting the model with XML, much of the hard work of accessing that model becomes far simpler.

XML is a modular technology; it is very easy to create namespaces that perform much of what might take dozens if not hundreds of classes to replicate. It has been my experience (though this is largely anecdotal) that as you work with most XML process schemas you will discover that things you had assumed would need to be done via procedural code can be done using intelligent XML design, and because these patterns are recurrent it is usually fairly simple to automate even this production. Moreover, this modularity means that you can have complex structures and yet still keep the actual processing down by decomposing the XML into smaller XML forms.

The distributed nature of XML additionally means that an application can in fact extend fractal-like into the Internet. This will also have a tendency to push imperative code usage down, so that it is used primarily in areas where high performance is more critical than flexibility. XML also lends itself well to parallel processing; each processor could in essence act as an XML DOM or an XSLT transformation or a validator, simplifying the overall programming tasks.

Against this backdrop imperative languages will be pushed down the stack or assigned roles that augment the XML, rather than the other way around. Innovations such as Microsoft’s .NET and certain of the J2EE initiatives (such as JavaServer Faces), pushing into other venues via open source equivalents, will augment this: as these technologies move more into plumbing, the danger of framework explosion diminishes considerably, making it easier to claim true platform independence.

This all means that for XML languages, especially languages such as SVG, it is likely that once core development of the language stops (around 1.2/2.0) most of the rest of the work will end up moving into modular implementations with additional namespaces providing the support to encapsulate and simplify the access into the SVG rendering engine.