Applying sXBL to display Chemical Markup Language (CML)

For the SVG Open 2005 conference

Keywords: sXBL, Chemical Markup Language, CML, XML Binding, XUL, RDF, RSS, Compound Document

Johanne Jean-Baptiste
Software Developer
Ottawa
Ontario
Canada
johanne@svgscience.com

Biography

Johanne Jean-Baptiste studied health sciences (biochemistry) and worked for several years in the pharmaceutical industry. She has since broadened her knowledge base by pursuing studies in information systems and is currently a computer scientist at Statistics Canada. She is interested in the application of XML technologies in science.


Abstract


The draft specification for SVG's XML Binding Language (sXBL) promises to improve the separation of concerns when rendering a variety of markup languages as Scalable Vector Graphics (SVG). Extensible Markup Language (XML) vocabularies specific to an industry or field are becoming increasingly common. XML instances are routinely displayed in a visual format for end-user consumption, and viewing an instance, or a portion of it, in graphical form can be a desired feature or a requirement. This paper investigates the application of sXBL to the display of a scientific markup language for chemical data, the Chemical Markup Language (CML). CML is used to describe relationships between atoms and the structure of chemical molecules. Examples already exist of the transformation of CML to SVG and of the use of Mozilla's Extensible Binding Language (XBL) to display CML as SVG. A Resource Description Framework (RDF) format for CML has been proposed in order to better share chemistry information over the Internet as part of the Science Semantic Web effort. An XML User Interface Language (XUL) application was developed to build on the current XML technology building blocks: CML, SVG, RDF and XUL. The XUL CML Viewer displays metadata from a CML RDF file and the graphical representation of the molecules using sXBL. The aim of this research is to provide examples of applying sXBL to render XML vocabularies and to gather information on its applicability, pros and cons for CML.


Table of Contents


1. Chemical Markup Language (CML)
     1.1 CML schema
     1.2 CML and SVG Compound Document
2. SVG XML Binding Language (sXBL)
     2.1 sXBL interfaces and Binding class
3. CML Binding class
     3.1 CML sXBL Binding Definitions
          3.1.1 sXBL Definitions for cml:cml and cml:molecule
          3.1.2 sXBL Definitions for cml:atom
     3.2 Creation of the shadow tree using Ecmascript
          3.2.1 Ecmascript CML Binding Class
          3.2.2 xbl:content handling
4. Examples of CML displayed using sXBL
     4.1 Caffeine molecule
     4.2 Diazepam molecule
5. XML User Interface Language (XUL) CML Viewer
     5.1 CML, RSS and XUL
     5.2 Screenshot of XUL CML Viewer
6. Conclusion
Bibliography

1. Chemical Markup Language (CML)

Chemical Markup Language (CML), which has been in development since 2000, is used to describe chemical molecules, reactions, spectral data and other chemical information. For this paper, CMLCore [CML specs], which mainly describes atoms and bonds, was used. The latest version of CMLCore is 2.1.1. Since CML's inception, several tools have been developed to display a molecule expressed in the CML vocabulary as graphical atoms and bonds. Most of these are written in Java and allow 2D and 3D visualization of CML documents. Java applets have also been developed to render CML on the client side in a browser [CML wiki], [JMol applet], [Marvin viewer].

1.1 CML schema

The root element of a CML instance is cml. A major element of the CML schema is the molecule element presented below. It contains the molecule's properties (name, molecular weight, ...) together with its atoms and bonds.

molecule-diagram.png

Graph representation of the main element molecule (generated by Eclipse Web Tooling Platform plugins).

Figure 1: Schema graph of molecule element

1.2 CML and SVG Compound Document

Below is a CML document within an SVG one. In this example, the CML is placed within an SVG document for sXBL binding as described later. The binding definitions are in a separate file.

    <svg xmlns="http://www.w3.org/2000/svg" 
        xmlns:cml="http://www.xml-cml.org/schema/cml2/core" 
        xmlns:xbl="http://www.w3.org/2004/xbl" 
        onload="Initialize(evt)" height="600px" width="600px">
        
    <!-- skipped lines -->
    
    <cml:cml>
        <cml:molecule id="caffeine" title="caffeine">
            <scalar title="molecule weight">194.19</scalar>
            <scalar title="melting point" units="degC">238</scalar>
            <scalar title="specific gravity">1.23</scalar>
            <name convention="CAS">58-08-2</name>
            <!-- skipped lines -->
            <name>1,3,7-Trimethylxanthine</name>
            <name>3,7-dihydro-1,3,7-trimethyl-1H-Purine-2,6-dione</name>
            <cml:atomArray>
                <cml:atom id="a_1" elementType="C" x3="-2.8709" y3="-1.0499" z3="0.1718"/>
                <cml:atom id="a_2" elementType="N" x3="-2.9099" y3="0.2747"  z3="0.1062"/>
                <cml:atom id="a_3" elementType="C" x3="-1.8026" y3="0.9662"  z3="-0.1184"/>
                <!-- skipped lines -->
                <cml:atom id="a_11" elementType="O" x3="-1.8349" y3="2.1699" z3="-0.2205"/>
                <!-- skipped lines -->
                <cml:atom id="a_15" elementType="H" x3="2.3776" y3="-0.4481" z3="-0.6036"/>
            </cml:atomArray>
            <cml:bondArray>
                <cml:bond id="b1" atomRefs2="a_1 a_2" order="1" convention="MDL"/>
                <cml:bond id="b2" atomRefs2="a_1 a_6" order="1" convention="MDL"/>
                <cml:bond id="b3" atomRefs2="a_1 a_13" order="2"  convention="MDL"/>
                <!-- skipped lines -->
            </cml:bondArray>
        </cml:molecule>
    </cml:cml>
    
    <!-- skipped lines -->
    </svg>                                    
                

Example 1: SVG and CML compound document for the caffeine molecule
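
In the example above, each cml:bond refers to its two atoms through the atomRefs2 attribute, so a renderer has to resolve those ids against the atomArray before it can draw a line. The following is a minimal sketch of that lookup, not code from the paper: plain objects stand in for the DOM nodes, and the function name resolveBond and the object shapes are assumptions made for illustration.

```javascript
// Hypothetical helper: resolve a bond's atomRefs2 value to its two atoms.
// Atoms are plain objects here; a viewer would read them from cml:atom nodes.
function resolveBond(atomRefs2, atomsById) {
  const ids = atomRefs2.trim().split(/\s+/);
  if (ids.length !== 2) {
    throw new Error("atomRefs2 must name exactly two atoms: " + atomRefs2);
  }
  const [from, to] = ids.map((id) => atomsById[id]);
  if (!from || !to) {
    throw new Error("Unknown atom id in atomRefs2: " + atomRefs2);
  }
  return { from, to };
}

// Two atoms and one bond taken from Example 1.
const atoms = {
  a_1: { elementType: "C", x3: -2.8709, y3: -1.0499 },
  a_2: { elementType: "N", x3: -2.9099, y3: 0.2747 },
};
const bond = resolveBond("a_1 a_2", atoms);
console.log(bond.from.elementType + "-" + bond.to.elementType);
```

Failing loudly on an unknown id is a design choice of this sketch; a viewer could equally skip such bonds.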

2. SVG XML Binding Language (sXBL)

The sXBL vocabulary is close to the XML Binding Language (XBL) developed by Mozilla. The Mozilla browser comes with binding classes that are attached through the Cascading Style Sheets (CSS) of the document to be bound. In sXBL, however, binding is not specified to go through CSS. Because there are currently no sXBL-enabled browsers, the sXBL binding for this paper was implemented in ECMAScript.

The Binding class was implemented according to the following class diagram, which is derived from the interfaces of the sXBL specification.

2.1 sXBL interfaces and Binding class

ClassesXBL.png

sXBL binding class and its interfaces.

Figure 2: Bindings class diagram

3. CML Binding class

For CML binding, each element below the root cml element is wrapped in an instance of the Binding class. The xblParent node for the molecule element is the svg node within which the CML document fragment is defined. For the subsequent elements (i.e. atomArray, atom), the parent corresponds to the element directly above them in the CML tree.

ClassesCML.png

Binding class and CML elements

Figure 3: Relationship between Binding class and CML elements

3.1 CML sXBL Binding Definitions

The binding definitions are placed in a separate file and imported using <xbl:import bindings="cml-xbl.svg"/>. In the script file, the getURL() and parseXML() functions of the Adobe SVG Viewer were used to load the definition file and parse its definition nodes in memory.
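
The loading step can be sketched as follows. In the Adobe SVG Viewer, getURL(url, callback) fetches a resource asynchronously and passes the callback a status object with success and content properties, and parseXML(text, document) turns the text into a document fragment. To let the sketch run outside the viewer, both functions are passed in through a hypothetical env parameter; loadDefinitions and env are illustrative names, not part of the paper's script.

```javascript
// Sketch: fetch the binding file and parse it into nodes held in memory.
// env.getURL and env.parseXML stand for the Adobe SVG Viewer globals.
function loadDefinitions(url, doc, env, onLoaded) {
  env.getURL(url, function (status) {
    if (!status.success) {
      onLoaded(null); // the file could not be retrieved
      return;
    }
    onLoaded(env.parseXML(status.content, doc));
  });
}

// Stand-in environment so the sketch can be exercised without a viewer.
const fakeEnv = {
  getURL: (url, callback) =>
    callback({ success: url === "cml-xbl.svg", content: "<xbl:xbl/>" }),
  parseXML: (text, doc) => ({ source: text, ownerDocument: doc }),
};

let loaded = null;
loadDefinitions("cml-xbl.svg", {}, fakeEnv, (fragment) => {
  loaded = fragment;
});
console.log(loaded.source);
```

Inside the viewer, env would simply be an object exposing the real getURL and parseXML functions.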

3.1.1 sXBL Definitions for cml:cml and cml:molecule

The cml:cml definition creates a new viewport and binds all of its child elements. The cml:molecule definition uses the includes attribute to bind its children selectively at different locations in the shadow tree. A simple XML Path Language (XPath) expression was used for the includes value. The molecule definition also includes a handler that adds the name of the molecule to the shadow tree.

        
        <svg xmlns:xbl="http://www.w3.org/2004/xbl"
             xmlns="http://www.w3.org/2000/svg"
             xmlns:cml="http://www.xml-cml.org/schema/cml2/core"
        >
        <xbl:xbl>
        
          <xbl:definition id="cml" element="cml:cml">
            <xbl:template>
                <svg><xbl:content/></svg>		
            </xbl:template>
          </xbl:definition>
        
          <xbl:definition id="molecule" element="cml:molecule">
            <xbl:template>
                <g transform="translate(10,10)">
                    <text>read Molecule</text>
                    <xbl:content includes="date" />          
                </g>
               <xbl:content includes="atomArray|atom|bondArray|bond" />
            </xbl:template>
            <xbl:handlerGroup>
                <handler>
                    var newTextNode = null
                    var newTextContent = ''
                    var crtTxtNode = this.xblShadowTree.getParentNode().getElementsByTagName('text').item(0)
                    newTextContent = 'Molecule is ' + this.xblBoundElement.getAttributeNS('','id')
                    
                    if ( this.xblBoundElement.getAttributeNS('','convention') != '')
                        newTextContent += ' - Convention is ' + this.xblBoundElement.getAttributeNS('','convention')
                    
                    newTextNode = this.SVGDoc.createTextNode(newTextContent) 
                    crtTxtNode.replaceChild(newTextNode, crtTxtNode.getFirstChild())
                </handler>
            </xbl:handlerGroup>
          </xbl:definition>
         
         

3.1.2 sXBL Definitions for cml:atom

The cml:atom definition contains a somewhat more involved handler. It creates a circle whose size and color depend on the type of chemical element.

      <xbl:definition id="atom" element="cml:atom">
        <xbl:template>
            <circle cx="" cy="" r="" />
        </xbl:template>
        <xbl:handlerGroup>
            <handler>
                /* skipped lines */
                if (this.xblBoundElement.getAttributeNS('','x3') != '') {
                    cx = this.xblBoundElement.getAttributeNS('','x3')*coordFactor
                    cy = this.xblBoundElement.getAttributeNS('','y3')*coordFactor
                } 
    
                if (cx!=null &amp;&amp; cy!=null) {            
                    /* skipped lines */
                    c = this.xblShadowTree.getParentNode().getElementsByTagName('circle').item(0)
                    switch(this.xblBoundElement.getAttributeNS('','elementType')) {
                      case 'C':
                        atomColor = 'darkslategrey'
                        rad = 6
                      break
                      case 'O': 
                        atomColor = 'cornflowerblue'
                        rad = 8
                      break
                      /* skipped lines */
                    }
        
                    c.setAttributeNS('','style', 'stroke:black;stroke-width:0.25;stroke-opacity:1.0;fill:' + atomColor)
                    c.setAttributeNS('','r', rad)
                    /* skipped lines */
                }
            </handler>
        </xbl:handlerGroup>
      </xbl:definition>
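
The per-element styling in the handler amounts to a lookup from elementType to a fill color and a radius, followed by a scaling of the x3 and y3 coordinates. A minimal sketch of that logic, separated from the DOM calls, is given below. Only the entries for C and O come from the handler above; the fallback style, the function names and the value passed for coordFactor are assumptions for illustration.

```javascript
// Styles listed in the handler above; the other elements are elided there.
const ATOM_STYLES = {
  C: { fill: "darkslategrey", r: 6 },
  O: { fill: "cornflowerblue", r: 8 },
};

// Assumed fallback for element types without an entry (not from the paper).
const DEFAULT_STYLE = { fill: "lightgrey", r: 5 };

// Returns the circle attributes for one atom, or null if it has no coordinates.
function circleForAtom(atom, coordFactor) {
  if (atom.x3 === undefined || atom.y3 === undefined) {
    return null;
  }
  const style = ATOM_STYLES[atom.elementType] || DEFAULT_STYLE;
  return {
    cx: atom.x3 * coordFactor,
    cy: atom.y3 * coordFactor,
    r: style.r,
    style:
      "stroke:black;stroke-width:0.25;stroke-opacity:1.0;fill:" + style.fill,
  };
}

// Atom a_11 from Example 1, scaled by an arbitrary factor of 10.
console.log(circleForAtom({ elementType: "O", x3: -1.8349, y3: 2.1699 }, 10));
```

Keeping the styles in a table rather than a switch statement makes it easier to add element types without touching the drawing code.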
         
         

3.2 Creation of the shadow tree using Ecmascript

The CML-namespaced elements in the SVG file are displayed by inserting a new SVG-namespaced tree (the shadow tree) into the SVG file. The shadow tree is created by generating a new SVG element node for each CML element that maps to an XBL definition. The ECMAScript performs the following steps to render an SVG file containing a CML tree:

  1. import the file with xbl:definitions
  2. for each cml element
    1. find the xbl:definition
    2. create the shadow tree and apply handler scripts
    3. bind the children elements specified in xbl:content
  3. display the shadow tree within the SVG document
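
The steps above can be sketched as a recursive function. This is a simplified illustration, not the paper's script: plain objects stand in for DOM nodes, the definitions are given as an in-memory table rather than imported from a file, and names such as buildShadowTree, template, includes and handler are assumptions.

```javascript
// Definitions keyed by element name; each names the SVG element to create.
const definitions = {
  molecule: { template: "g", includes: ["atomArray"] },
  atomArray: { template: "g" },
  atom: {
    template: "circle",
    handler: (bound, shadow) => {
      shadow.attrs.id = bound.attrs.id;
    },
  },
};

function buildShadowTree(bound, defs) {
  const def = defs[bound.name]; // step 2.1: find the definition
  if (!def) return null; // unbound elements produce no shadow content
  const shadow = { name: def.template, attrs: {}, children: [] }; // step 2.2
  if (def.handler) def.handler(bound, shadow);
  for (const child of bound.children || []) { // step 2.3
    if (def.includes && !def.includes.includes(child.name)) continue;
    const childShadow = buildShadowTree(child, defs);
    if (childShadow) shadow.children.push(childShadow);
  }
  return shadow;
}

const molecule = {
  name: "molecule",
  attrs: { id: "caffeine" },
  children: [
    { name: "name", attrs: {}, children: [] },
    {
      name: "atomArray",
      attrs: {},
      children: [{ name: "atom", attrs: { id: "a_1" }, children: [] }],
    },
  ],
};

// Step 3 would append this tree to the SVG document.
console.log(JSON.stringify(buildShadowTree(molecule, definitions)));
```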

3.2.1 Ecmascript CML Binding Class

The Binding class, whose constructor is shown below, is instantiated for each CML element that will be part of the shadow tree.

    
    function Binding (cmlNode, parentNode) {
    
        this.xblBoundElement = cmlNode
        this.SVGDoc = SVGDocument
        this.setParentNode(parentNode)
        this.setDefinitions()
    }
     
    

3.2.2 xbl:content handling

For the xbl:content element, a simple XPath expression was used. XPath was chosen because it is becoming ubiquitous across XML manipulation languages. It has the potential to offer more sophisticated selection than CSS, partly because XPath implementations are more common than CSS3 ones. Also, depending on the markup language to be bound, sXBL authors may need more complex selectors such as those in XPath 1.0 and, soon, XPath 2.0. New markup languages are constantly emerging and evolving; if sXBL is to be generalized as XBL, it would be best not to limit the selector options, so that the requirements of new markup languages can be addressed.

The code used to bind all child elements (an xbl:content with an empty includes attribute) is shown below. Once a child's shadow tree has been created, a clone of it is appended to the parent node.


    var child = null
            
    if (xblContent.getAttributeNS('','includes').length == 0) {

        child = new Binding(this.xblBoundElement.getChildNodes().item(y), this.xblShadowTree)
        
        if (this.xblShadowTree != null) {                    
          if (child.getShadowTree() != null) {
                boundParent.appendChild(child.getShadowTree().cloneNode(true))	
          }
        }            
    }      
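
The excerpt covers the case where includes is empty. When includes holds a value such as atomArray|atom|bondArray|bond, each child has to be tested against it first. A minimal sketch of that test follows; it treats the value as a union of element names, which is only the simplest form of XPath expression and an assumption about how the paper's script interprets it. The function name matchesIncludes is hypothetical.

```javascript
// Returns true when a child element should be bound at this xbl:content.
// An empty includes value selects every child, as in the excerpt above.
function matchesIncludes(includes, localName) {
  if (includes.length === 0) {
    return true;
  }
  return includes
    .split("|")
    .map((name) => name.trim())
    .includes(localName);
}

const childNames = ["name", "atomArray", "bondArray"];
const selected = childNames.filter((name) =>
  matchesIncludes("atomArray|atom|bondArray|bond", name)
);
console.log(selected.join(","));
```

A full XPath evaluator would be needed for anything beyond a union of names, such as predicates or paths with several steps.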
    

4. Examples of CML displayed using sXBL

The examples below show the display of the CML molecule embedded in SVG documents on which sXBL bindings have been applied.

4.1 Caffeine molecule

caffeine_show.png

Figure 4: Caffeine CML SVG rendering with sXBL

4.2 Diazepam molecule

diazepam_show.png

Figure 5: Diazepam CML SVG rendering with sXBL

5. XML User Interface Language (XUL) CML Viewer

5.1 CML, RSS and XUL

A Resource Description Framework (RDF) format for CML has been proposed in order to better share chemistry information over the Internet as part of the Science Semantic Web effort [The Chemical Semantic Web].

The user interfaces of the Mozilla Foundation's applications, notably Firefox, are designed using an XML vocabulary called XUL. In XUL, dynamic data sources are read from RDF-formatted information. In addition, Firefox can be extended with lightweight plug-ins developed in XUL. Therefore, an XML User Interface Language (XUL) application was developed to build on the current XML technology building blocks: CML, SVG, RDF and XUL. The XUL CML Viewer displays metadata from a CML RDF file and the graphical representation of the molecules using sXBL.

5.2 Screenshot of XUL CML Viewer

screenshot75p.jpg

Figure 6: Riboflavin molecule in CML Viewer

6. Conclusion

Compared to XSLT and compiled language implementations, the advantages of using sXBL to render CML with ECMAScript are:

The only downside could be the decrease in performance if large definition nodes and their shadow trees have to be held in memory. For CML more specifically, the drawback is that SVG does not allow for 3D rendering. This is a must-have feature for viewing large molecules such as proteins and DNA.

Bibliography

[CML specs]
Chemical Markup Language (CMLCore schema) 2003-07-10
[sXBL specs]
SVG's XML Binding Language (sXBL) W3C Working Draft 05 April 2005
[CML SVG]
Chemical Rendering using SVG (Scalable Vector Graphics) and CML (Chemical Markup Language) 2004-11-29
[CML wiki]
Chemical Markup Language Wiki. Has a link to the Jumbo application, a molecular viewer.
[SVG specs]
W3C Scalable Vector Graphics (SVG)
[Croczilla]
Croczilla. Mozilla XBL and CML examples.
[JMol applet]
JMol applet. Used to view molecules.
[CML files]
Chimeral site
[Marvin viewer]
Chemaxon's Marvin molecular viewer
[Open Babel]
Open Babel converter of many different chemical molecular formats including CML
[The Chemical Semantic Web]
The Chemical Semantic Web: The future of Science Communication and Publishing 2005-02-03
