Accelerating SVG Transformations with Pipelines

Generating SVG by transforming XML data streams is an important use case for SVG, but the power and flexibility of XSLT transformations can come with a significant performance cost. Pipelines can significantly improve the performance of these transformations.

This paper examines the use of XML event pipeline patterns (Pipelines) to transform XML data streams into SVG, how these patterns relate to XSLT transformations in design and performance, and how new technologies like STX (Streaming Transformation for XML) fit in. Particular attention is paid to processing GML, an XML format for geographic data, and examples are provided in Java and .NET.

What Are XML Event Pipeline Patterns (Pipelines)?

Pipelines allow XML to pass through multiple transformations without incurring the memory or performance cost of generating intermediate XML documents. Pipelines treat XML documents as a series of events that pass through a single or branching sequence of filters. Pipelines can, and often do, include XSLT transforms. However, this paper discusses pipelines that perform predominantly forward-only operations and do not use XSLT.

An example of where a pipeline can be effective is a website that creates SVG maps from a GML generator. Instead of generating an XML document, the GML component is set to produce SAX events. The GML SAX events are then sent to one or more filters that apply style rules and convert the GML SAX events into SVG SAX events. Finally, the resulting SVG SAX events are written directly onto the HTTP output stream as SVG data. Such an approach can be significantly faster than XSLT while still supporting the flexibility of intermediate XML formats.
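The flow described above can be sketched in Java with the SAX and JAXP classes that ship with the JDK. This is a minimal illustration rather than a reference implementation: the class names GmlToSvgPipeline and GmlToSvgFilter, the "feature" CSS class, and the heavily simplified GML input (one line string per coordinates element) are all assumptions made for the example. An in-memory writer stands in for the HTTP output stream.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class GmlToSvgPipeline {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    /** Hypothetical filter: consumes simplified GML events and emits SVG events. */
    static class GmlToSvgFilter extends XMLFilterImpl {
        private StringBuilder coords; // non-null only while inside a coordinates element

        // Drop the incoming GML prefix mappings; the SVG mapping is declared explicitly below.
        @Override public void startPrefixMapping(String prefix, String uri) { }
        @Override public void endPrefixMapping(String prefix) { }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (local.equals("FeatureCollection")) {
                super.startPrefixMapping("", SVG_NS);
                super.startElement(SVG_NS, "svg", "svg", new AttributesImpl());
            } else if (local.equals("coordinates")) {
                coords = new StringBuilder();
            }
            // Every other GML element is dropped.
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (coords != null) coords.append(ch, start, length); // keep coordinate text only
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (local.equals("coordinates")) {
                AttributesImpl atts = new AttributesImpl();
                atts.addAttribute("", "points", "points", "CDATA", coords.toString().trim());
                atts.addAttribute("", "class", "class", "CDATA", "feature"); // style rule (assumed)
                super.startElement(SVG_NS, "polyline", "polyline", atts);
                super.endElement(SVG_NS, "polyline", "polyline");
                coords = null;
            } else if (local.equals("FeatureCollection")) {
                super.endElement(SVG_NS, "svg", "svg");
                super.endPrefixMapping("");
            }
        }
    }

    /** Parses GML text, filters it, and serializes the resulting SVG events. */
    static String transform(String gml) throws Exception {
        SAXParserFactory parsers = SAXParserFactory.newInstance();
        parsers.setNamespaceAware(true);
        XMLReader reader = parsers.newSAXParser().getXMLReader();

        GmlToSvgFilter filter = new GmlToSvgFilter();
        filter.setParent(reader);

        // An identity TransformerHandler acts as a plain SAX-to-text serializer.
        SAXTransformerFactory tf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler serializer = tf.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter(); // stand-in for the HTTP output stream
        serializer.setResult(new StreamResult(out));

        filter.setContentHandler(serializer);
        filter.parse(new InputSource(new StringReader(gml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String gml = "<gml:FeatureCollection xmlns:gml=\"http://www.opengis.net/gml\">"
                + "<gml:featureMember><gml:LineString>"
                + "<gml:coordinates>0,0 10,10</gml:coordinates>"
                + "</gml:LineString></gml:featureMember></gml:FeatureCollection>";
        System.out.println(transform(gml));
    }
}
```

Additional filters, for example one that applies style rules before the conversion step, could be chained by calling setParent on each filter in turn. In a servlet, the StreamResult would wrap the response's output stream instead of a StringWriter.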

When to Use Pipelines

XSLT is a powerful and highly flexible tool for transforming XML data into SVG. However, because of the performance cost and some aspects of XSLT, pipelines should be considered when one or more of the following is true:

However, some SVG transformations can be difficult to implement using pipelines, especially transformations that incorporate other data streams or require multiple references within the source XML data. Also, pipelines do not have the portability and platform independence of XSLT.

Design Considerations

The forward-only nature of pipelines impacts the design of both the input XML and the output SVG. For example, dynamically setting an SVG element's "width" attribute based on the combined widths of child elements is difficult with pipelines.

Taking advantage of optional elements in the input data can help. For example, ensuring that GML's <boundedBy> element is the first child of a feature collection can simplify sizing SVG content such as borders to subsequent child elements.
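As a rough illustration of this point, the following Java sketch assumes a simplified GML 2 style input in which <boundedBy> appears first and holds a Box with "minX,minY maxX,maxY" coordinates. Because the extent is known before any feature arrives, the handler can emit the root <svg> element and a border in a single forward pass. The class name BoundedBySizing and the "border" CSS class are hypothetical, the output is assembled as a string only for brevity (a real pipeline would forward SAX events), and the difference in y-axis direction between GML and SVG is ignored.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class BoundedBySizing extends DefaultHandler {
    private final StringBuilder svg = new StringBuilder();  // output, built as text for brevity
    private final StringBuilder text = new StringBuilder(); // character data of current element
    private boolean inBoundedBy;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (local.equals("boundedBy")) inBoundedBy = true;
        if (local.equals("coordinates")) text.setLength(0); // start collecting coordinate text
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (local.equals("boundedBy")) {
            inBoundedBy = false;
        } else if (local.equals("coordinates") && inBoundedBy) {
            // Extent is known up front, so the root element can be sized immediately.
            String[] p = text.toString().trim().split("[,\\s]+");
            double minX = Double.parseDouble(p[0]);
            double minY = Double.parseDouble(p[1]);
            double width = Double.parseDouble(p[2]) - minX;
            double height = Double.parseDouble(p[3]) - minY;
            svg.append("<svg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"")
               .append(minX).append(' ').append(minY).append(' ')
               .append(width).append(' ').append(height).append("\">");
            svg.append("<rect class=\"border\" x=\"").append(minX)
               .append("\" y=\"").append(minY)
               .append("\" width=\"").append(width)
               .append("\" height=\"").append(height).append("\"/>");
        } else if (local.equals("coordinates")) {
            // Features stream through after the root element has been written.
            svg.append("<polyline points=\"").append(text.toString().trim()).append("\"/>");
        } else if (local.equals("FeatureCollection")) {
            svg.append("</svg>");
        }
    }

    static String toSvg(String gml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        BoundedBySizing handler = new BoundedBySizing();
        factory.newSAXParser().parse(new InputSource(new StringReader(gml)), handler);
        return handler.svg.toString();
    }

    public static void main(String[] args) throws Exception {
        String gml = "<gml:FeatureCollection xmlns:gml=\"http://www.opengis.net/gml\">"
                + "<gml:boundedBy><gml:Box><gml:coordinates>0,0 100,50</gml:coordinates>"
                + "</gml:Box></gml:boundedBy>"
                + "<gml:featureMember><gml:LineString>"
                + "<gml:coordinates>10,10 90,40</gml:coordinates>"
                + "</gml:LineString></gml:featureMember></gml:FeatureCollection>";
        System.out.println(toSvg(gml));
    }
}
```

If <boundedBy> arrived after the features instead, the handler would have to buffer every feature before it could write the root element, which defeats the purpose of a streaming pipeline.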

Some SVG design practices can also make pipelines much easier to develop and maintain, such as placing style information in high-level <g> tags or CSS styles. SVG scripts triggered by onload events can also be a useful workaround.

Newer technologies, like STX, can further assist in SVG pipeline development. STX provides a combination of the speed of pipelines and the expressiveness of XSLT, simplifying the implementation of style rules.

Adapting SVG Systems for Dual Use (XSLT and Pipelines)

Any component that produces XML data can use either XSLT or pipelines to produce SVG. However, how the XML data is produced will affect the performance of a pipeline. If the component produces a DOM object, the performance benefit of a pipeline may be small, since the cost of building a DOM has already been incurred.

The best combination of flexibility and performance can be achieved by using an XML serial writer (such as a Java XMLWriter that implements SAX's ContentHandler interface, or .NET's System.Xml.XmlWriter). These tools can be adapted to feed data directly to a pipeline, maximizing the performance benefit.
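One way to picture this dual use in Java is a producer that writes to any org.xml.sax.ContentHandler instead of building a DOM. The sketch below is an assumption-laden illustration: GmlProducer, writeTo, and serialize are hypothetical names, and the emitted GML is reduced to a single line string. The same writeTo call could target a serializer (shown here), a TransformerHandler built from an XSLT stylesheet, or the first filter of a pipeline.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class GmlProducer {
    static final String GML_NS = "http://www.opengis.net/gml";

    /** Emits a tiny, simplified GML document as SAX events; no DOM is built. */
    static void writeTo(ContentHandler out, String coordinates) throws SAXException {
        AttributesImpl none = new AttributesImpl();
        out.startDocument();
        out.startPrefixMapping("gml", GML_NS);
        out.startElement(GML_NS, "LineString", "gml:LineString", none);
        out.startElement(GML_NS, "coordinates", "gml:coordinates", none);
        out.characters(coordinates.toCharArray(), 0, coordinates.length());
        out.endElement(GML_NS, "coordinates", "gml:coordinates");
        out.endElement(GML_NS, "LineString", "gml:LineString");
        out.endPrefixMapping("gml");
        out.endDocument();
    }

    /** One possible consumer: an identity TransformerHandler that serializes the events. */
    static String serialize(String coordinates) throws Exception {
        SAXTransformerFactory tf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler serializer = tf.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter text = new StringWriter();
        serializer.setResult(new StreamResult(text));
        writeTo(serializer, coordinates); // a pipeline filter could be passed here instead
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("0,0 10,10"));
    }
}
```

Because the producer depends only on the ContentHandler interface, switching between the XSLT path and the pipeline path comes down to which handler is passed in.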