SVG as a Page Description Language

Alex Danilo
Canon Information Systems
Research Australia Pty Ltd
1 Thomas Holt Drive
North Ryde NSW 2113, Australia
e-mail: alex@research.canon.com.au
phone: +61-2-9805-2057

Jun Fujisawa
Canon Inc.
3-30-2 Shimomaruko, Ohta-ku,
Tokyo 146-8501, Japan
e-mail: fujisawa.jun@canon.co.jp
phone: +81-3-3758-2111

Keywords: SVG; PDL; Page Description Language; Printing Profile

Abstract

SVG has matured into a rich, fully featured graphics language resulting in its suitability for all traditional graphics applications. The SVG working group is continuing development of various profiles for use in specific application areas, such as mobile devices.

One of the most important uses of computer graphics languages is in the area of printing. Many languages used for printing are proprietary and display various feature sets. SVG in contrast is vendor neutral, contains much of the functionality of existing languages for printing and is a wonderful candidate for future hard copy devices.

A new SVG profile for printing is being developed as part of the SVG standardisation effort. The unique requirements for printing are driving the development of a suitable subset of SVG which will guarantee consistent imaging performance on printers. The standardisation process will most likely produce a suitable profile for SVG printing as well as guidelines to help developers produce compliant SVG files.

Issues that are being addressed with regard to printing with SVG include dealing with pagination, job control and so forth. Implementation on memory constrained devices with high resolutions requires consideration in the format of SVG files suitable for successful printing.

SVG includes the ability to perform animation. Printers are not capable of animation. Thus, one of the limitations of SVG files for printing is that animation must be either eliminated or defined to display the starting frame, an ending frame or a specified alternate. These issues have yet to be resolved, but they indicate some of the work that is being performed.

The colour requirements of printers are well addressed in many existing printer languages. SVG requires some extensions to support colour features which are common in printing devices today. Features such as spot colour, output ICC profile support, trapping, etc. are used in many devices, but are currently lacking in the SVG specification.

We have done some analysis with regard to printing with SVG. This includes the development of prototype implementations to evaluate the suitability of design decisions. We will discuss our experience with using SVG as a page description language. Our results will be used as the basis for the development of a working SVG printing profile. This process is continuing as part of the SVG working group charter.

Memory constrained devices such as printers have been well addressed with various techniques such as banding and compression. SVG contains a number of filter and transparency effects which are moderately easy to do for screen resolutions. A printer page may contain hundreds of megabytes of image data per page, thus dealing with filtering and transparency provides an interesting challenge in such memory constrained printers. Our experience in dealing with high resolution compositing will be described, along with some results.

Other layout languages, such as XHTML are useful for applications such as printing. We will compare SVG with XHTML as a language of choice for printing applications. Similarly, XML based layout such as XSL-FO is a good choice for controlling the appearance of printed output. Both XHTML and XSL-FO are layout languages which manage the placement of objects within a given page area. SVG in its current form is a presentation language. This means that the layout algorithms are controlled prior to the SVG rendering and presentation. As such, SVG as a page description language may be used as the output render target under the control of a higher level layout control language. Thus, SVG as the final presentation form complements true layout languages to form a complete print rendering solution.

The existing SVG specification is a good basis for the development of a definitive page description language. Once the specific needs of printing are addressed with the appropriate extensions to support pagination and extended colour support, SVG printing has the potential to become ubiquitous.

1. Introduction

Production of hard copy output has undergone continuous development resulting in the ever present print device. Production of presentation media, high-resolution output and so forth is usually done using some form of page description language (PDL). The advantage of using a PDL is device independence and lower data transfer volume to the printer. Languages for printed output share a number of characteristics. These characteristics are by design, and are intended to streamline the printing process. A number of features in SVG show similar characteristics; however, some features are unusual from a printing perspective.

The generation of paginated output is a feature of most PDLs, and SVG has yet to address the specific needs of the printer industry. Many standards for managing job control exist, and may be suitable as an adjunct to SVG to provide a suitable workflow solution.

Most PDLs are proprietary standards. They have been published, but by and large they are dominated by a single vendor. An attraction of using SVG for printing is that the standard is open, so compatibility across implementations becomes a matter of correctly implementing the standard rather than emulating a vendor's bugs.

A number of features of SVG require consideration when used for a print device. Dealing with animation and scripting is quite obvious. However other features such as filters require serious consideration as they burden the implementor with memory consumption issues. Addressing the use of filters is an important part of SVG use as a PDL.

Many current PDLs contain so-called high-end colour features. These features are print world specific and do not currently exist in SVG. Some of these features have no relevance to a display device such as a CRT, but are required in order to match current PDL feature sets. The colour pipeline of modern PDLs supports a number of colour channel types in parallel. SVG can support some of them, and with a small amount of extension could be tailored to become a viable alternative to such PDLs.

SVG in its present form is a pure presentation language. As such its display is predictable. All current PDLs are presentation languages. This makes SVG particularly appealing as a PDL. Other display languages such as XHTML, CSS, XSL-FO and so forth are layout languages. Layout languages are unsuitable for printer devices for a number of reasons. The combination of a layout language with SVG as the target PDL for output provides a sensible workflow solution for XML based documents.

The suitability of SVG as a PDL is unquestionable. It combines rich graphics functionality, sophisticated text and font control and most of the functionality present in all PDLs in current use. Once vendors realise its usefulness and begin to produce SVG hard copy products, we expect SVG will become dominant in the print industry.

2. Traditional PDL feature sets

There are many PDLs in use today. The most dominant languages are PostScript and PDF from Adobe Systems and the PCL family of languages from Hewlett-Packard. PCL covers two totally different styles of language, known as PCL5 and PCL-XL.

These languages share a number of common features:

2.1 Device control

All PDLs provide a mechanism for dealing with the properties of the printer device they manage. This includes management of paper size, duplexing, document finishing options and so on. In implementations this is usually controlled via a device specific language extension.

SVG contains no device specific control mechanism. In order to provide equivalent functionality to existing PDLs without burdening the SVG developer, it would be prudent to use SVG in conjunction with an existing device management language. The print industry relies on a number of standards to manage workflow. A good industry developed standard called Job Definition Format (JDF) is suited to such an application, although there may be many alternatives.

The IEEE-ISTO Printer Working Group (www.pwg.org) is developing other suitable standards, although JDF is render language independent, mature and in current use.

Based on the assumption that SVG will be used in conjunction with a workflow management language that deals with device specific requirements, SVG print jobs still need to define an appropriate view box in order to ensure that graphics are scaled correctly to generate suitable output. For example, a page generated for portrait orientation on output could be badly rendered if the target is landscape paper and the image scales to fill the viewport.

Given this consideration, each SVG page should specify a desired size (using CSS sizing units) to allow the printer device to choose the most appropriate size of paper for output. Mixed documents containing different page sizes are then easy to manage, e.g. the case where a number of A4 pages are interleaved with A3 folded inserts.
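As an illustration of the sizing point above, the following minimal sketch writes one SVG root element per page with an explicit physical size and a matching view box. The names PAPER_SIZES_MM and svg_page are hypothetical, and the choice of millimetres as both the physical unit and the user unit is an assumption made for the example, not a requirement of any printing profile.

```python
# Illustrative sketch only: PAPER_SIZES_MM and svg_page are hypothetical names.
PAPER_SIZES_MM = {"A4": (210, 297), "A3": (297, 420)}


def svg_page(paper: str, body: str = "", landscape: bool = False) -> str:
    """Return an SVG document whose stated size and viewBox match one sheet."""
    width, height = PAPER_SIZES_MM[paper]
    if landscape:
        width, height = height, width
    # width/height carry the desired physical size; the viewBox makes one
    # user unit equal to one millimetre so content scales predictably.
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}mm" height="{height}mm" '
        f'viewBox="0 0 {width} {height}">{body}</svg>'
    )


# A mixed job: A4 pages with an A3 landscape insert between them.
pages = [svg_page("A4"), svg_page("A3", landscape=True), svg_page("A4")]
```

Because each page states its own size, a device (or the job wrapper around it) could in principle pick the paper tray page by page without inspecting the page content.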

2.2 Font and resource management

Modern PDLs allow the downloading of fonts, ICC profiles, and so on.

SVG supports the downloading of SVG fonts which are extremely versatile. One of the best features of SVG is that an authoring application can download only the glyphs which will be used for display. This is a huge advantage over other PDLs which only allow the download of entire fonts.

ICC profiles are used extensively in the print industry to perform colour correction and colour conversion. They are also used to provide different render intents for different types of image data. For example, a digital image may be rendered using one intent which provides best possible output hue linearity, whilst computer generated graphics may be rendered using an intent which accentuates saturated colours and is unsuitable for images but provides a pleasing result for graphics.

SVG can support different render intents for any object via the use of ICC profiles. This gives it a powerful mechanism for colour management of image data. Such functionality is present in other languages, and so SVG is suited as a replacement for those PDLs.

One thing that many PDLs do is store such resources on the device for later use by other print jobs. SVG as a format is self-contained, so the language inherently does not provide the idea of storing things like fonts, colour profiles and so forth between SVG 'sessions'. This is a good thing. The concept of stored state generates modal behaviour, such that a stored context device relies on some a priori knowledge of what has been stored in the printer. SVG in contrast provides predictable behaviour in guaranteeing that the starting state is identical for all jobs.

It may seem desirable to include some form of stored resources, however as an explicit language feature it introduces problems with resource availability.

Taking the ICC profile case as an example, the SVG renderer will fetch the profile named in the SVG source data. If a subsequent job requests the same ICC profile, it is more sensible to deal with this by introducing a caching mechanism in the printer. This is far more desirable than forcing the user to guarantee that required resources are preloaded before submitting a print job.
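The caching idea can be sketched as a small least-recently-used cache keyed by the profile reference. This is only an assumption about how a printer might do it; the class name ProfileCache, the fetch callback and the capacity value are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class ProfileCache:
    """Hypothetical least-recently-used cache of ICC profiles keyed by URL."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int = 4):
        self._fetch = fetch          # assumed callback that retrieves a profile
        self._capacity = capacity    # maximum number of profiles kept
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.fetch_count = 0         # counts real fetches, for illustration

    def get(self, url: str) -> bytes:
        if url in self._entries:
            # Cache hit: mark as most recently used and reuse the data.
            self._entries.move_to_end(url)
            return self._entries[url]
        # Cache miss: fetch, store, and evict the oldest entry if over capacity.
        data = self._fetch(url)
        self.fetch_count += 1
        self._entries[url] = data
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)
        return data
```

The job itself never needs to know whether the profile was cached, so every job still starts from the same logical state.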

2.3 Vector Graphics

SVG is of course Scalable 'Vector' 'Graphics'. Most modern PDLs introduced the concept of vector graphics in the 1980s. PostScript of course was a language based entirely on vector graphics which grew out of research done by Xerox.

The obvious benefits of vector graphics will not be lost on anyone dealing with SVG, but the most obvious one is device independence. A high-resolution printer device will produce the same output as a low resolution video display with the same input data.

Bitmap graphics have been next to useless for printers as the amount of data required is excessive when high resolutions are used. Digital images remain the only valid use for bitmap data.

The history of the PCL family of languages provides an object lesson here. Following the introduction of PostScript, Hewlett-Packard 'glued' their existing raster based language to a vector language they had developed known as HP-GL. This development provided them with much of the functionality of PostScript for graphics, but the resultant language was a conglomeration of two distinct internal languages, the use of which was controlled by a modal switch between them. As time progressed, HP introduced PCL-XL which bears no resemblance to PCL5, but instead is a language far more like PostScript in that it deals with paths, fills, etc. and sheds most of the raster legacy. In fact it resembles SVG today in a number of ways. SVG has a richer inherited graphics state model, but PCL-XL was an attempt to rectify the problems of the earlier raster based language.

2.4 Pagination

All PDLs in production for office style products contain explicit page completion commands which force the paper to be rendered and ejected from the printer.

SVG lacks any explicit page management commands, but does contain the concept of 'views'. Views may be used to provide successive display of areas of an imaginary canvas consisting of all the pages which form a document. A drawback to this technique is the requirement to have the entire SVG file in memory prior to constructing the individual page views. Traditional PDLs discard data after a page is printed, and this is a requirement due to memory considerations in embedded devices.

The JDF language is ideally suited to managing pages as individual files per page. As such, it would be desirable to construct SVG print jobs using an individual SVG image per page and 'wrap it' using JDF or similar.

3. Memory and SVG for printing

Management of memory use on a printing device is one of the single most important issues. Due to the high resolutions inherent in printers, frame store memory is limited and sometimes totally unavailable. Banding is commonly used to deal with rendering of a page in strips so as to reduce the total memory requirement.

On a PC, a full page SVG image on a 1280*1024 display will use approximately 4 megabytes of memory. This is a reasonable memory footprint, and a PC has the luxury of virtual memory should there be limited main memory.

A printer is not so fortunate. An A4 page at 600 dots per inch (dpi) consumes approximately 132 megabytes of memory (assuming four channel CMYK data). This memory use is significant, and many devices take advantage of a technique called banding to reduce the short term memory requirement.
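The figures quoted above can be checked with a short calculation. The sketch below assumes 8 bits per channel, three channels for the screen, four for the printer, and an A4 sheet of 210 mm by 297 mm; the function name frame_store_bytes is hypothetical.

```python
MM_PER_INCH = 25.4


def frame_store_bytes(width_mm, height_mm, dpi, channels, bits_per_channel=8):
    """Bytes needed to hold one full uncompressed page at the given resolution."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * channels * bits_per_channel // 8


# A4 at 600 dpi with four CMYK channels (assumed 8 bits each).
a4_cmyk = frame_store_bytes(210, 297, 600, channels=4)

# A 1280*1024 display with three 8-bit channels (assumption).
screen_rgb = 1280 * 1024 * 3

print(f"A4 page: {a4_cmyk / 2**20:.1f} MiB")
print(f"Screen:  {screen_rgb / 2**20:.2f} MiB")
```

Under these assumptions the page needs a little under 133 MiB and the screen a little under 4 MiB, which matches the approximate figures in the text.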

A typical banding architecture can be seen in the diagram below:

banding.jpg

Figure 1: Architecture of the system

A banded system will typically parse the PDL, generate display lists from the graphics objects in the PDL and attach these display lists to a band descriptor. As the display lists are generated, the input PDL data is discarded. Once the end of page command has been issued, the display lists are rendered band by band into a shared pool of band memory prior to transmission to the printer engine.
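The process just described can be sketched in a few lines. This is a simplified model under stated assumptions: objects are represented only by a name and a vertical extent, and the names Band, make_bands, add_object and render_page are hypothetical rather than taken from any real controller.

```python
from dataclasses import dataclass, field


@dataclass
class Band:
    top: int                      # first scanline of the band
    bottom: int                   # one past the last scanline
    display_list: list = field(default_factory=list)


def make_bands(page_height, band_height):
    """Split the page into horizontal strips of at most band_height lines."""
    return [Band(top, min(top + band_height, page_height))
            for top in range(0, page_height, band_height)]


def add_object(bands, name, y_min, y_max):
    """Attach an object covering scanlines [y_min, y_max) to every band it touches."""
    for band in bands:
        if y_min < band.bottom and y_max > band.top:
            band.display_list.append(name)


def render_page(bands, render_band):
    """At end of page, render each band in order, then free its display list."""
    for band in bands:
        render_band(band)
        band.display_list.clear()


bands = make_bands(page_height=1000, band_height=256)
add_object(bands, "rect", 200, 300)   # spans the first two bands
```

Once an object has been added to the relevant display lists, the input data that described it is no longer needed, which is what allows the parser to discard it.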

Modern laser style printers have a strict timing requirement for rendered data transmission to the print engine, thus the bands must be rendered at print engine speed or the print will fail.

Some printers contain a full-frame store in which all the memory to represent an entire page exists in the printer controller. Such controllers are rare as the production cost is excessive, and so, most devices contain less than an entire page of memory.

SVG is suited to banding, however there are two obvious areas of concern. These are:

3.1 The Document Object Model (DOM)

The DOM is an XML document concept in which the XML document is parsed to construct an in-memory tree representing the document contents.

When printing, input data is usually discarded as soon as it has been parsed in order to minimise memory use. As such, a DOM is unsuitable for SVG as a PDL target. For example, large amounts of image data may be present in an SVG file, and so that data needs to be discarded as soon as possible.

When rendering PostScript, PDF, PCL, etc. an (almost) infinitely long document is possible in the case where a full frame store is present. Such a render controller will parse the data, render it to the frame store, discard the input data and continue ad infinitum.

A DOM based renderer will consume memory proportional to the document length and thus will be unsuited to the requirements of a printer.

The two common XML parser types, DOM and SAX, deal with XML differently, and it seems that a SAX based approach is most appropriate for printing applications.
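To illustrate the difference, the sketch below uses the SAX parser from the Python standard library to hand each shape element to a callback as soon as its start tag is seen, without building a tree. The handler name StreamingShapeHandler, the function stream_svg and the small set of element names are assumptions chosen for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class StreamingShapeHandler(ContentHandler):
    """Forward each shape to a callback as it is parsed; nothing is retained."""

    def __init__(self, emit):
        super().__init__()
        self._emit = emit

    def startElement(self, name, attrs):
        # Only a few element names are handled in this sketch.
        if name in ("rect", "circle", "path"):
            self._emit(name, dict(attrs.items()))


def stream_svg(source, emit):
    """Parse an SVG byte stream, emitting shapes without building a DOM."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(StreamingShapeHandler(emit))
    parser.parse(source)


sample = io.BytesIO(
    b'<svg xmlns="http://www.w3.org/2000/svg">'
    b'<rect width="10" height="5"/><g><circle r="3"/></g></svg>'
)
shapes = []
stream_svg(sample, lambda name, attrs: shapes.append((name, attrs)))
```

In a real renderer the callback would append to a display list rather than a Python list, but the memory behaviour is the point: the parser holds only the current element, not the whole document.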

Given a SAX based SVG renderer, we end up with SVG that cannot perform animation or scripting. Some form of DOM is needed to support both animation and scripting.

Animation is not possible in a printer (unless you print flip-books of course), and so the restriction of eliminating animation in SVG print targets does not seem unreasonable.

The other DOM dependent feature, scripting, is more questionable. Eliminating scripting provides a simple solution to the problem of DOM elimination, however authors may feel that scripts are needed.

One should again parallel the history of PDLs. The purpose of scripting is to provide some form of programmability for the language. For example a script is useful for providing interactivity in screen based applications. However, a script may be a convenient way to draw graph paper for example. PostScript is a programming language with graphics features. It was a well known, though not necessarily market dominating language for a number of years. As an interchange format for phototypesetting it was well used and gained wide acceptance. It could be argued that the programmability was the reason for this.

However this is not the case. Soon 'flavours' of PostScript such as Encapsulated PostScript (EPS) were introduced to limit the language functionality for import into graphics applications as well as graphic data interchange. Later the PDF language was introduced which did not provide the programmability of PostScript. The predictable nature of PDF has seen it grow into a position as the preferred format for print data interchange. This is due to its predictability of reproduction more than anything, but the lack of programmability has not been seen as a hindrance. Quite the contrary in fact, as putting a PostScript printer into an infinite loop can be especially tedious if there is a 'showpage' command in the inner loop!

We would therefore argue that elimination of scripting is desirable in SVG targeted at a print device. This also gives us the benefit that we can eliminate the DOM and thus control our memory usage far more effectively.

3.2 Filters

Filters are a very useful feature of SVG which allows many complex effects to be achieved in small amounts of SVG data. They are however an area of concern for printing implementations.

The main difficulty with filters is that they can be computationally expensive and potentially memory hungry. For example, a Gaussian blur with a large standard deviation may require access to large amounts of pixel data to generate the desired output image. On screen, the extra memory is not of concern, but in a memory limited device it can be.

In a band render based system it may be possible that the filter effect requires access to more pixels than are available in all of the band buffers.

In cases where the memory requirements are beyond the capabilities of the printer device, it would seem prudent to gracefully degrade rather than disable the filter effect. In the Gaussian blur example, limiting the effect area would serve to manage the problem.
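One possible form of graceful degradation is to reduce the blur radius until its working set fits in band memory. The sketch below assumes the blur kernel is truncated at three standard deviations on each side, which is a common approximation and not something the SVG specification mandates; the function names are hypothetical.

```python
import math


def blur_rows_needed(band_height_px, sigma_px, support=3.0):
    """Scanlines of source data needed to blur one band (assumed 3-sigma support)."""
    margin = math.ceil(support * sigma_px)
    return band_height_px + 2 * margin


def clamp_sigma(band_height_px, rows_available, sigma_px, support=3.0):
    """Largest standard deviation, up to sigma_px, that fits in the rows available."""
    spare = rows_available - band_height_px
    if spare <= 0:
        return 0.0                      # no room for any blur margin
    max_margin = spare // 2             # rows available above and below the band
    return min(sigma_px, max_margin / support)


# A 256-line band with 316 lines of buffer cannot hold a sigma of 20 pixels.
requested = blur_rows_needed(256, 20)   # rows the full effect would need
degraded = clamp_sigma(256, 316, 20)    # sigma actually usable
```

As the section goes on to discuss, a device that silently applies the reduced value will produce different output from one that has the memory to honour the original request.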

Elimination of filters from SVG print targets is unreasonable as many graphics operations are far more difficult when filters are not available.

A second problem with filters is the computational overhead required. In a simple display-list and band based renderer, calculation of display list processing time is commonly used to guarantee that the bands will be rendered fast enough to maintain engine speed. Some of the SVG filter types can consume large amounts of CPU time, and as such this needs to be addressed adequately.
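A rough sketch of such a timing check follows. It assumes that each display list operation has an estimated cost in seconds and that the engine consumes paper at a fixed linear speed; both function names and the cost model itself are hypothetical simplifications.

```python
def band_time_budget_s(band_height_px, dpi, engine_inches_per_s):
    """Seconds the print engine takes to consume one band of the given height."""
    return band_height_px / dpi / engine_inches_per_s


def band_is_safe(op_costs_s, budget_s):
    """True if the estimated render time of all operations fits the budget."""
    return sum(op_costs_s) <= budget_s


# A 300-line band at 600 dpi on an engine moving 2 inches per second.
budget = band_time_budget_s(300, 600, 2.0)
ok = band_is_safe([0.1, 0.1], budget)        # cheap fills
too_slow = band_is_safe([0.2, 0.1], budget)  # includes a costly filter
```

A band that fails the check would need to be pre-rendered, simplified or otherwise handled before the engine starts, which is where the cost of filters becomes a design concern.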

Generating metrics for the effect of filter computation time seems tractable, but it is important to consider the ramifications.

If a printer device 'decides' that a particular filter effect to be used on a page will fail to print, then it could choose to limit the filter effect. The user will see the output and assume that this is the result of the given SVG file. If such a file is then interchanged with a typesetter bureau with a powerful full-frame store device, the filter may be applied and the result will be different. Thus, the presence of gracefully degrading filter implementations may serve to introduce incompatibility issues.

Achieving a successful balance between the need for filters and the potential drawbacks of their use is a slight area of concern.

4. Colour processing

Colour management is an integral part of many PDLs. Simple office equipment systems seldom require any special colour processing, however devices used for proofing, digital (camera) image output, phototypesetting and similar require advanced colour management.

4.1. ICC profiles

SVG supports the use of input ICC profiles. These profiles can be applied with a chosen rendering intent and typically perform colour correction functions. Colour correction is the process of correcting colour linearity problems whilst retaining the same colour space. Colour conversion is the process of converting between different colour spaces.

At present SVG lacks the ability to specify an output ICC profile to target a specific printer device. This may be addressed by loading the desired output profile via some other mechanism, such as the wrapped JDF job or similar.

There can, however, be cases where the SVG itself should load a given output profile explicitly. One such case is an 'ink simulation' job.

Ink simulation is used when proofing for a phototypesetting device. An ink simulation profile is loaded by the proof device job which contains toner mappings that simulate the result when transferred to the real phototypesetter. It may be argued that this load belongs in the wrapped SVG job rather than the SVG itself, however current industry practice is to refer to the profile from within the PDL, and thus current workflow procedures should be respected.

Another commonly used feature is so-called 'spot colour'. This is used when an exact colour is required for output. For example, a company logo may use a specific Pantone colour. SVG jobs displaying this logo would need to mark the logo object as being coloured with the desired colour value. The ICC group have defined a 'named colour profile' which contains explicit colour names with associated toner values for the render device. Support for named colour profiles is currently lacking in SVG and is required to replace current PDL systems.

4.2 Colour spaces

SVG renders in sRGB. Whilst this is sufficient for screen based additive colour devices, it could be extended to support multiple colour spaces including subtractive spaces such as CMYK.

Many PDLs in use today support a mixture of RGB and CMYK objects in a single print page. An example is Adobe's PDF. Within PDF, individual objects may be specified to be rendered in a given space, and the result is colour converted to the device target space prior to printing.

When specifying CMYK as an input colour, it usually refers to device CMYK and so, such objects undergo no colour correction during render.
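The idea of separate colour paths per object can be sketched as follows. Device CMYK values pass through untouched, while RGB values are converted. The conversion shown is the simple textbook formula with full black replacement, used here only as a stand-in for a real ICC based conversion; the function names and the colour space labels are hypothetical.

```python
def rgb_to_cmyk_naive(r, g, b):
    """Naive RGB to CMYK conversion (components in 0..1); not colour managed."""
    k = 1.0 - max(r, g, b)
    if k >= 1.0:
        return (0.0, 0.0, 0.0, 1.0)     # pure black: avoid dividing by zero
    scale = 1.0 - k
    return ((1.0 - r - k) / scale,
            (1.0 - g - k) / scale,
            (1.0 - b - k) / scale,
            k)


def to_device_cmyk(colour_space, components):
    """Route an object's colour down the path that matches its colour space."""
    if colour_space == "device-cmyk":
        return tuple(components)        # already device values: no correction
    if colour_space == "rgb":
        return rgb_to_cmyk_naive(*components)
    raise ValueError(f"unsupported colour space: {colour_space}")


logo = to_device_cmyk("device-cmyk", (0.0, 0.9, 0.8, 0.1))
photo_pixel = to_device_cmyk("rgb", (1.0, 0.0, 0.0))
```

In a production pipeline the RGB branch would go through profiles and a connection space rather than a closed formula, but the dispatch on the object's declared colour space is the same.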

An example of a modern colour rendering pipeline is shown below:

colorpipe.jpg

Figure 2: Colour rendering pipeline using Lab render

As can be seen, distinct colour paths exist for different graphics objects in a single image. SVG contains a good basis on which to extend the colour functionality to match existing PDLs.

Inclusion of extended colour space support will ease the conversion of existing PDL files into SVG.

5. SVG is a presentation language

One distinguishing feature of SVG when compared with other XML based display languages is that SVG is a true presentation language, as opposed to a layout language.

Languages such as XHTML with CSS, XSL-FO, etc. are layout languages. They specify objects to be displayed but do not specify where to place such objects. This is a serious problem if such languages are to be used for printed output. The single characteristic that is most important in a PDL is predictability of output. PDF has gained dominance in the print field due to its total control of placement of everything, from glyphs to images.

Early PDLs contained implied font metrics on print, but these have been phased out in favour of languages which specify exact placement of every glyph. The benefits are obvious - the PDL becomes a true WYSIWYG representation of the image. This is in stark contrast to so-called WYSIWYG applications that print differently to their screen representations due to font metric differences, device resolution layout errors, etc.

SVG is ideally suited to such uses as it contains all the functionality required to place each glyph in a chosen location, and thus provides the ideal target for reproducible output.

The IEEE PWG is standardising other print targeted languages which allow layout, such as XHTML-Print. Such efforts are suited for simple printing in office environments, but are totally unsuited to the professional print industry.

XSL-FO and the XHTML/CSS combination are languages whose purpose is communication of content, not presentation. They are powerful and provide maximal flexibility in placement of objects. These languages are ideally suited to an authoring environment and resizable viewing environment. The use of such languages in a printer device is dubious at best. One of the prime requirements of a printer device is knowing where everything will be placed. This is especially important in the pre-press industry. If a word is placed too far to the right edge of a printed page, it should be clipped, not wrapped. This distinction is important as it allows the target print data file to be interchanged with other devices with known results.

Thus, the use of SVG in a print environment is ideal as the final presentation target.

A typical workflow for an XML based magazine production could have an XSL-FO renderer with built in scripting running a WYSIWYG typesetting application that flows text and graphics. On completion of a document, the selection of export to print uses the XSL-FO engine to send the placed objects out through a filter which writes SVG containing exact positioning of all the objects which make up the page. In this scenario, XSL-FO is the layout engine, SVG is the output representation which is totally device independent much like PDF is today.
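The export step in this workflow can be illustrated with a small sketch in which a layout stage has already decided the advance width of each character and the output stage writes an SVG text element with one absolute x position per glyph. The advance widths, function names and coordinates are invented for the example; a real layout engine would take them from font metrics.

```python
from xml.sax.saxutils import escape


def place_glyphs(text, x_start, advances):
    """Return the absolute x position of each character given its advance width."""
    positions, x = [], x_start
    for ch in text:
        positions.append(x)
        x += advances[ch]
    return positions


def svg_text(text, x_start, y, advances):
    """Write a text element whose x attribute fixes the position of every glyph."""
    xs = " ".join(f"{x:g}" for x in place_glyphs(text, x_start, advances))
    return f'<text x="{xs}" y="{y:g}">{escape(text)}</text>'


# Hypothetical advance widths in user units, as a layout engine might supply.
element = svg_text("AVA", 10, 50, {"A": 7.0, "V": 6.5})
```

Because every glyph carries its own coordinate, a renderer has no layout decisions left to make, which is the predictability the section argues for.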

The use of XML based layout languages in conjunction with SVG as a target presentation language seems to be obvious and complementary.

It is of course a logical extension to envisage mixed namespace documents which contain XSL-FO, MathML and SVG input images which get processed into a single targeted SVG output representation.

6. Conclusion

Today SVG is in use in many environments ranging from basic image display through to full-blown interactive applications. The language has been constantly evolving into a powerful, modern graphics language suitable for many uses.

Using SVG as a representation for hard copy output is both logical and desirable. The co-operation in the development of the SVG standard has brought it to the point where proprietary page description languages offer no benefit in use.

Once SVG becomes a page description language proper, as evidenced by industry adoption in both printer devices and as an interchange format, it will continue to expand its market penetration. Market adoption of a vendor neutral page representation should be encouraged to accelerate the uptake of XML based imaging technology devices in the hard copy industry.

References

[1] "Scalable Vector Graphics (SVG) 1.0 Specification", J. Ferraiolo, 04 September 2001. Available at http://www.w3.org/TR/2001/REC-SVG-20010904.

[2] "PostScript Language Reference Manual", Addison-Wesley Pub Co, ISBN:0201379228.

[3] "PCL Technical Reference", Technical Reference Manual set, HP part number 5021-0377.

[4] "International Color Consortium ICC file format specification". Available at http://www.color.org/newiccspec.pdf.

[5] "Job Definition Format JDF specification". Available at http://www.cip4.org/documents/jdf_specifications/JDF1.1.pdf.

[6] "Pantone website". Available at http://www.pantone.com/.

[7] "XHTML-Print", D. Wright, M. Grant, P. Zehler, J. Fujisawa, 24 May 2002. Available at http://www.pwg.org/xhtml-print/HTML-Version/XHTML-Print.html.

[8] "Extensible Stylesheet Language (XSL) Version 1.0", S. Adler, A. Berglund, J. Caruso, S. Deach, T. Graham, P. Grosso, E. Gutentag, A. Milowski, S. Parnell, J. Richman, S. Zilles, 15 October 2001. Available at http://www.w3.org/TR/2001/REC-xsl-20011015/.

