Converting 3D Facial Animation with Gouraud Shaded SVG

A method to extract data from animated 3D files and to convert them into SVG animations or SVG images with Gouraud shading

Keywords: SVG, facial animation, mobile devices, 3D-2D conversion

Thomas Di Giacomo
Miralab - University of Geneva

Michel Gaudry
Miralab - University of Geneva

Nadia Magnenat-Thalmann
Miralab - University of Geneva


The demand for leisure and entertainment and the interest in 3D games and graphics are continuously increasing. Though many devices support 3D graphics, not every mobile terminal can embed the hardware required to display complex 3D scenes and environments. On the other hand, many mobile devices natively support vector graphics, and especially SVG, and hence offer direct optimizations for the playback of such content. Consequently, we propose to convert generic 3D talking heads, with animations and deformations, into visually similar SVG animations. Our motivations are to allow the rendering and visualization of dynamic and complex 3D facial animations with SVG 1.1, and to provide innovative SVG authoring.

Basically, the method consists of a real-time extraction of the relevant 2D data from the 3D facial VRML models and animations, and of the generation of the corresponding SVG files. Two kinds of scenes can be generated: optimized SVG animations or static objects with Gouraud shading. This paper presents our approach and implementation for accurate conversions and efficient rendering of complex 3D scenes with SVG. Such methods could extend the availability of 3D graphics and their combination with other media, as well as the usability and the authoring methods and strategies of SVG content.

Table of Contents

1. Introduction
2. Related Work
     2.1 3D on Mobile Devices
     2.2 Conversion of 3D to Other Media
          2.2.1 To Image and Video
          2.2.2 To Scalable Vector Graphics
3. 3D to SVG Converter
     3.1 Context and Requirements
     3.2 2D Data Extraction
     3.3 Generation and Optimization of SVG
          3.3.1 To Produce a SVG Copy
          3.3.2 To Optimize Size
          3.3.3 To Generate Gouraud Shading
4. Experiments and Evaluation
     4.1 The Pyramid and Optimizations
     4.2 The Talking Head and Deformations
5. Conclusion

1. Introduction

With the emergence of light devices such as mobile phones and personal digital assistants (PDA), and with the growing interest in Web Graphics, the range of devices and applications where 3D graphics are required is increasing. However, many of these devices remain unable to display 3D, because of its high power consumption and heavy demands on capabilities, while most of them natively support other graphics representations. For instance, vector graphics (VG) is a widely available and adopted technology on mobile phones, through conformance to appropriate VG standards such as Scalable Vector Graphics (SVG), and through possible hardware support unavailable to 3D technologies. On the other hand, the demand for and uses of 3D graphics on such target clients are already present, with the interest in games and 3D presentations. To cope with this situation, and to enable the display of 3D-like animations on today's non-3D-capable or less powerful devices, our work aims to convert and adapt 3D animated scenes to SVG animations. We therefore propose a method to convert generic 3D to SVG by extracting appropriate 2D data, with the primary goal of preserving the 3D visual appearance in the SVG replica as much as possible. To extend the existing conversions, we target large 3D scenes with a strong focus on the conversion of facial animations and deformations. We also want to reproduce the shading of 3D objects by implementing Gouraud shading in SVG. A secondary goal of our work is the possibility to author SVG animations in a new, unconventional way, i.e. by taking advantage of a 3D space and of 3D-specific authoring tools. SVG animations crafted from an original motion-captured 3D sequence, or from a highly realistic 3D rendering, would enhance the creativity of graphics designers for the web and mobile applications. 
This paper contributes to the areas of Computer Animation and of Web Graphics and Multimedia with a novel approach for accurate conversions and efficient rendering of complex 3D animated scenes with animated SVG. Additionally, it extends the availability of 3D graphics and their combination with other media, as well as the usability and the authoring methods and strategies of SVG. The general purposes of our work are to:

The paper is organized as follows: Section 2 first discusses the related work in the field of 3D graphics for mobile devices, as well as 3D conversion to images and 2D VG. Section 3 presents the framework and requirements of our approach and the methods and steps used in our conversion process: the extraction of 2D data and the generation of the copy SVG animation or SVG image. Experiments and results are detailed in Section 4, while Section 5 concludes with perspectives on the proposed 3D-to-SVG conversion method.

2. Related Work

Rendering and animating complex 3D scenes on mobile and light devices is not straightforward, as illustrated by the discussion in Section 2.1. It appears that the most viable current solution, and the only one for the lightest devices, is to convert 3D to other media types or representations, e.g. video, images, or VG. Section 2.2 presents some significant work involving such conversions.

2.1 3D on Mobile Devices

The ever-growing interest in mobile devices creates a need for the availability and delivery of 3D graphics on any hardware. Given the varying capabilities of mobile devices, various solutions have been explored to overcome hardware limitations. Belz et al. [BJS97] proposed a hybrid approach that combines 3D on standard machines coupled over a network with 2D on mobile devices. Lamberti et al. [LZS*03] take advantage of a cluster of PCs to render complex sequences, and use mobile devices to display the rendered images. To address 3D rendering issues on mobile devices, these approaches restrict local 3D processing by delegating computations to a remote server. As an extension to his original method, Stam proposed in [Sta99] an implementation of stable fluids on PDAs, using fixed-point arithmetic to address the lack of an FPU on these devices. Focusing on particular hardware, Kolli et al. [KJB02] detail some optimization techniques for ARM-based PDAs. Though efficient, such methods lack genericity and would require further work to be adapted to different platforms. Another solution being explored is the adaptation of 3D content. Similarly to standard level-of-detail techniques, such as proposed by Hoppe [Hop96], Di Giacomo et al. [DJGMT03] propose to simplify facial and body animation for PDA rendering. Currently, even the lowest-level 3D representation remains too demanding to be appropriate for mobile phones. Consequently, it would be relevant to take advantage of another representation of 3D as a lower level of complexity for adaptive systems.

2.2 Conversion of 3D to Other Media

For various reasons, e.g. to reduce the memory footprint, the processing load, or the power consumption, or to optimize rendering performance, researchers have worked on alternative representations of 3D scenes, such as voxels, particle systems, or point-based representations. Though our initial goals are different, it is still relevant to consider some of these approaches to identify conversion mechanisms applicable to our context and usable in the scope of light devices.

2.2.1 To Image and Video

Image-based techniques, when focusing on the simplification of 3D rather than on improving rendering realism, provide clues on the possible adaptation of 3D to other representations. For instance, Maciel et al. [MS95] proposed to replace geometry with impostors, i.e., roughly, semi-transparent textured quads. This technique was later extended by Schaufler et al. [SS96], and by Sillion et al. [SDB97], who proposed meshed impostors to increase realism by implicitly taking perspective into account with the parallax effect. Billboards belong to the same family of methods; recently, Decoret et al. [DDSD03] introduced billboard clouds, which basically represent 3D models as textures with transparency maps. In the context of crowd rendering and animation, the representation of virtual humans has received particular attention from Tecchia et al. [TLC02] and Aubel et al. [ABT00]. These approaches animate impostors to reproduce the 3D animation of hierarchically-articulated bodies with animated textures. Image-based rendering (IBR) methods have also been explored on light platforms, such as PDAs, by Chang et al. [CG02], who apply commonly used IBR methods to simulate 3D objects on such devices. The underlying idea of most of these methods is to project a 3D geometry onto a plane and use the resulting image, as a texture, to represent the original 3D object. Unfortunately, this representation, though simplified when converted to images, still relies on a 3D environment because the images are mapped as textures, and thus it does not directly extend to our approach, where 3D support is completely lacking. Rendering a complete 3D animation to a video is also not acceptable, not only because of the size of the resulting files but also because it would prohibit further editing and the intrinsic interactivity of SVG, such as zooming, for the resulting copy.

2.2.2 To Scalable Vector Graphics

The interest in converting 3D to VG is very recent, and only a few works have addressed this issue. One of the most significant has been presented by Herman et al. [HHD02], featuring a rotating 3D teapot in SVG, for which more details and background considerations are presented by Hopgood et al. [HDH03]. Another major contribution to the representation of 3D with SVG is proposed by Otkunc et al. [OM03] and Mansfield et al. [MO03]. Using Javascript for the configuration and computations of a software 3D renderer, both works represent 3D with SVG and display it with their script-based renderer. Lin [Lin03] proposes a basic method to convert X3D files to VML, based on XML and XSLT, extended by Sangtrakulcharoen [San03] with a focus on SVG. Also worth mentioning is Swift 3D © [Rai03], a set of plugins for 3D authoring tools that exports Flash animations.

Despite these efforts, the use of SVG to represent 3D is currently limited to non-animated and very simple objects, mainly because of the prohibitive amount of data and the architecture and structure of the current methods. Furthermore, though these approaches achieve conversions of 3D to SVG, they have been tested only on 3D objects modeled with a few polygons, and without support for complex animations and deformations yet.

3. 3D to SVG Converter

Following the previous discussions, it appears that, for light devices and interactive web graphics, rendering 3D with VG is more suitable than with video or images. This Section details the core of our approach for a generic 3D-to-SVG conversion. The conversion is basically processed in two main steps, as illustrated by Figure 1: the identification and extraction of appropriate 2D data that serve as input to the SVG generators, and the generation of SVG files along with the optimizations we apply to this format. After introducing the scope and context of our approach in Section 3.1, Section 3.2 describes the methods involved in extracting the 2D data from the original 3D scene. Then, from the extracted 2D data, the resulting generation and optimization of SVG animations are presented in Section 3.3. Finally, Section 3.3.3 presents an additional generation method to improve the rendering of 2D static objects by implementing Gouraud shading using SVG language features (particularly gradients and filters).


Figure 1: High level overview of our 3D-to-SVG conversion.

3.1 Context and Requirements

Today, many different techniques are available to design 3D models and animations, and several different VG formats coexist. Our converter should be as unrestrictive and format-independent as possible, in terms of input and output, to support generic conversions. For input, the VRML format has been selected for the representation of 3D meshes but, since the conversion is not applied directly to the input file primitives but to the projected vertices, as detailed in Section 3.2, any other 3D format would be usable. Though the selected output VG format of our converter is SVG [FJJ03], to ensure that the converter is also generic with respect to the output format, its architecture is split into three main modules, as illustrated by Figure 2. By decoupling the extracted 2D data from the VG generation, the system can also be extended to other VG formats by plugging in additional VG generators. Moreover, a predominant focus of our conversion is on 3D animations and deformations. Both user input, e.g. rotating or zooming the camera or the object, and predefined animations, e.g. facial or body animation parameters and files, for instance MPEG-4 FAP and BBA as discussed by Preda et al. [PP02], are supported. Though untested with our method yet, we believe that converting physically-based animations, based on mass-spring networks or finite element methods, should be straightforward, as visually these approaches mainly update vertex positions, which are then handled by our 2D data extraction. Finally, the color information of the original 3D mesh is used to generate a Gouraud-shaded 2D mesh.

Considering the targeted light devices, playback should not require any software or features beyond those provided by a standard lightweight SVG player. Therefore, our method prohibits the use of scripts during the playback of SVG content. Though scripting could enhance the possibilities offered to the SVG recipient, with extended user interactions or a native 3D representation for instance, it would also limit the variety of devices able to handle such content.

The main criterion for a valid conversion is, visually, the similarity between the generated SVG animations and the original 3D animations. Thus, the conversion should preserve:


The 3D To SVG module plays 3D contents and extracts 2D data stored in the 2D Data module. The SVG generator outputs SVG files accordingly.

Figure 2: Functional blocks of the architecture

The possibility to create converted objects with Gouraud shading is limited to static objects from the original 3D scene. Another possibility, not implemented yet, is to extract one frame from an animation, generate an SVG file with Gouraud shading, and maintain this shading throughout the animation. Indeed, applying Gouraud shading to animated objects would be too heavy in terms of CPU and memory footprint (see Section 4.2).

Finally, a secondary requirement, whose purpose is to ease the authoring of SVG animations with 3D, is to process the conversion in real-time. Though not mandatory, the interactivity of the original 3D scenes becomes critical in the case of conversions with user-input animations.

3.2 2D Data Extraction

The first effective step of the conversion consists of extracting all the 2D data required to generate an SVG animation and storing it in a cache for later use by the SVG generator. Considering their behaviour over time, 2D data are separated into two categories:

Following [Man02], it appears that 3D color-per-vertex interpolation can be simulated using appropriate SVG gradients on the SVG elements, as illustrated by Figure 3. However, in practice it implies an initial definition of three linear SVG gradients, as well as three animation passes, per 3D polygon. Not only does this slow down the SVG rendering too much to allow smooth animations, it also generates huge, unusable SVG files. This is due to the fact that only interpolations between two colors are possible in SVG, so each SVG polygon requires three interpolations. However, this method can be applied to generate single frames of complex 3D structures of static objects at an acceptable cost. The visual improvement is notable, and it removes the constraint of converting only flat-shaded meshes.


SVG source at SimpleMeshWithGouraud.svg

Figure 3: A small triangle mesh where each vertex's color fades across the connected triangles.

Hence, for a usable conversion, we statically compute a single color per face for animations, as the mean of the colors of its three vertices. On the other hand, we allow three colors per face in order to simulate Gouraud shading when generating a static object. The extraction of other static data is trivial, therefore we detail below the extraction of dynamic data with updated positions and visibility.
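As a sketch of this per-face color averaging (the helper name and the 0-255 RGB convention are our own, not taken from the converter's code):

```python
def mean_face_color(vertex_colors):
    """Average the RGB colors of a face's three vertices into the single
    flat color used for that face during animation.

    vertex_colors: list of three (r, g, b) tuples, components in 0-255.
    """
    r = sum(c[0] for c in vertex_colors) / 3.0
    g = sum(c[1] for c in vertex_colors) / 3.0
    b = sum(c[2] for c in vertex_colors) / 3.0
    return (r, g, b)
```

For the static Gouraud case, the three vertex colors are instead kept as-is and turned into three gradients, as described in Section 3.3.3.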

Common 3D graphics engines mostly perform two of the projections used to display 3D geometry on a 2D surface, namely the perspective and the orthographic parallel projections. Though the latter would dramatically ease the conversion and would allow for direct 2D animations, it does not achieve an acceptable feeling of depth and volume compared to perspective projections. Geometric projections and transformations are computed in hardware by the graphics pipeline. This speeds up the process and takes into account the state of the camera as well as of the objects at a given frame. In our method, to extract updated 2D data from a 3D animation, we basically render the scene with a standard graphics library to obtain the projection matrix; the updated 2D coordinates are then computed by multiplying the projection matrix with the 3D positions. The rendering loop stage is consequently extended with a gathering of projected vertex data and face visibility, according to the camera and vertex positions at the current frame. This process is conducted either at each frame, or at a desired frequency to limit the data storage. In fact, the recording framerate, to which the output file size is directly related, can be parameterized. Not only does the rendering loop enable fast computations in hardware, it also ensures that whatever animation or deformation occurs in 3D, it will be converted to 2D.
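The per-vertex projection step can be sketched as follows. The row-major matrix layout, the 1000x1000 canvas size, and the function name are illustrative assumptions; in the actual converter this computation is performed by the graphics pipeline:

```python
def project_vertex(mvp, v):
    """Project a 3D vertex to 2D canvas coordinates with a 4x4
    model-view-projection matrix (row-major, homogeneous coordinates)."""
    x, y, z = v
    # homogeneous multiply: clip = mvp * (x, y, z, 1)
    clip = [mvp[r][0] * x + mvp[r][1] * y + mvp[r][2] * z + mvp[r][3]
            for r in range(4)]
    w = clip[3]
    # perspective divide yields normalized device coordinates in [-1, 1]
    ndx, ndy = clip[0] / w, clip[1] / w
    # viewport transform to a width x height SVG canvas (y axis flipped,
    # since SVG's origin is the top-left corner)
    width = height = 1000
    sx = (ndx + 1.0) * 0.5 * width
    sy = (1.0 - (ndy + 1.0) * 0.5) * height
    return (sx, sy)
```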

View-frustum and backface culling, using the face normals and the vector from the viewpoint to a vertex, are performed on the 3D scene to create a visibility list for each polygon at each frame. Occlusion culling would also be possible, but in the case of dynamic partial occlusions, the effects on the SVG file are not reproducible: SVG elements cannot be created dynamically, hence it is impossible to tessellate SVG polygons on-the-fly to reflect 3D partial occlusions. As a consequence, culling is performed on complete faces: at each frame, the culling sets a bit flag to true or false depending on the 3D face visibility; at the end of the recording, this list of flags is processed to reconstruct time ranges for the visible and invisible states of a given SVG face.
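The reconstruction of time ranges from the per-frame flags amounts to a run-length encoding; a minimal sketch (function name assumed):

```python
def visibility_ranges(flags):
    """Collapse a per-frame visibility bit list into (first, last, visible)
    runs, so that each SVG face stores a few time ranges instead of one
    flag per frame."""
    ranges = []
    start = 0
    for i in range(1, len(flags) + 1):
        # close the current run when the flag changes or the list ends
        if i == len(flags) or flags[i] != flags[start]:
            ranges.append((start, i - 1, flags[start]))
            start = i
    return ranges
```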

This first 2D extraction module is very similar to an interface plugged between the 3D rendering thread and the 2D data cache. Its pseudo-algorithm is:


  1. Initialization of static data
  2. Update and extraction of dynamic data (positions and visibility) at each frame

3.3 Generation and Optimization of SVG

From the 2D data extracted by this process, the converter generates a corresponding SVG-compliant output file.

3.3.1 To Produce a SVG Copy

The SVG generator creates a file according to the 2D data present in the cache. The first outputs were generated using the <polygon> SVG tag. Though it is well suited to defining a face, its individual points are not directly animatable. To represent a 3D face, an SVG element capable of animation, deformation, and color filling is necessary. The <polygon> element supports 2D animation and color, but does not provide deformation mechanisms. Therefore, for each 3D polygon, a series of nbF SVG polygons, where nbF is the number of frames of the animation, would be required. Inspired by the approach of Herman et al. in [HHD02], the internal SVG representation used in our method is based on <path>, the most appropriate SVG structure to represent a 3D polygon: it can specify a closed n-gon, in which each point is animatable independently and whose visibility can be described at any frame. The positions of the paths' points are directly related to the 2D projected vertices computed during the extraction; though in our case these coordinates, the xij and yij of Figure 5, are absolute, relative coordinates are also possible and would have a smaller memory footprint. Filling a <path> can be done with an image, a solid color (in the case of a simple flat-shaded object), or a linear or radial gradient (linear is used for Gouraud-shaded objects). At initialization, the 3D triangles are z-sorted to create an ordered list of paths; then the visibility of each path is adjusted by the sequence of bit flags from the extraction module. The structure of a generated SVG file is illustrated by Figure 4, while the structure of a <path> tag is illustrated by Figure 5.
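A minimal sketch of how one projected triangle could be serialized as an animatable <path>. The attributes are simplified, and the single <animate> here only repeats the initial outline; the real generator emits one value per recorded frame plus visibility data:

```python
def make_path(points, fill, dur_s):
    """Serialize one projected triangle as a closed SVG <path> ('M ... Z')
    filled with a flat color, carrying an <animate> on its 'd' attribute."""
    d = "M %g %g L %g %g L %g %g Z" % (
        points[0][0], points[0][1],
        points[1][0], points[1][1],
        points[2][0], points[2][1],
    )
    return ('<path d="%s" fill="%s">'
            '<animate attributeName="d" dur="%gs" values="%s"/>'
            '</path>' % (d, fill, dur_s, d))
```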


Figure 4: Global structure of a generated SVG file, without color per vertex.


Figure 5: Structure of a single <path>, the SVG element reproducing a 3D triangle

3.3.2 To Optimize Size

Depending on the number of polygons and the number of frames of the animation, generated raw SVG files might be several megabytes large. Note that raw refers here to unzipped files; though gzipped SVGs are playable in standard viewers, we consider only uncompressed sizes to avoid any influence of the Lempel-Ziv-based gzip compression algorithm. To reduce the output SVG file size as required, we address this issue by applying some post-process optimizations, as illustrated by Figure 6. The first straightforward optimization is to delete the animated coordinate data of a <path> when it is not visible. This is done by determining the range of frames where a face is hidden, thus specifying the data to be deleted: assuming a <path> is invisible between frame f0 and frame f0 + x0, its first <animate> tag is updated to finish at f0 and is closed, and an additional <animate> tag is created, starting at frame f0 + x0.
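The splitting of a path's animation data around a hidden range can be sketched as follows (the function name and the list-of-frames representation are assumptions):

```python
def split_animation(frames, hidden):
    """Drop the animation data that falls inside a hidden frame range.

    frames: per-frame 'd' values for one <path>;
    hidden: (f0, f0 + x0) range; the face is still visible at both
    endpoints, so they are kept.
    Returns the two visible segments that become separate <animate> tags.
    """
    f0, f1 = hidden
    before = frames[:f0 + 1]   # first <animate> ends at frame f0
    after = frames[f1:]        # second <animate> starts at frame f0 + x0
    return before, after
```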


The picture on the left illustrates the visibility optimization, on the right the optimization of static sequences.

Figure 6: The two optimization methods applied to SVG animations.

To further optimize the size of the output file, we also determine sequences of frames in which the <path> is visible but does not move. Supposing a static <path> from frame f1 to frame f1 + x1, the <animate> is stopped at f1, and a new <animate> tag is defined to create a single keyframed animation between f1 and f1 + x1, using the static position and with a duration corresponding to x1.

All these optimizations basically merge animation data when possible, by adding <animate> tags. The keyTimes attribute is another SVG feature one can exploit to perform such optimizations. Unlike <animate>, it does not add keyframes but parameterizes the timing of each frame; thus, keyTimes is especially relevant in the case of many static frames, while less efficient than <animate> in the case of long dynamic sequences. Some experiments on sizes have shown that if the ratio nbF/nbSS > T, where nbSS is the number of static sequences and T is a threshold equal to 200/13, then the output is smaller with <animate> than with keyTimes.
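The choice between the two encodings, based on this empirical threshold T = 200/13, can be expressed as (function name assumed):

```python
def pick_optimization(nb_frames, nb_static_sequences, threshold=200.0 / 13.0):
    """Choose between <animate>-merging and keyTimes re-timing.

    Per the measurements reported above, <animate> yields a smaller file
    when the ratio of frames to static sequences exceeds T = 200/13.
    """
    if nb_static_sequences == 0:
        return "animate"  # nothing to re-time
    ratio = nb_frames / float(nb_static_sequences)
    return "animate" if ratio > threshold else "keyTimes"
```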

3.3.3 To Generate Gouraud Shading

When generating an SVG file, instead of producing an animation with a poor shading effect, it is possible to create an SVG file representing the current frame with Gouraud shading, in order to have smooth transitions between the triangles of the mesh and, consequently, a more realistic rendering. Indeed, if we apply only one color per triangle, the visual result, consisting of flat-shaded triangles, is not very realistic, and color-contrast artefacts between polygons are clearly visible, resulting in "faceted" objects. In 3D Computer Graphics, Gouraud shading is an intensity-interpolation method based on the illumination of vertices. These values are first calculated, then interpolated between the three vertices to obtain a gradient. When this operation is repeated in each triangle of the mesh, the result decreases the color difference between triangles. Our method implements it with the tools and specifications available in a vector language such as SVG.

We use the method for simulating Gouraud shading in SVG proposed in [Man02], at a large scale and with some adaptations in order to obtain a satisfying result for the projected 3D mesh. This method applies three gradients per triangle (one per vertex), each one filling its respective part. To merge these gradients, we have to modify the transparency spatially: the closer a gradient gets to the opposite edge of the triangle, the smaller its alpha value.
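The geometry of each vertex gradient, running from the vertex (fully opaque) to the foot of the perpendicular on the opposite edge (fully transparent), can be sketched as follows; the function name and return convention are our own:

```python
def vertex_gradient(tri, i):
    """For vertex i of a triangle, return the axis of its linear gradient:
    full color at the vertex, fading to transparent at the foot of the
    perpendicular dropped onto the opposite edge (alpha 1 -> 0)."""
    (ax, ay) = tri[i]
    (bx, by) = tri[(i + 1) % 3]
    (cx, cy) = tri[(i + 2) % 3]
    # project vertex A onto the opposite edge BC to find the gradient end
    ex, ey = cx - bx, cy - by
    t = ((ax - bx) * ex + (ay - by) * ey) / float(ex * ex + ey * ey)
    fx, fy = bx + t * ex, by + t * ey
    return (ax, ay), (fx, fy)  # gradient start (opaque) and end (alpha 0)
```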


The gradient initially covers the whole gray rectangle, but the triangle works like a mask. This operation is repeated three times per triangle.

Figure 7: Gradient path of each vertex color.

Because SVG 1.1 Basic does not allow applying several gradients to a triangle, we have to superpose three triangles, one for each gradient, with the same coordinates. These triangles are then combined by applying an <feComposite> filter. The result is filtered again with an <feColorMatrix> filter in order to correct the opacity of the three previous layers. Figure 8 shows the complete structure of a triangle.


Three gradients are superimposed using the <feComposite> filter. This composition is then "posed" on a sub-triangle with an average color. The group is finally filtered in order to obtain a correct opacity.

Figure 8: Structure of a Gouraud shaded triangle

Afterwards, an additional triangle is put "under" the composition to ensure hidden-face removal by avoiding show-through between triangles. Indeed, this undercoat triangle has a color that is the average of the three vertex colors, and the first gradient ("gradient0_0" in the example) is put on this triangle rather than on the background or on a possible triangle underneath. Thus, each polygon is independent and is not mixed with lower layers.

The following code shows the assembling of the different elements of a triangle:

  <g filter="url(#setAlpha)">
    <use fill="rgb(40.332,28.909,23.867)" fill-opacity="1" xlink:href="#triangle0"/>
    <use fill="url(#gradient0_0)" xlink:href="#triangle0"/>
    <use fill="url(#gradient0_1)" filter="url(#colorAdd)" xlink:href="#triangle0"/>
    <use fill="url(#gradient0_2)" filter="url(#colorAdd)" xlink:href="#triangle0"/>
  </g>

With such a method, two types of processing have to be applied to each triangle: gradients and filters. This option became necessary because SVG does not allow implementing this shading method directly. The operation is executed only in the generation module, because the SVG produced is very different from the code describing an animation of a flat-shaded object.

The SVG files generated are structured as follows:

4. Experiments and Evaluation

To validate our approach, its functionalities, and its achievements regarding our initial goals, three testbeds have been set up; they are detailed in this Section. The first experiment evaluates the core technologies supporting the method, with a focus on the management of SVG face visibility and on the possible optimizations to reduce the output file size. The second experiment validates the resulting SVG copies of 3D mesh deformations, using an MPEG-4 3D facial animation as input. The third and last is a test of Gouraud shading in SVG, generating a single image of the face captured during the animation.

                                 Pyramid    Face    Face with Gouraud
Number of polygons                     8     484                  484
Duration in s                      12.08    8.28                    -
Raw size in kb                       529    5684                  610
Optimized size in kb                  91    4798                    -
Optimized gzipped size in kb          27    1396                   63
Generation time in s                   0       8                    3
VRML size in kb                        2      31                   31
Animation size in kb                  32     177                    -

Table 1: Size of the different data in the proposed three experiments

Table 1 summarizes the sizes of the output SVG from these experiments, without and with optimization; it also gives the approximate duration of the generation process, while the 2D data extraction is completely real-time. Comparing the sizes of the smallest SVG files, i.e. optimized by our methods and gzipped, with the original 3D data, the footprint is smaller for the pyramid in SVG than in 3D, and larger in the case of the facial animation (due to the conversion of deformations into positions). Despite this result, the conversion remains worthwhile with respect to all the other factors limiting 3D playback.


The top value, for a 1000*1000 file, corresponds from left to right to SVG with <polygon>, with <path>, and with <path> and optimizations.

Figure 9: Approximated mean size of SVG according to the number of polygons and the number of frames.

4.1 The Pyramid and Optimizations

To evaluate the framework and the basic processes of our method, test conversions of a simple 3D scene, consisting of an 8-polygon pyramid model with user interactions as input, are explored (SVG source at pyramid.svg). In these tests, a 2D rendering window is added to evaluate and certify the 2D extraction process. Both the 3D and 2D windows display similar graphics, and therefore the generated SVG file is also visually similar to the original 3D scene. Furthermore, the dynamic culling of 3D faces is successfully translated into the visibility states of <polygon> or <path>. With this model as input and a twelve-second animation, three SVG files have been generated: one using <polygon> as the SVG primitive, one with <path>, and one with <path> and visibility optimizations. Their respective sizes are 529kb, 185kb, and 55kb. Assuming constant visibility, the size is directly related to the number of polygons and to the duration of the animation as follows: using <polygon>, the output file size is 286 * nbP * nbF bytes, while using <path> it is 352 * nbP + 56 * nbF * nbP bytes, where nbP is the number of 3D faces and nbF the number of frames of the animation to be converted. Figure 9 clearly illustrates the benefit of <path>-based representations over <polygon>-based ones, as well as the gain obtained from the optimizations, which is around 30% on average. Though our scenario is the local use of SVG files, the particular care our method takes with size is also relevant to the streaming of encoded SVG by a server. In conclusion, the sample tests on the pyramid were successful in that:
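The two empirical size formulas can be expressed directly; the constants are the per-element byte costs measured above, and the function name is our own:

```python
def svg_size_bytes(nb_polys, nb_frames, primitive="path"):
    """Approximate raw output size from the empirical formulas:

    <polygon>: 286 * nbP * nbF bytes (one polygon per face per frame),
    <path>:    352 * nbP + 56 * nbF * nbP bytes (one animated path per face).
    """
    if primitive == "polygon":
        return 286 * nb_polys * nb_frames
    return 352 * nb_polys + 56 * nb_frames * nb_polys
```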

4.2 The Talking Head and Deformations

One of the main requirements of our method was generic support for animation and deformation. Therefore, the final use case explored for our conversion tool is the ability to convert a complex 3D animated face, including deformations, into a faithful SVG copy, as illustrated in Figure 12. The quality of the SVG replica is a significant result for the proposed conversion method.

We also applied the Gouraud shading module to this animation, by generating an SVG image of the current frame of the animation being played. Despite the fact that a complete animation cannot yet be generated, we obtain satisfying Gouraud shading results (Figure 10).


The face in its initial state with flat shading (one color per triangle) and Gouraud shading in SVG (three gradients per triangle). SVG source at faceWithGouraud.svg

Figure 10: The face in SVG without and with Gouraud shading

At a large scale, with a mesh of several hundred triangles, this method gives good results. The Gouraud shading is successfully implemented, with almost all flat-shaded areas dissolved. The differences between triangle edges have been strongly smoothed, and thus, with an equal number of polygons, the 3D object is more realistic.


Better view of the smoothened transitions.

Figure 11: Detail of the right eye

In parallel to these visual results, the method has some drawbacks. First, it is not possible to generate Gouraud-shaded animated SVG files, because of the amount of SVG code required to apply the gradients and filters, and because of the processing time required for rendering. For example, a single Gouraud-shaded triangle requires:

Triangle and gradients: 965 bytes
Grouping of elements: 287 bytes
Total per triangle: 1,252 bytes
File size for one frame (484 triangles): 610 KB
For a 5-second sequence (25 frames/s): 178 MB

Table 2: Cost of Gouraud shading
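The per-triangle arithmetic of Table 2 multiplies out as follows. This is only a lower-bound sketch: the per-frame and per-sequence totals in the table are presumably measured from actual generated files and include document-level markup overhead beyond the bare per-triangle costs.

```python
# Per-triangle costs from Table 2, in bytes
TRIANGLE_AND_GRADIENTS = 965   # triangle drawn in layers plus its gradients
GROUPING = 287                 # <g> wrappers regrouping the layers
PER_TRIANGLE = TRIANGLE_AND_GRADIENTS + GROUPING   # 1252 bytes

def frame_size_bytes(n_triangles: int) -> int:
    """Lower bound on the size of one Gouraud-shaded SVG frame."""
    return PER_TRIANGLE * n_triangles

print(frame_size_bytes(484))   # 605968 bytes, approaching the 610 KB reported
```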

Such files would not be acceptable for the devices we target. Extending the SVG definitions and specifications could help to reduce this amount, by allowing multiple gradients on a single polygon: the cost should then be smaller than using multiple layers of triangles and multiple filters. Another problem is that the borders of the triangles are not totally hidden, and a light gridded aspect remains; it is most visible where an isosceles triangle lies next to a stretched one. In spite of these drawbacks, which limit its application, the method gives good results for static 3D objects and would be more efficient with more appropriate SVG elements.

5. Conclusion

Though our method is complete with regard to its initial goals and uses, many improvements are possible in the following directions, ordered by importance to us. First, the size of the output should be further reduced: for instance, optimizations interpolating the positions of each path's points could be investigated rather than frame-based optimizations. Second, the 3D rendering features supported by the conversion engine could be extended with improved shading, by finding solutions to the current VG gradient processing. For texture mapping, exploring other VG formats that support texturing, such as MPEG-4 BIFS (refer to Concolato et al. [CMD03] for more details), would extend the possibilities of our conversion. Macromedia Flash formats would be another possible alternative to investigate. Third, to increase the speed of the system, another direction could be the use of graphics hardware as a processing unit for some of the operations and steps of the conversion. For instance, the culling algorithms would probably be good candidates for processing by a GPU at the vertex level.
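The interpolation idea in the first direction can be prototyped with the SMIL animation facilities that SVG 1.1 already provides: instead of emitting one shape per frame, a few key shapes are stored and the viewer interpolates the path's d attribute between them (all d strings must share the same command structure for this interpolation to be valid). A minimal sketch; animated_path and the sample data are our illustration, not the converter's output.

```python
def animated_path(d_frames, duration_s):
    """Collapse a sequence of per-frame path shapes into a single <path>
    whose 'd' attribute is interpolated by the SVG viewer via SMIL."""
    values = ";".join(d_frames)
    return (
        f'<path d="{d_frames[0]}">'
        f'<animate attributeName="d" dur="{duration_s}s" '
        f'values="{values}" repeatCount="indefinite"/>'
        '</path>'
    )

# two key shapes instead of 50 per-frame snapshots for a 2 s loop
print(animated_path(["M0,0 L10,0 L5,8 Z", "M0,2 L10,2 L5,10 Z"], 2))
```

The storage cost then grows with the number of key shapes rather than with the frame rate, at the price of shifting the interpolation work to the SVG player.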

We have presented a method to convert large, animated and deformable 3D scenes to 2D SVG in order to display 3D animations on the lightest devices as well as on web applications, and in order to extend authoring tools and methodologies for VG. Considering the previous work in that field, the contribution of our work is the support for complex and deformable meshes and the specification of a complete generic framework for the conversion, as demonstrated by the experiments.


SVG source at faceAnimation.svg

Figure 12: Results of the SVG copy of the facial animation


This research has been funded through the European Project DANAE IST-1-507113. The authors would also like to thank Lionel Egger and Stephane Garchery for their respective help on 3D models and facial animation.


AUBEL A., BOULIC R., THALMANN D.: Real-time display of virtual humans: Level of details and impostors. IEEE Trans. Circuits and Systems for Video Technology, Special Issue on 3D Video Technology 10, 2 (2000).
BELZ C., JUNG H., SANTOS L., STRACK R., LATVA-RASKU P.: Handling of 2d-3d graphics in narrow-band mobile services. In Proc. From Desktop to Webtop: Virtual Environments on the internet WWW and Networks (1997).
CONCOLATO C., DUFOURD J. C., MOISSINAC J. C.: Encoding of cartoons using mpeg-4 bifs. IEEE CSVT, Special Issue on Image-Based Modeling, Rendering and Animation 13, 11 (Nov. 2003), 1129–1135.
CHANG C., GER S.: Enhancing 3d graphics on mobile devices by image-based rendering. In Proc. 3rd IEEE Pacific-Rim Conference on Multimedia (2002).
CONCOLATO C., MOISSINAC J. C., DUFOURD J. C.: Representing 2d cartoons using svg. In Proc. SMIL European Conference ’03 (2003).
DÉCORET X., DURAND F., SILLION F., DORSEY J.: Billboard clouds for extreme model simplification. In Proc. Siggraph ’03 (2003).
DIGIACOMO T., JOSLIN C., GARCHERY S., MAGNENAT-THALMANN N.: Adaptation of facial and body animation for mpeg-based architectures. In Proc. IEEE CyberWorlds ’03 (2003).
FERRAIOLO J., FUJISAWA J., JACKSON D.: Scalable vector graphics (svg) 1.1 specification. W3C Recommendation (2003).
HOPGOOD B., DUCE D., HOPGOOD P.: Using xslt and svg in teaching: 3d, sound and nostalgia. In Proc. SVG Open ’03 (2003).
HERMAN I., HOPGOOD B., DUCE D.: Svg: Scalable vector graphics, tutorial notes. In WWW2002 Conference (2002).
HOPPE H.: Progressive meshes. In Proc. Siggraph ’96 (1996).
KOLLI G., JUNKINS S., BARAD H.: 3d graphics optimizations for arm architecture. In Proc. Game Developers Conference ’02 (2002).
LIN J.: 3D Web Graphics without Plugin using VML. Master’s thesis, San Jose State University, 2003.
LAMBERTI F., ZUNINO C., SANNA A., FIUME A., MANIEZZO. M.: An accelerated remote graphics architecture for pdas. In Proc. Web3D ’03 Symp. (2003).
MANSFIELD P.: Common graphical object models, and how to translate them to svg. In Proc. SVG Open ’02 (2002).
MANSFIELD P., OTKUNC C.: Adding another dimension to scalable vector graphics. In Proc. XML Conference and Exposition ’03 (2003).
MACIEL P., SHIRLEY P.: Visual navigation of large environments using textured clusters. In Proc. ACM Symp. Interactive 3D Graphics (1995).
OTKUNC C., MANSFIELD P.: Interactive 3d viewer written in svg. In Proc. SVG Open ’03 (2003).
PREDA M., PRETEUX F.: Advanced animation framework for virtual characters within the mpeg-4 standard. In Proc. IEEE ICIP ’02 (2002).
ELECTRIC RAIN: Swift 3D® v3, 2003.
SANGTRAKULCHAROEN P.: 3D to SVG Translator. Master’s thesis, San Jose State University, 2003.
SILLION F., DRETTAKIS G., BODELET B.: Efficient impostor manipulation for real-time visualization of urban scenery. In Proc. Eurographics ’97 (1997).
SCHAUFLER G., STURZLINGER W.: A three-dimensional image cache for virtual reality. In Proc. Eurographics ’96 (1996).
STAM J.: Stable fluids. In Proc. Siggraph ’99 (1999), pp. 121–128.
TECCHIA F., LOSCOS C., CHRYSANTHOU Y.: Image based crowd rendering. IEEE Computer Graphics and Applications 22, 2 (Mar. 2002).
