Leveraging SVG in the Rigi Reverse Engineering Tool

Holger M. Kienle, Anke Weber, and Hausi A. Müller
Computer Science Department
University of Victoria
Canada
e-mail: kienle@cs.uvic.ca, anke@cs.uvic.ca, and hausi@csr.uvic.ca
fax: +1-250-721-7292
webpage: http://www.rigi.csc.uvic.ca/

Keywords: software reverse engineering, graph visualization, SVG, component reuse, ECMAScript

Abstract

Reverse engineering of legacy software systems is a research area that heavily relies on visualization techniques to facilitate program understanding. Rigi is a software reverse engineering tool based on graph visualizations. Its implementation (based on Tcl/Tk) is outdated and exhibits several drawbacks. In this paper we discuss why SVG is well suited for reverse engineering visualizations and how we utilize SVG to make Rigi more useful to reverse engineers.

Introduction

Reverse engineering of legacy software systems is a research area that draws on visualization techniques to facilitate program understanding for software maintenance activities (e.g., with call graphs), and to communicate software architecture and design (e.g., with graphs and diagrams). These visualizations become an important part of the system and maintenance documentation. The results of our work suggest that Scalable Vector Graphics (SVG) is well suited for reverse engineering visualizations. This paper presents our approach to leverage SVG for reverse engineering and system documentation activities.

Reverse Engineering

Chikofsky and Cross define reverse engineering as follows [1]: "Reverse engineering is the process of analyzing a subject system to (1) identify the system's components and their interrelationships and (2) create representations of the system in another form or at a higher level of abstraction." The second part of the process often involves visualization to communicate complex information about the software system more effectively. Note that reverse engineering is an approach to understanding the system as opposed to changing the system.

There are many different ways to visualize a subject system. However, a graph model offers a very intuitive way: Components of the system are represented as nodes and interrelationships between components are represented as (directed) arcs. A simple example is a call graph for a program. Procedures in the program constitute the nodes of the call graph. Calls between procedures are represented with arcs. In a sense, software structure graphs, such as call graphs, are aerial maps for software engineers.

Often, graphs contain different kinds of nodes and arcs. For example, the same graph might show procedures and global variables contained in a program. Typically, different kinds of nodes are visualized with different colors and/or shapes. Since legacy systems tend to be huge (several million lines of code), the graphs can contain thousands of nodes and arcs with different kinds of node and arc types.
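The graph model described above can be illustrated with a minimal sketch in ECMAScript. The function names (makeGraph, addNode, addArc, callees) and the type labels ("procedure", "variable", "call", "data") are assumptions chosen for illustration; they do not reflect Rigi's internal data structures.

```javascript
// Hypothetical in-memory model of a typed software structure graph.
// All names and type labels here are illustrative assumptions.
function makeGraph() {
  return { nodes: new Map(), arcs: [] };
}

// Add a node (a system component) with a kind such as "procedure".
function addNode(graph, id, type) {
  graph.nodes.set(id, { id, type });
}

// Add a directed arc (an interrelationship) between two existing nodes.
function addArc(graph, src, dst, type) {
  if (!graph.nodes.has(src) || !graph.nodes.has(dst)) {
    throw new Error("arc endpoints must be existing nodes");
  }
  graph.arcs.push({ src, dst, type });
}

// A tiny graph: main calls parse, and both access a global variable.
const g = makeGraph();
addNode(g, "main", "procedure");
addNode(g, "parse", "procedure");
addNode(g, "errno", "variable");
addArc(g, "main", "parse", "call");
addArc(g, "main", "errno", "data");
addArc(g, "parse", "errno", "data");

// The call graph is the subgraph induced by arcs of type "call".
function callees(graph, id) {
  return graph.arcs
    .filter((arc) => arc.type === "call" && arc.src === id)
    .map((arc) => arc.dst);
}

console.log(callees(g, "main")); // [ 'parse' ]
```

Keeping the kind of each node and arc as plain data is what later allows a viewer to map kinds to colors or shapes.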

Rigi

Rigi is a reverse engineering tool developed over more than ten years at the University of Victoria [2]. It uses the graph model described above. The core of Rigi is a graph editor/drawing tool enhanced with additional functionality for reverse engineering tasks.

A Rigi Graph

Figure 1: Software Structure Graph in Rigi

Figure 1 depicts a graph that visualizes parts of IBM's SQL/DS system, which is over one million lines of PL/AS code [2]. Red nodes are variables, yellow nodes are modules, and purple nodes are types. All arcs have been filtered out except for the yellow ones, which represent calls between modules (i.e., Figure 1 shows the call graph). The full graph contains 923 nodes and 2395 arcs, but in this filtered view only 189 yellow arcs are actually visible.
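A filtered view such as the one in Figure 1 keeps all nodes but shows only the arcs of selected types. The following is a minimal sketch of that idea in ECMAScript; the arc records and the type names "call" and "data" are hypothetical and are not taken from Rigi.

```javascript
// Illustrative only: arc records and type names are assumptions,
// not Rigi's internal representation.
function filterArcsByType(arcs, visibleTypes) {
  const visible = new Set(visibleTypes);
  return arcs.filter((arc) => visible.has(arc.type));
}

const sampleArcs = [
  { src: "m1", dst: "m2", type: "call" },
  { src: "m1", dst: "v1", type: "data" },
  { src: "m2", dst: "m3", type: "call" },
];

// Keep only the call arcs; the other arcs remain in the full graph.
const callOnly = filterArcsByType(sampleArcs, ["call"]);
console.log(callOnly.length); // 2
```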

The graph visualization and manipulation are implemented with Tcl/Tk. This implementation exhibits several drawbacks. The rendering speed is rather slow and does not scale up well for larger graphs. Furthermore, the GUI's look and feel is perceived as rather crude compared to the state of the art. Lastly, images cannot be exported (except by taking screenshots). The last point is a severe drawback, because Rigi graphs are the main result of the reverse engineering activity and become essential system documentation [2]. Even when Rigi screenshots are integrated into the system documentation, which is cumbersome, they are mere static bitmaps that allow no interactive exploration.

SVG for Reverse Engineering

Intuitively, SVG seems a good candidate for reverse engineering visualizations. However, a more formal approach to validate this hypothesis is needed. This section discusses in more detail how SVG meets the requirements of reverse engineering tools.

Figure 2: Reverse Engineering Requirements (R1-R7)

Requirement  Description
R1           Address the practical issues underlying reverse engineering tool adoption.
R2           Provide interactive, consistent, and integrated views, with the user in control.
R3           Provide consistent, continually updated, hypermedia software documentation.
R4           Integrate graphical and textual software views, where effective and appropriate.
R5           Use a simple, human-readable, lightweight format.
R6           Support introspection of schemas.
R7           Support multiple, composable, modular, dynamically extensible schemas.

In his Ph.D. thesis, Kenny Wong identified reverse engineering tool requirements [5]. Figure 2 lists those of his requirements that are relevant in the context of this paper. Rigi falls short of several of these requirements. Leveraging SVG within Rigi helps to alleviate several of Rigi's shortcomings. The following list discusses the features of SVG that facilitate reverse engineering. We also identify the requirements (R1-R7) that are addressed by SVG:

Component Reuse

Another important reverse engineering tool requirement is reuse of existing tools and components. It is beneficial to develop reverse engineering tools that leverage existing components (COTS) and technologies. The benefits are twofold: On the developer's side we have an increased level of reuse; on the user's side, we offer a familiar environment to the user.

Reverse engineering tools tend to be developed without much reuse. The Rigi tool achieves a modest level of reuse by utilizing Tcl in its scripting layer and Tk for its GUI implementation. Primitive Tk and X11 drawing routines are used to render the graph. Standard functionality such as loading/saving, zooming, and scrolling has to be custom-implemented. This has resulted in a rather crude and idiosyncratic look and feel of the GUI that can be frustrating to use.

With SVG we can achieve a higher level of reuse. SVG offers sophisticated, high-level graphic primitives. Interactive behavior of graphical objects can be easily achieved with scripting based on event notification. This allows the implementation of graph editors and visualizers without having to reinvent the wheel.

SVG comes with high-quality visualization engines such as the Adobe SVG viewer and Batik. These viewers already provide functionality for scrolling, zooming, loading/saving, and searching. In a sense, SVG viewers are reusable components based on XML technology. As discussed in Section SVG for Reverse Engineering, the Adobe SVG viewer can be embedded into various host tools (e.g., Microsoft Office). These host tools typically offer a familiar environment to the user because they are used on a daily basis. Thus, users develop a high level of expertise with these tools and often customize them to fit their specific needs. McGrenere et al. note that "the overlap between the command vocabulary of different users is minimal, even for users in the same group who perform similar tasks and who have similar computer expertise" [16].

For example, users of PowerPoint know how to effectively get an overview of the existing slides, jump to a specific slide, or reorder slides. This knowledge has been accumulated during tool usage by each individual user. In fact, different users have different strategies for achieving a certain goal such as reordering a slide. They might have rearranged the toolbar for easier access to functionality that they use often. They might even have developed VisualBasic scripts. If PowerPoint users embed SVG documents into PowerPoint, they can use PowerPoint's functionality to organize the SVG documents, annotate them, etc. Offering the user a familiar environment also helps tool adoption (Requirement R1 in Figure 2).

Related Work

As discussed in the previous Section Component Reuse, reverse engineering tools should leverage existing user expertise by reusing the supportive features of tools that are already in use rather than re-implementing them for each new tool. Other research has chosen an approach similar to ours in building on top of existing tools. However, this research typically emphasizes the benefits for the developer and fails to address the potential benefits for the user.

The Visual Design Editor (VDE) [12] is a domain-specific graph editor implemented with VisualBasic on top of PowerPoint. Both VDE and Rigi are graph editors. Both render graphs and offer graph manipulations. However, they have different characteristics. VDE targets smaller graphs (i.e., no more than 100 nodes) that are interactively constructed, whereas Rigi has to deal with bigger graphs (i.e., thousands of nodes) that have to be automatically laid out.

VDE personalizes PowerPoint with new pull-down menus and icons. The authors state: "PowerPoint offers a highly functional GUI for interactively designing presentation graphics. Virtually every part of that GUI is useful, without modification, as part of our design editor." Among the reused PowerPoint functionality are scrolling, zooming, loading/saving, cut/copy/paste, and printing. Often only such standard functionality can be reused. VDE achieves a higher level of reuse because PowerPoint internally uses a graph model. PowerPoint graphical objects are VDE graph nodes and PowerPoint connectors (i.e., lines that attach to other objects) are VDE arcs. Thus, VDE can also reuse operations on objects and connectors (such as deletion, selection, grouping, and aligning).

The BOX system realizes a UML diagram viewer based on Internet technology [3]. The authors state that it is "a portable, distributed and interoperable mechanism to browse software engineering documents using standard off-the-shelf browser technology." The implementation uses Dynamic HTML and the Vector Markup Language (VML) rendered in Internet Explorer. The UML model is exported from Rational Rose as an XMI file. The XMI file is then converted to a Web page with an embedded VML document.

VML is a predecessor to SVG and both formats share many commonalities. The authors state their intention to port their tool from VML to SVG. The paper also discusses benefits of vector graphic markup languages compared to bitmap images (e.g., for text-based searches).

SVG Integration in Rigi

Because of SVG's benefits outlined above, we decided to leverage it to overcome Rigi's current limitations in visualization and document generation. Figure 1 is a typical reverse engineering result in Rigi. This graph is a document that we would like to easily incorporate into other host environments for presentation and documentation purposes (e.g., Internet Explorer, PowerPoint, and Excel). Previously, this could be accomplished only by taking a static screenshot and importing the resulting bitmap into the target document.

In our SVG-enabled implementation, the user can export a Rigi graph as an SVG document. The user can either write the SVG document into a file or automatically launch a Web browser to display the corresponding graph. Figure 3 shows the Rigi environment on the left side (workbench and sample graph) and the exported SVG document in PowerPoint (top right) and Internet Explorer (bottom right).

Screenshot

Figure 3: Exported Rigi Graph in PowerPoint (top right) and Internet Explorer (bottom right)

Export of the SVG graph is accomplished with Rigi's built-in scripting capabilities. Rigi can be programmed with the Tcl scripting language, which provides an extensible core language and was designed to be embedded into interactive windowing applications. Via Tcl, Rigi's internal graph data structure can be conveniently accessed and manipulated. We wrote a Tcl script of about 200 lines of code that adds an additional pull-down menu to Rigi's GUI (see Rigi workbench on top of Figure 3). The script extracts the relevant information from the internal graph data structure and writes it to an intermediary Rigi View Graph (RVG) file. The information in the RVG file is then transformed to an SVG document. This is accomplished with a Perl script (rvg2svg) of about 650 lines of code.

This two-stage design with RVG as the mediator decouples output generation from Rigi (see Figure 4). On the one hand, different reverse engineering tools can export their visualization data as an RVG file and invoke the stand-alone Perl script to generate SVG. On the other hand, different visualizations (e.g., dot graphs [13]) can be generated from the same RVG file. We chose Perl to generate SVG from RVG because it has convenient string manipulation facilities and is suitable for rapid prototyping.

Figure 4: SVG Document Export in Rigi

The translation of the graphical elements from a Rigi graph to an SVG document is straightforward. Rigi nodes are translated to SVG <circle> elements, and arcs are translated to <line> elements. The entries of the graph's legend (node types on the left and arc types on the right side) are <text> elements. The following code shows an SVG snippet that defines an arc along with the arc's source and destination nodes:

<g id="arcs" style="stroke-width:1">
 <line id="1052134"
  x1="5304" y1="5133" x2="3123" y2="5033"
  rigitype="0" src="1048650" dst="1048597"
  style="stroke:rgb(255,255,0);"/>
 ...
</g>
...
<g id="nodes"
   onmouseover="DoOnMouseOverNode(evt)"
   onmouseout="DoOnMouseOutNode(evt)"
   font-size="135"
   style="stroke:black;stroke-width:1;opacity:1.0;">
 ...
 <circle id="1048650" rigitype="3"
  rigiarcs="1052133,1052134,1052115,1052510"
  cx="5304" cy="5133" r="40"
  style="fill:rgb(255,255,0);"/>
 ...
 <circle id="1048597" rigitype="3"
  rigiarcs="1052231,1052116,1052134,1052879,1054935"
  cx="3123" cy="5033" r="40"
  style="fill:rgb(255,255,0);"/>
 ...
</g>

Every element has a unique "id" and contains several non-standard attributes. These are used by the embedded ECMAScript code. For example, the <line> elements contain the non-standard attributes "src" and "dst" that identify the source and destination nodes of the arc, respectively.
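To illustrate how script code might use these attributes, the following ECMAScript sketch highlights all arcs attached to a node by reading the node's "rigiarcs" attribute. It is a sketch under stated assumptions, not the code embedded in our documents: the function names arcIdsOf, highlightArcsOf, and onMouseOverNode as well as the highlight color are hypothetical, and only standard DOM calls (getAttribute, setAttribute, getElementById) are used.

```javascript
// Parse the comma-separated "rigiarcs" attribute of a node element
// into a list of arc ids.
function arcIdsOf(nodeElement) {
  const value = nodeElement.getAttribute("rigiarcs") || "";
  return value
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id !== "");
}

// Set the stroke color of every arc attached to the node.
// Returns the ids of the arcs that were found and changed.
function highlightArcsOf(doc, nodeElement, color) {
  const changed = [];
  for (const id of arcIdsOf(nodeElement)) {
    const arc = doc.getElementById(id);
    if (arc) {
      // Arcs listed in "rigiarcs" may be absent from the document.
      arc.setAttribute("style", "stroke:" + color + ";");
      changed.push(id);
    }
  }
  return changed;
}

// Hypothetical handler body: in a DOM-conforming viewer, evt.target is
// the <circle> under the pointer and ownerDocument is the SVG document.
function onMouseOverNode(evt) {
  const node = evt.target;
  return highlightArcsOf(node.ownerDocument, node, "rgb(255,0,0)");
}
```

Because the handler is attached to the enclosing <g> element, a single registration serves all nodes; the event target identifies the individual node.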

The rvg2svg script uses Ronan Oger's SVG-2.0 module [7]. The module provides an API to create SVG elements along with their attributes, to nest them properly, and to finally output indented XML code. The following Perl snippet shows how graph nodes are translated to SVG <circle> elements:

  # include Oger's SVG module
  use SVG;

  # instantiate new SVG document
  my $root = SVG->new( -indent => '  ',
                       ...
                       onload => "DoOnLoad(evt)" );
  ...

  # create an SVG group element that contains the graph nodes
  my $nodes_group =
    $root->group( id => "nodes",
                  onmouseover => "DoOnMouseOverNode(evt)",
                  onmouseout => "DoOnMouseOutNode(evt)",
                  "font-size" => $font_size,
                  style => "stroke:black;stroke-width:1;opacity:1.0;" );

  # iterate over all nodes in the Rigi graph and
  # generate an SVG <circle> element for each node
  foreach my $node ( keys %{$rvg->{nodes}} ) {
    my $x = $rvg->{nodes}{$node}{x};
    my $y = $rvg->{nodes}{$node}{y};
    ...

    # create SVG <circle> element
    my $circle =
      $nodes_group->circle( id => $node,
                            cx => $x,
                            cy => $y,
                            ... );
  }

  ...

  # output the document rooted at $root as XML
  print $root->xmlify( "-in-line" );

Besides Oger's SVG-2.0, there are several other Perl modules for SVG document generation available [6][11].
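For comparison, the mapping of nodes to <circle> elements and arcs to <line> elements can also be sketched without library support. The following ECMAScript example is an illustration only: the layout of the input object is an assumption (it is not the RVG format), the function names are hypothetical, and attribute values are assumed to contain no characters that require XML escaping.

```javascript
// Illustrative input; the field names are assumptions, not the RVG format.
const sample = {
  nodes: {
    "1048650": { x: 5304, y: 5133, fill: "rgb(255,255,0)" },
    "1048597": { x: 3123, y: 5033, fill: "rgb(255,255,0)" },
  },
  arcs: {
    "1052134": { src: "1048650", dst: "1048597", stroke: "rgb(255,255,0)" },
  },
};

// An arc becomes a <line> between the centers of its two end nodes.
function arcToLine(id, arc, nodes) {
  const s = nodes[arc.src];
  const d = nodes[arc.dst];
  return `<line id="${id}" x1="${s.x}" y1="${s.y}" x2="${d.x}" y2="${d.y}" ` +
    `src="${arc.src}" dst="${arc.dst}" style="stroke:${arc.stroke};"/>`;
}

// A node becomes a <circle> centered at the node's position.
function nodeToCircle(id, node, radius) {
  return `<circle id="${id}" cx="${node.x}" cy="${node.y}" r="${radius}" ` +
    `style="fill:${node.fill};"/>`;
}

// Arcs are emitted before nodes so that circles are painted on top of lines.
function graphToSvg(graph) {
  const lines = Object.keys(graph.arcs)
    .map((id) => "  " + arcToLine(id, graph.arcs[id], graph.nodes));
  const circles = Object.keys(graph.nodes)
    .map((id) => "  " + nodeToCircle(id, graph.nodes[id], 40));
  return [
    '<svg xmlns="http://www.w3.org/2000/svg">',
    ' <g id="arcs" style="stroke-width:1">',
    ...lines,
    " </g>",
    ' <g id="nodes" style="stroke:black;stroke-width:1;">',
    ...circles,
    " </g>",
    "</svg>",
  ].join("\n");
}

console.log(graphToSvg(sample));
```

A module such as the one used by rvg2svg takes care of nesting, indentation, and escaping, which this string-based sketch leaves to the caller.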

We use embedded ECMAScript to provide interactive behavior of SVG documents for the reverse engineer. Figure 5 is an embedded SVG document that can be used to explore the features described below:

This functionality is implemented with about 270 lines of ECMAScript code.

Figure 5: Rigi Graph in SVG

SVG Experiences

This section offers some general observations that we made during our SVG development. We also compare SVG graphs with the original Rigi graphs.

Future Work

Our current approach uses a converter that directly generates SVG text and basic shape elements to render the graph. However, it is also possible to generate these SVG elements dynamically with ECMAScript (right after the document has been loaded). For example, Adobe's Chemical Markup Language Demo uses this approach [14].

Deciding which SVG elements to generate statically in the XML document and which to instantiate dynamically with ECMAScript is an important design choice. Each approach has trade-offs that have to be carefully considered. For example, with our current, static approach, the graph is visible in SVG viewers that do not have scripting support. The information that the non-interactive graph provides is still useful for the reverse engineer. With the dynamic approach, the user of such a viewer would see an empty (or partially drawn) document that is of no use. To investigate the trade-offs involved in more detail, we plan another converter that uses the dynamic approach.
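A minimal sketch of the dynamic approach, assuming a viewer that supports the standard DOM calls createElementNS, setAttribute, and appendChild, might look as follows. The function name createNodeCircles and the shape of the node records ({id, x, y}) are hypothetical; this is not the planned converter.

```javascript
const SVG_NS = "http://www.w3.org/2000/svg";

// Create one <circle> per node record and append it to the parent group.
// `doc` is the SVG document; `nodes` is a hypothetical array of
// {id, x, y} records that would be embedded in or loaded by the script.
function createNodeCircles(doc, parent, nodes, radius) {
  return nodes.map((node) => {
    const circle = doc.createElementNS(SVG_NS, "circle");
    circle.setAttribute("id", String(node.id));
    circle.setAttribute("cx", String(node.x));
    circle.setAttribute("cy", String(node.y));
    circle.setAttribute("r", String(radius));
    parent.appendChild(circle);
    return circle;
  });
}
```

Such a function would be called once from the document's onload handler. If scripting is unavailable, the handler never runs and the group stays empty, which is the drawback discussed above.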

We also plan to further enhance the SVG exporter. In particular, we want to make the SVG documents more interactive by incorporating additional graph manipulation functionality (e.g., moving of nodes).

For historical reasons, our RVG file format (see Section SVG Integration in Rigi) is text based. To better leverage XML-based technology, we plan to define an XML-based format. This would allow us, for example, to use XSLT to transform Rigi graphs to SVG.

Conclusions

Our experiences with SVG have been largely positive. SVG's features allowed us to generate useful graphs for the reverse engineering domain that are visually appealing. These graphs are interactive and embeddable into host applications such as Web browsers and Microsoft Office. Interactive SVG graphs are a first step towards live documents. Embeddable SVG graphs are important for document generation and presentation purposes.

Acknowledgments

Thanks to Jon Pipitone for helping with the implementation of rvg2svg. Thanks to Crina Vasiliu for proofreading an earlier version of the paper.

References

[1] Elliot J. Chikofsky and James H. Cross II, "Reverse Engineering and Design Recovery: A Taxonomy", IEEE Software, pages 13-17, January 1990.

[2] Kenny Wong et al., "Structural Redocumentation: A Case Study", IEEE Software, pages 46-54, January 1995.

[3] Christian Nentwich, Wolfgang Emmerich, Anthony Finkelstein, and Andrea Zisman, "BOX: Browsing objects in XML", Software–Practice and Experience, 30(15):1661-1676, December 2000.

[4] Lloyd Rutledge, "Multimedia standards: Building blocks of the web", IEEE MultiMedia, 8(3):13-15, July-September 2001.

[5] Kenny Wong, "The Reverse Engineering Notebook", Ph.D. Thesis, Department of Computer Science, University of Victoria, 1999.

[6] SVG-PL Official Homepage, http://cs.nott.ac.uk/~jxm/SVG.

[7] SVG-2.0 at CPAN, http://search.cpan.org/search?dist=SVG.

[8] ToX Project, http://www.cs.toronto.edu/tox/.

[9] Ric Holt, Susan E. Sim, and Rainer Koschke (editors), ICSE 2000 Workshop on Standard Exchange Format, http://www.cs.toronto.edu/~simsuz/wosef/workshop.html.

[10] Luca Bompani, Paolo Ciancarini, and Fabio Vitali, "Software Engineering and the Internet: a Roadmap", The Future of Software Engineering, ICSE 2000, pages 305-315, 2000.

[11] SVG modules at CPAN, http://search.cpan.org/search?mode=module&query=SVG.

[12] Neil M. Goldman and Robert M. Balzer, "The ISI Visual Design Editor Generator", IEEE Symposium on Visual Languages (VL '99), pages 20-27, September 1999.

[13] Graphviz - open source graph drawing software, http://www.research.att.com/sw/tools/graphviz/.

[14] Developer track: Chemical Markup Language (CML) demo, Adobe SVG Zone, http://www.adobe.com/svg/demos/devtrack/chemical.html.

[15] R. C. Holt and A. Winter, "A Short Introduction to the GXL Exchange Format", Proceedings 7th Working Conference on Reverse Engineering (WCRE 2000), Panel on Reengineering Exchange Formats.

[16] J. McGrenere, R. Baecker, and K. Booth, "An evaluation of a multiple interface design solution for bloated software", ACM CHI 2002.

[17] Anke Weber, Holger M. Kienle, and Hausi A. Müller, "Live Documents with Contextual, Data-Driven Information Components", Proposal accepted for the Annual ACM Conference on Systems Documentation, SIGDOC 2002, October 20-23, 2002, Toronto, ON, Canada.


Holger M. Kienle received his Master of Science degree in Computer Science from the University of Massachusetts Dartmouth (1995) and his Diploma in Computer Science from the University of Stuttgart, Germany (1999). He is currently a Ph.D. student in Computer Science at the University of Victoria, Canada, where he is a member of Professor Müller's Rigi group. His interests include software reverse engineering, programming languages, program analyses, and domain-specific languages.

Anke Weber is a Research Associate in the Department of Computer Science at the University of Victoria, Canada. She is currently exploring the potential of applying SVG technology for keeping systems documentation in software engineering environments up-to-date. Her further research interests include adoption-centric software engineering tools and human-computer interaction.

Dr. Hausi A. Müller is a Professor in the Department of Computer Science at the University of Victoria, British Columbia, Canada. He is a Visiting Scientist with the Centre for Advanced Studies at the IBM Toronto Laboratory and the Carnegie Mellon Software Engineering Institute. He is a principal investigator of CSER, a Canadian Consortium for Software Engineering Research. Together with his research group, he investigates technologies and methods to build adoption-centric software engineering tools and to migrate legacy software to object-oriented and network-centric platforms. Dr. Müller was General Chair of ICSE-2001 in Toronto, the IEEE/ACM International Conference on Software Engineering.

