Generating SVG Graphs and Charts from Database Queries

Keywords: SVG, web service, SQL, multmedia

Graciela González
Assistant Professor
Sam Houston State University
Department of Computer Science
Huntsville
TX
U.S.A.
csc_ghg@shsu.edu

Biography

Graciela Gonzalez is an Assistant Professor of Computer Science at Sam Houston State University. Her main areas of research include visualization and modeling of multimedia information, web modeling, human-computer interaction, theories of actions and logic programming. She has authored papers published in conferences such as ACM Multimedia, IEEE, ICDE and ICMCS, and IFIP's VDB6, among others. She received her M.S. (1994) and PhD (2000) in Computer Science from the University of Texas at El Paso.

Gaurav Dalal
M.S. student
Sam Houston State University
Department of Computer Science
Huntsville
TX
U.S.A.
dalal@shsu.edu

Biography

Gaurav Dalal is a M.S. student in the department of Computer Science at Sam Houston State University, Huntsville, TX. He has received an undergraduate degree in accounting (1999) and has done a year of graduate coursework in management from the University of Bombay. His areas of research include databases, multimedia, MIS and XML technologies. He is a Teaching Assistant for his department and also developing his degree project, "Charon", a multimedia-authoring web service.


Abstract


We present a web service that allows end-users to specify a database query and visualization of the data as charts and graphs using SQL+D [SQL+D] as a front-end. SQL+D is an extension of SQL that allows in-the-query specification of multimedia presentations from the resulting answer set. In particular, we present here two special display classes of SQL+D [SQL+D] displays: graphs and charts, for which the output is a standard SVG [SVG] file that can be stored or displayed using any SVG-enabled browser.


Table of Contents


Introduction and Motivation
Charon: Architecture
Examples of Dynamic Displays generated by Charon
     Charts in SQL+D
         Bar Chart
         Pie Chart
         Line Chart
     Graphs in SQL+D
Related Work
Bibliography

Introduction and Motivation

Data by itself is worthless. When data is pieced together, and analyzed for meaning, then it turns into information. It is then information derived from data what is of value. However, analyzing data is not always an easy task. Discovering patterns and tendencies at a glance from a whole collection of numbers in rows and columns might be extremely difficult. However, if the same numbers are displayed as a bar, line or pie chart, then information starts to emerge: we can see at a glance that the popularity of a certain show has been steadily decreasing for the past 6 months, or that sales in the Northwest Houston office are at least twice than in others, or the oscillations of stock market prices. The information contained in such charts, when it is presented in a “spreadsheet” fashion, does not convey the relationship among the different pieces of information as clearly as the chart itself. Similarly, a company’s organization or the intermingled air routes of an airline’s scheduled flights will be dull and very difficult to understand if not presented as a graph, where each node represents a city, and an edge between two nodes means there is a direct flight between the two cities.

We present a web service that allows end-users to specify a database query and visualization of the data as charts and graphs using SQL+D [SQL+D] as a front-end. In [SQL+D] [SQL+D] , we proposed the first version of an extension to SQL, SQL+D that allows users to dynamically specify how to display answers to queries posed to multimedia databases. It provides tools to display multimedia data plus other traditional GUI elements such as boxed text, checkbox, list, and buttons. An extended version was presented in [SQL+D2] [SQL+D] , allowing for dynamic generation of bar and pie charts and directed and undirected graphs as a visualization aid for extracted data, among other features for advanced temporal presentations.

SQL+D was specifically designed for querying multimedia databases. It emphasizes in-the-query, by-the-user specification of the display of the output data. SQL+D is a language extension to SQL, rather than an application, our intention being to maintain the flexibility that allowed SQL to become an almost universal query language for relational databases.

In this paper we concentrate on two special display classes of SQL+D: graphs and charts. The user specifications are used to obtain an adequate answer set and format it according to the given specifications. For graphs and charts, the output is a standard SVG [3] file that can be stored or displayed using any SVG-enabled browser.

Consider for example a database to keep statistical information about the total sales per month, and want to display it as a barchart, with SALES (month, year, total). To see a chart with the appropriate information for 2003, we might submit

Display a categorized bar chart of monthly sales
SELECT * , STDDEV(total) AS deviation
FROM SALES
WHERE year = “2003”
DISPLAY PANEL main
WITH (month, total) AS barchart ON main.Center
  (month, deviation) AS linechart ON main.Center(overlay)
  “Sales Through the Year” AS text ON main.North

Table 1

example1.jpg

Charon: Architecture

Charon is a web service that consists of four main components: an Output Gateway, a Database Interface, a Display Generator, and a Merger. As a web service, it is intended to be installed and running on a server, for access by applications across the web. We outline the overall organization of the system next.

After access to the database is secured, a client request takes the form of an SQL+D query. The Output Gateway is the active class that directs the data flow of the web service, taking requests from clients, routing them into the system, and returning responses to the client. The SQL+D query is divided into the SQL portion (the query) and the display specification portion. The query is sent to the database by the Database Interface, and the display specifications are transformed by the Display Generator into an XSL document. The Merger combines the output from the two and generates a SMIL presentation, which is then returned to the client by the Output Gateway.The table Figure 2 shows an overview of the architecture.

charon_arch2.jpg

Charon Architectural Overview

Description of the Data Flow
  1. The Output Gateway receives the formatted output request containing the SQL+D query and database access information (user nid and password, if required), among other parameters, from the Client.
  2. The Database Interface subcomponent is initialized using the database access information and the SQL part of the query.
  3. The Database Interface connects to the remote database using the database access information and queries the database using the SQL query.
  4. The query result is formatted as a flat XML document, with each record formatted as an XML tag and each of the fields as sub-tags within itself. Inside the Database Interface module, multimedia attributes such as running time, image resolution and others are added to each element as attributes.
  5. The resulting (enriched) XML file is returned and will be used as an input for the Merger.
  6. The Display Generator subcomponent is initialized using the +D part of the query (that is, the display layout structure and temporal constraints for the presentation). A XSL-based display specification document is created based on the structure of the output document requested (SMIL, in the case we are presenting, but the system is prepared to include other output formats). The subcomponent translates the semantics of the spatial-temporal constraints into hierarchical XSL tags.
  7. The XSL display specification file is returned and will be used as the second input for the Merger.
  8. The Merger subcomponent transforms the requested output document by combining the XML and XSL documents input documents from the Database Interface and the Display Generator respectively. The merging process is the same as that of any XSLT application.
    There are certain display elements that cannot be defined in SMIL but are supported in SQL+D, such as charts and graphs. They are generated using SVG, and then included in the SMIL presentation. The Merger generates SVG files by invoking an XSLT class for charts, and through the use of an external web service (called dot) for graphs.
    The Merger initializes these programs by providing the XML file and the required attributes as input.
  9. Each graph or chart generated is kept as a separate file, showing a link in the SMIL presentation for the query. See Figure 3
  10. This formatted output is sent to the Output Gateway.
  11. The Output Gateway returns the formatted document to the Client in the form of an Output Response for displaying.
 
<Data>
	<File>
		<Name>smil1.smil</Name>
		<contents>
			<smil>
				<head>
					<meta content="SMIL" name="title"/>
					<layout>
						<root-layout height="400" width="400"/>
						<region height="400"id="main" width="400"/>
					</layout>
				</head>
				<body>
					<seq>
						<img src="bar1.svg">
					</seq>
				</body>
			</smil>
		</contents>
	</File>
	
	<File>
		<Name>bar1.svg</Name>
		<contents>
		<svg>
			<rect fill="red" height="140" stroke="black" width="12" x="25" y="110.0000"/>
			:
		</svg>
	</contents>
	</File>
</Data>
					

Figure 3: Formatted Output

Examples of Dynamic Displays generated by Charon

This section illustrates four examples of SQL+D queries for which a display was generated. Charts in SQL+D are constructed by passing the attributes of the resultset as input. Charon runs a external application to get the SVG image from SQL+D.

Charts in SQL+D

Bar Chart

To generate Bar Charts, Charon passes the size, the data field and the text field which are database attributes as input. Consider a database to keep statistical information about the sales of canned tomatoes in different months, expressed as a percent deviation from the mean incidence, as SALES (month, year, value, deviation).

To see a chart with the appropriate information for 2003, we might submit the SQL+D query shown in Paragraph 34 .

In this case, we take the size to be 400, the data field to be 'Sales. total', and the text field to be 'Sales.month'. The algorithm for generating the bar graph is explained below:

  1. Generate the X-axis given the current size.
  2. Generate the Y-axis given the current size.
  3. Generate each bar using the 'rect' element defined in SVG. This section of code is shown below
    Element n4 = doc.createElement("rect"); n4.setAttribute("x", "{" + Integer.parseInt(size) / 10 + "*@Index}"); n4.setAttribute("y", "{./" + datafield + "}"); n4.setAttribute("width", String.valueOf(Integer.parseInt(size) / 20)); n4.setAttribute("height", "{" + size + "- ./" + field + "}"); n4.setAttribute("fill", "red"); n4.setAttribute("stroke", "black");
  4. Generate the text label for each bar

SELECT * , STDDEV(total) AS deviation FROM SALES WHERE year = "2003" DISPLAY PANEL main WITH (month, total) AS barchart ON main.Center, (month, deviation) AS linechart ON main.Center (overlay) "Sales Throughout the Year" AS text ON main.North

The resulting display can be shown in Figure 4 .

chart.jpg

Pie Chart

To generate Pie Charts, Charon passes the size, the data field and the text field which are database attributes as input. Consider a database to keep statistical information about the primary occupations of people in various states. The schema looks as follows: OCCUPATION (state, occupation, population).

To see how the residents of California earn their income, we can use Paragraph 38 The resulting display is shown in Figure 5

SELECT OCCUPATION.occupation , OCCUPATION.population FROM OCCUPATION WHERE OCCUPATION.state = "California" DISPLAY PANEL main WITH (activity, pop) AS piechart ON main.Center

In this case, we take the size to be 400, the data field to be 'Sales. total', and the text field to be 'Sales.month'. The algorithm for generating the bar graph is explained below:

  1. Generate the X-axis given the current size.
  2. Generate the Y-axis given the current size.
  3. Generate each slice using the 'path' element defined in SVG. We basically draw the path with a line and an arc and close the path. The main part of the code is to determine where the endpoints of the arc are.
  4. Generate the legend for each slice

piechart.jpg

Pie Chart

Line Chart

Consider a database to keep statistical information about the daily prices of the stocks of various companies. The schema is as follows: PRICES (date, high, low, close).

To see how the stock of XYZ company did over a week , we can use the following query Paragraph 47 The resulting display is shown in Figure 6 .

SELECT * FROM PRICES WHERE DATE => "05/28/2003" AND DATE <= "05/02/2003" DISPLAY PANEL main WITH (date, high, low, close) AS linechart ON main.Center

To generate Line Charts, Charon passes the size, the data field and the text field which are database attributes as input. In this case, we take the size to be 400, the data field to be 'high','low','close', and the text field to be 'date'. The algorithm for generating the bar graph is explained below:

  1. Generate the X-axis given the current size.
  2. Generate the Y-axis given the current size.
  3. Divide the size by the number of tuples to get a X-value offset. Divide the size by the maximum value to get a Y-value offset.
  4. For each data line, Generate each bar using the 'line' element defined in SVG. This section of code is shown below
    for(int i = 1 ; i <= datacount;i++) { Element n4 = doc.createElement("line"); n4.setAttribute("x1", Integer.parseInt(Xoffset*i)); n4.setAttribute("x2", Integer.parseInt(Xoffset*(i+1))); n4.setAttribute("y", ); n4.setAttribute("y1", Integer.parseInt("{./" + datafield + "["+i+"]}"*Yoffset)); n4.setAttribute("y1", Integer.parseInt("{./" + datafield + "["+(i+1)+"]}"*Yoffset)); n4.setAttribute("stroke", "black"); }
  5. Generate the legend for each line.

linechart.jpg

Line Chart

Graphs in SQL+D

Graphs in SQL+D are constructed by defining the values of the nodes and edges by means of the attributes of the resultset. Charon runs a graph-generating application called GraphViz Paragraph 64 to get the SVG image from SQL+D. GraphViz utilizes a graph description format called dot for generating the graph.

Consider a database containing class ids and prerequisites for each career, and other information about the curriculum required for students to earn different degrees. The schema is described in Paragraph 59

Schema for the Class Database
PREREQ id, prereq
CLASS id, name, descrip
REQ major, classid

Table 2

To display a directed graph of the classes needed to complete a Bachelor in Computer Science, we would submit the SQL+D query shown in Paragraph 59 .

SELECT REQ.classid, PREREQ.prereq FROM PREREQ, REQ WHERE REQ.major = “BSCS” AND REQ.classid = PREREQ.id DISPLAY PANEL main WITH Nodes (REQ.classid, PREREQ.prereq) AS boxedtext ON main.Center, Edge(PREREQ.prereq TO REQ.classid) ON main.Center

There are three basic steps of procuring a SVG graph from an SQL+D query. The first step is to generate an XSL stylesheet which generates the dot format. The structure of the stylesheet is standard for all queries. The webservice supplements the names of the respective attributes in the stylesheet for the nodes and edges as per the query. Each node is tagged by a unique integer(i.e., the row number) and its label is defined as the value of the attribute desired as a node. Each edge is defined as a pair of two nodes. The former node is the origin node and the latter node is the destination node.A part of the stylesheet is shown below.

<xsl:text>digraph G {</xsl:text>
		<xsl:for-each select="./Record">
		 	<xsl:value-of select="PREREQ.prereq" />	 	
			<xsl:text> -gt; </xsl:text>
			<xsl:value-of select="REQ.classid" />
			<xsl:text> ;</xsl:text>
	 	</xsl:for-each>
	 <xsl:for-each select="./Record">
	 	<xsl:value-of select="@ID" />
	 	
		<xsl:text>[label="</xsl:text>
		<xsl:value-of select="REQ.classid" />
		<xsl:text>"]</xsl:text> 
	 	
	 </xsl:for-each>
	 
<xsl:text>}</xsl:text> 	 	

The second step is generating the dot file from the XSL by applying the stylesheet to the resultset. This enables the user to get the SVG graph without having to generate the dot file manually. A sample dot file is shown below:

digraph G {
	1 -> 3;
	2 -> 3;
 
 	1[label="MTH 171"]
 	2[label="CS 164"]
 }

Finally, obtaining the SVG output is done running the program using the dot file. The output is appended to the Output Response. The client program will extract and use the SVG file from the Output Response. The resulting display is shown in Figure 7 .

graph1.jpg

Related Work

Jourdan, et. al. have developed the MADEUS [MADEUS] system. Similar to the system of Charon, it generates formatted output documents by using XSLT transformation and constraint technologies. A MADEUS [MADEUS] input document has four parts, Media Content and MediaUse parts for the logical structure, a temporal part for timing and synchronization and a spatial part for layout. Although, this is more powerful and flexible, our approach focuses on providing a higher abstraction to the user. In addition, the user need not have any special software installation except a SMIL browser conforming to the W3C standards.

Plewe [Plewe] has developed a GIS mapping system which allows querying a database and receiving output as SVG. Similar to our approach, this application applies an XSL translator to convert XML data into a meaningful SVG historical map.

Bibliography

[MADEUS]
Jourdan, M., Layaida N., Roisin, C., Sabry-Ismail, L., Tardif, L., "Madeus, An Authoring Environment for Interactive Multimedia Documents". Proceedings of the ACM Multimedia 98. 1998.
[Plewe]
Plewe, B. "A Simple Web Mapping Solution for Complex Spatial Databases", Proceedings of the SVG Open/Carto.Net 2002 Developer's Conference. 2002.
[GraphViz]
GraphViz, AT & T Labs Research,http://www.research.att.com/sw/tools/graphviz/, 2000.
[SVG]
World Wide Web Consortium on SVG, (http://www.w3.org/Graphics/SVG/Overview.htm8) .
[SMIL]
World Wide Web Consortium on SMIL, (http://www.w3.org/TR/smil20. ).
[SQL+D]
Baral C., Gonzalez G., Son. T.,"SQL+D: Extended Display Capabilities for Multimedia Database Queries" Proceedings of ACM’s Multimedia ‘98, Bristol, UK, September, 1998.
[SQL+D]
Baral C., Gonzalez G., Son. T., "Design and implementation of display specifications for multimedia answers" Proceedings of the 14th International Conference on Data Engineering, IEEE, February 1998.

XHTML rendition created by gcapaper Web Publisher v2.0, © 2001-3 Schema Software Inc.