The visualization of multi variable information

Keywords: SVG, HTML, JavaScript, user interface, dynamic reporting, visualization

Ed E.F. Stevenhagen
IT-specialist
Royal Dutch Airlines
Velddreef 293
Zoetermeer
2727CH
The Netherlands
Stevenhagen@xs4all.nl

Biography

Ed E.F. Stevenhagen is a Senior Software Engineer / IT consultant for the Royal Dutch Airlines (KLM) and a GIS consultant at Stevenhagen Geo Informatica. He graduated from the Royal Military Academy and received his pilot training with the Canadian Armed Forces. He studied Electrical Engineering at the University Eindhoven and GIS at the University of Utrecht, and has been involved in IT since 1969. His current interests concentrate on data management and quantitative reporting using visualization techniques such as SVG and GIS in combination with web technologies. His interests also include the survey of abandoned room and pillar mines. At the moment he is supporting a research project of the University of Leiden by surveying and mapping a tunnel complex of over 900 meters that connects 5 bunkers of the Atlantic Wall and is used as a national bat reserve. In this project he combines remote sensing techniques, Digital Elevation Modelling (DEM), Geographic Information Systems (GIS) and Scalable Vector Graphics (SVG) to present and analyze research data and automated logger data.


Abstract


This paper illustrates the power of SVG in combination with JavaScript by showing an example of a data-driven management information tool that reduces a multi-page, multi-column report to an easy-to-interpret visualization, accessible with any SVG-enabled browser. It shows how to use the graphical attributes of SVG elements, such as size, stroke, color, opacity, and position, to present different reporting attributes.

The strength of the application shown is that it is easy to introduce and use in a low-budget intranet or Internet environment. It is compact and portable. It works on-line as well as stand-alone. It is data driven, with an error-tolerant data interface. The SVG graphics are embedded in an HTML-page, and JavaScript controls the generation of both tables and graphics.

An application should be user-friendly. The user can customize his own default settings. The controls are accessible by mouse, by numeric keypad or by defined hot-keys. The user can choose to see the report as graphics or as a table with plain data.

The example will show how a manager of an organization of over 10,000 employees is able to pinpoint and trace the part of his organization where people are not feeling happy. JavaScript does the first analysis and translates the information into SVG graphic display attributes. The manager will feel invited to do the final interpretation.


Table of Contents


1. Introduction
2. Sick Rate Reports
3. Improving reporting
4. Visualization using Scalable Vector Graphics
5. The graphic Interface
6. Translating and symbolizing sick leave report data
7. Report examples
8. Overview of the application architecture
     8.1 Defined mouse events
     8.2 Keyboard shortcuts
9. Some technical details
     9.1 Reading and interpreting datasets
     9.2 Creating Selection Options with dropdown menus
     9.3 Writing the report table
     9.4 Creating the SVG-page
10. Data sources, data storage and data access
     10.1 Initialization dataset
     10.2 Monthly datasets
     10.3 Data access
     10.4 Parsing parameters
11. Future development
12. Conclusion

1. Introduction

In five years we will realize that most people cannot read anymore and will only believe in what they see. Globalization will ask for a language-independent style of reporting and communication: we call it visualization.

With the introduction of the computer we started producing bulky, often user-unfriendly reports: first character based, later with simple static graphics. With Scalable Vector Graphics we are now able to create powerful, dynamic, intelligent graphics. The Internet introduced a new way of distributing and accessing reports and data with ever-growing frequency. The speed and volume often make it difficult for decision-makers to interpret this tsunami of information.

The overall purpose of this paper is to present some examples of using Scalable Vector Graphics (SVG) in reporting and in visualizing information. The design and the implementation techniques will be illustrated by providing an overview of the web application SIAG (Sick leave InterActive Graphics), designed to evaluate reported figures of sick leave of a major company.

More specifically, the document details the user side of the application architecture and the user interface by showing and evaluating some report examples. Some technical details are given in brief on the use of JavaScript, especially regular expressions, and on the data sources used, the way data is accessed and stored, and the parsing of parameters.

The examples are extracted from the results of a case study and pilot aimed at improving the effective evaluation of sick rate reports of an internationally operating transport company employing over 25,000 people. In the examples the names and references to this organization have been changed.

image_00.jpg

2. Sick Rate Reports

Sick leave of employees is costly and difficult to predict. In an organization where people work to strict schedules, information about the actual sick leave and the long-term trends is vital for the business process. For this reason the responsible managers get a monthly report, distributed as a text document by email or intranet, showing the sick leave figures of their area of interest.

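The reported details are, per department: the number of employees, the number of incidences, the incidence frequency per person per month, the number of sick leave days, the specific length of sick leave, and the sick leave percentage for the month and for the year (see Example 1).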

Still, recognizing trends and pinpointing problem areas is difficult when a document with just straightforward figures is all you have.


Selection        Count of  Count of  Freq.       S.leave  Specific  Sick leave %
Department       Empl.     Incid's   p.p./month  days     length    month    year
Department 1       113        11      1,39         71,0     7,18     2,1      3,0
Department 2       365        37      1,88        298,8     7,11     2,8      3,9
Department 3       448        56      1,66        456,7     9,66     3,5      4,7
TOTAL example      925       104      1,74        826,5     8,33     3,1      4,2

Example 1: Original report layout

3. Improving reporting

A first step in improving the reports is to suppress the less relevant figures. A next step is to use colors to attract attention or to show the result of an evaluation. Data suppression, such as not printing insignificant zero values, is also a simple way of creating the open space that is necessary for orientation and navigation through a solid block of figures. Information can be suppressed by hiding it with dynamic selection and filtering techniques. Often the same result is obtained without hiding information, by drawing attention to the cells that matter: their background is changed and given a meaningful color.

image_01.png

Improving a report by using color and creating open space

In the example we use the color green to indicate a positive signal and the color red for a negative signal. All this formatting is done using Dynamic Hypertext Mark-up Language (DHTML) to create a Web document and using Cascading Style Sheets (CSS) under control of JavaScript.

The HTML-part of the SIAG application produces this report and uses colors to signal figures that are higher or lower than the average. The application also suppresses zero counts and recognizes report lines that have no employees but still report sick rates for the past year. Report line 5 in the example shows a closed reporting unit with no employees left and no incidences for that month. The manager's attention should not be drawn to this line by presenting “0” sick leave days, a sick leave of 0.0 % and a yearly average that declines every month. The easiest way to achieve this is to print no zeroes and to give the cell no signaling color, only the default color of the report. In line 1 of the example report, however, one cell with a zero value is colored although the value is not printed. It is a significant zero: in a unit with just one employee attached to it, the actual sick leave rate was indeed 0.0 %.

Nothing very special so far, but all together it helps a user to read and interpret the table. Even more important, a new user will recognize the old report, which helps him to interpret the SVG presentation and improves the acceptance of this new way of reporting.
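The signaling rules described in this section can be sketched in a few lines of JavaScript. This is only an illustration and not the SIAG source: the function name formatCell, the returned field names and the choice of red for values above the average (and green for values below it) are assumptions made for this example.

```javascript
// Illustrative sketch only; formatCell and its return fields are hypothetical names.
// Assumption: for sick leave figures "above average" is the negative (red) signal.
function formatCell(value, average, employees) {
  // Closed reporting unit (no employees) with a zero value:
  // print nothing and use no signaling color.
  if (employees === 0 && value === 0) {
    return { text: "", signal: "none" };
  }
  var signal = "none";
  if (value > average) {
    signal = "red";    // worse than average
  } else if (value < average) {
    signal = "green";  // better than average
  }
  // Zero values are never printed; a significant zero keeps only its color.
  // The decimal comma follows the layout of the original report.
  var text = value === 0 ? "" : value.toFixed(1).replace(".", ",");
  return { text: text, signal: signal };
}
```

With this sketch a unit with one employee and a rate of 0.0 % yields an empty but green cell, while a closed unit with no employees yields an empty cell in the default color.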

4. Visualization using Scalable Vector Graphics

The conventional report presented as a table is a collection of flat arrays organized as columns and lines, with every cell giving the information of just one attribute. With Scalable Vector Graphics (SVG) we can give every report a new dimension; we can even make it multidimensional. The next example briefly illustrates what is meant by a dynamic multidimensional presentation, using just one screen print from a project dealing with hibernating bats.

image_02.png

Visualizing wind and temperature in a reserve for hibernating bats

The screen print shows a JavaScript-controlled SVG presentation of a part of the Atlantic Wall near Wassenaar in the Netherlands, fitted up as a reserve for hibernating bats. Five bunkers (red squares) are interconnected by a system of 900 meters of tunnels. To explain the behavior of the bats, the University of Leiden closely monitors the climate conditions of this habitat. The example shows 30 locations (A) where the temperature is monitored at 6 positions. The air temperature is translated into a color and used as fill color. The same is done with the infrared measurements of the wall temperature, which are translated into the stroke colors of the tunnels. With the buttons at (E) it is possible to show another sampling moment. The sampling moment is displayed under (D), and the horizontal position of (D) depends on the month of the year. The spread in temperature and the temperature index are automatically displayed at the bottom (C). What looks like a simple arrow under (B) is a presentation of 3 dimensions, or attribute values, of the wind outside: the wind direction is translated into the orientation of the arrow, the wind speed is reflected in the size of the arrow, and the color is determined by the air temperature. This illustrates what is meant in this paper by multidimensional.

5. The graphic Interface

In the SVG-part of the sick rate report we use a modified bubble-chart to present the details of one department with reporting groups or cost units. Every bubble (circle) represents one reporting group or cost unit. The size of the bubble will indicate the number of employees in a reporting group.

There are three SVG templates defined for reporting. The templates differ in the way the bubbles are positioned and what attributes are used to calculate the x / y values:

  1. sick leave rate 12 month average / actual sick leave rate;
  2. incidence frequency / sick leave rate 12 month average;
  3. sick leave rate 12 month average / specific length of sick leave.

The scale of the graph can only be set to predefined values, while the origin of the graph does not change position. It is therefore impossible to lose sight of the graph.

Selection options are available to select a report, to step by a complete year or to step to another report month. In this way it is easy to compare all kinds of situations or to recognize trends.

A special bubble is created for the grand total of the reported population. Two reference lines through the center and parallel to the axes make it easy to recognize the position of each reported bubble in relation to the average represented by the center bubble.

image_03.png

Colors are used as a warning signal (template-A) or to recognize the position of a reporting group in relation to predefined targets (template-B and template-C).

Template-A plots the position using the actual sick leave rate and the 12 months average. Here a diagonal line shows where these values are equal. When the actual rate exceeds the 12 months average, a signal is given by changing the color of the bubble from green into red.

Using template-B or template-C, the program uses predefined targets for the value of the x-axis and the value of the y-axis. The targets are drawn as two straight lines at right angles to the axes, dividing the graphical area into four quadrants. The bubble is colored according to the quadrant in which it is plotted. At a first glance one can recognize the position of a bubble in relation to the defined targets or in relation to the grand average of the total population that was reported.
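As a minimal sketch of the coloring rules of the three templates, the function below returns a color name for one bubble. The name bubbleColor, the parameter order and the concrete quadrant colors are assumptions for this illustration; the paper only states that template-A switches from green to red and that the quadrant determines the color for template-B and template-C.

```javascript
// Illustrative sketch only; bubbleColor and the quadrant colors are assumptions.
// Template A: x = 12 month average, y = actual sick leave rate.
// Templates B and C: x and y are compared with predefined targets.
function bubbleColor(template, x, y, targetX, targetY) {
  if (template === "A") {
    // Above the diagonal the actual rate exceeds the 12 month average.
    return y > x ? "red" : "green";
  }
  var missedX = x > targetX;
  var missedY = y > targetY;
  if (missedX && missedY) {
    return "darkred";  // both targets missed
  }
  if (missedX || missedY) {
    return "orange";   // one target missed (assumed color)
  }
  return "green";      // both targets met
}
```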

6. Translating and symbolizing sick leave report data

The sick rate data are translated into an adapted bubble chart. Every bubble represents a complete report line at a selected report moment:

  1. number of employees = size of the bubble
  2. number of incidences = area of the stroke
  3. the frequency of incidences = transparency of the bubble
  4. actual/12 month sick rate = x,y position of the bubble
image_04s.png

By sizing the bubbles we give some weight to a reported value. It will be clear that a sick rate reported for a unit of just 5 employees means something quite different from the same sick rate reported for a unit of 100 employees. Empty or closed reporting lines will not show at all.

The stroke size immediately shows where the most new incidences are found. By making the area of the stroke, and not the stroke width, directly proportional to the number of incidences, we automatically take the unit size into account.

With the use of transparency it will be easier to indicate cases of long-term illness. Cases of long-term illness can have a very negative influence on the overall sick rate but will lower the frequency of new reported cases. In the graph these cases will be most likely recognized as transparent bubbles at the right or upper side of the graph.
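The translation of one report line into bubble attributes can be sketched as follows. An SVG stroke is centered on the outline of the circle, so a stroke of width w on a circle of radius r covers a ring with area 2πrw; solving for w gives the stroke width that belongs to a wanted stroke area. The function name bubbleAttributes, the three scale constants and the choice to make the fill area (rather than the radius) proportional to the number of employees are assumptions for this illustration.

```javascript
// Illustrative sketch only; names and scale constants are assumptions.
function bubbleAttributes(employees, incidences, frequency) {
  var AREA_PER_EMPLOYEE = 4;    // assumed: square pixels of fill per employee
  var AREA_PER_INCIDENCE = 20;  // assumed: square pixels of stroke per incidence
  var FULL_FREQUENCY = 2.0;     // assumed: frequency shown as fully opaque

  // Fill area proportional to the number of employees: pi * r^2 = a * n.
  var r = Math.sqrt(AREA_PER_EMPLOYEE * employees / Math.PI);

  // Stroke area proportional to the number of incidences: 2 * pi * r * w = b * i.
  var strokeWidth = r > 0
    ? AREA_PER_INCIDENCE * incidences / (2 * Math.PI * r)
    : 0;

  // Low incidence frequency gives a more transparent bubble.
  var opacity = Math.max(0.1, Math.min(1, frequency / FULL_FREQUENCY));

  return { r: r, strokeWidth: strokeWidth, opacity: opacity };
}
```

Under these assumptions the ratio of stroke width to radius depends only on the number of incidences per employee, which is one way to read the remark that the unit size is taken into account automatically.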

7. Report examples

In the example below, screen shots of two successive reporting months show the reaction to the publication of reorganization plans. At the left is the report of the month before, and at the right the report of the month in which the results of a reorganization were published.

image_05s.png

Recognizing the impact of a reorganisation on sick leave rates.

image_06s.png

The overall picture is “negative” and is automatically visualized with a red color. With a mouse click we lock on A to follow its behavior. Though the 12 month sick leave rate (horizontal position) is below the average of the whole population (left of the vertical red line), the actual sick leave rate (vertical position) has increased from 2.5 to 9 % and is significantly higher than the average. Not only plots A and B but also the central plot C, representing the total population, show an increase of new incidences, as displayed in the stroke width of the bubble.

Another example, using template-B, shows preset targets in green for the incidence frequency and the 12 months sick leave rate. The red lines indicate the position of the total population, and the dark red color immediately shows that neither target was met.

image_07s.png

8. Overview of the application architecture

SIAG is designed to work in an intranet environment or as a stand-alone application, on an SVG-capable browser. The user can create or convert his datasets in a simple text editor, without the need for special tools. The user can also save his preferred startup and data settings with a simple text editor.

image_08.png

Using standard HTML, JavaScript and SVG has the advantage that we don't need to install or compile program sources and we don't need Java or special ActiveX components. The application is driven by just one HTML page with just one embedded SVG source. The user settings and all the datasets are stored as external JavaScript sources (.js).

8.1 Defined mouse events

With a click event on a bubble or report line, a JavaScript routine toggles a mark on the line and the corresponding bubble. By marking a bubble we can follow its behavior when the report date or the template is changed. An algorithm recognizes the bubble by its position in the list and its name, and accepts some variations in the spelling of that name. This makes it possible to use a follow mechanism when no unique key is available.

A mouse move event will trigger the translation of screen positions into report values.

With a mouse over event the name of the bubble is displayed and report details will be presented in a separate box.
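The follow mechanism for a marked bubble, which matches on list position and on a name with tolerated spelling variations, could be sketched as below. The actual SIAG algorithm is not given in this paper; the functions normalizeName and findMarked, and the rule that only case, spaces and punctuation are ignored, are assumptions for this illustration.

```javascript
// Illustrative sketch only; normalizeName and findMarked are hypothetical names.
// Assumed tolerance: differences in case, spaces and punctuation are ignored.
function normalizeName(name) {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// rows:   report lines of the newly loaded dataset, each with a name property
// marked: { index: position in the previous list, name: name in the previous list }
// Returns the index of the matching row, or -1 when nothing matches.
function findMarked(rows, marked) {
  var key = normalizeName(marked.name);
  // First choice: same list position with the same normalized name.
  var samePosition = rows[marked.index];
  if (samePosition && normalizeName(samePosition.name) === key) {
    return marked.index;
  }
  // Otherwise: the matching row closest to the old position.
  var best = -1;
  for (var i = 0; i < rows.length; i++) {
    if (normalizeName(rows[i].name) === key) {
      if (best === -1 ||
          Math.abs(i - marked.index) < Math.abs(best - marked.index)) {
        best = i;
      }
    }
  }
  return best;
}
```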

8.2 Keyboard shortcuts

Pressing the spacebar swaps the display between the graph and the report table. Pressing “g” toggles the grid on or off.

The selection of the scale, the reporting template and the change to another month can be done by using the pull down options at the top, by using the main keyboard or by using the numeric key pad. In this way the user has the choice to work by mouse or by key press, or just by using the numeric keys under his right hand.

Small user-friendly touches are also implemented, such as setting the focus on a selection option on mouse over, so that a selection takes one mouse click less. When the user is scrolling through the months, the year value in the option menu is synchronized automatically when a year end is passed. At startup the program calculates the most recent report to be expected at the known calendar date and synchronizes the values for month and year accordingly.
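The year synchronization and the startup calculation can be sketched with simple month arithmetic. The function names stepMonth and latestReport are hypothetical, and the rule that the most recent report is that of the previous calendar month is an assumption; the paper does not state when a monthly report becomes available.

```javascript
// Illustrative sketch only; function names and the "previous month" rule are assumptions.
// month is 1..12; delta is the number of months to step (may be negative).
function stepMonth(year, month, delta) {
  var index = year * 12 + (month - 1) + delta;
  return {
    year: Math.floor(index / 12),
    month: ((index % 12) + 12) % 12 + 1
  };
}

// Assumption: the report of a month is expected in the following month.
function latestReport(today) {
  return stepMonth(today.getFullYear(), today.getMonth() + 1, -1);
}
```

Stepping one month forward from December 2005 gives January 2006, so the year shown in the option menu can be taken directly from the result.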

9. Some technical details

JavaScript is used for the tasks described in the following subsections.

9.1 Reading and interpreting datasets

The dataset to be used is defined as an external JavaScript source (.js) in the HTML-page. The URL of this dataset is composed at loading time from the default settings defined in the same page, the settings of the user-defined initialization dataset, and the search attributes attached to the URL of the HTML-page. More details are given in section 10, Data sources, data storage and data access.

9.2 Creating Selection Options with dropdown menus

The application creates several selection options that are presented as dropdown menus. Selection options are available for: selecting the SVG graphics (3), setting the graph scale (4), selecting the reporting group (user defined), and selecting the year (3) and month (12 + 2) of the report. The selection options are written by JavaScript as inserted HTML code.

9.3 Writing the report table

The report table is inserted HTML-code, written by JavaScript after accessing and interpreting the data. Extra processing is done to set the color of the cells and for the makeup of the fields. A click event is defined for marking and tracking a selected data line and for marking the corresponding plot in the SVG-graph.

image_09.png

9.4 Creating the SVG-page

The data are also presented in an SVG-page as a modified bubble-chart, where the size of the bubble indicates the number of employees of a reporting group as reported on a data line. To prevent small bubbles from hiding behind larger ones, the dataset is sorted by a JavaScript routine in descending order of employee count, so that the largest bubbles are drawn first and the smaller ones are drawn on top of them.

The reference to the dataset, which is stored as a JavaScript source, is written as HTML-code by JavaScript. This is also important for the timing, because on load the SVG-page will start looking for this data in the HTML-page.

Next the text entries are created (createTextNode), the x and y axes are built, and the styles are set. The sorted table created in the HTML-page contains pointers to the table of data items in descending order of employee count. For every table entry a JavaScript routine AddChartValue(index) is called to create the bubble as a circle element, setting the attributes id and radius r, the position cx, cy, and the style attribute with stroke-width, fill-opacity, fill:url(#….) and stroke. Finally the events for click, mouse over and mouse out are set on the circle element, which is added to the chart with appendChild.
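The creation of the bubbles can be sketched with standard DOM calls. This is not the original AddChartValue routine: the signature below, the shape of the item objects and the helper drawBubbles are assumptions for this illustration. The position of an SVG circle is set with the attributes cx and cy, and elements that are appended later are painted on top of earlier ones, which is why the items are sorted from large to small.

```javascript
// Illustrative sketch only; the signatures and the item fields are assumptions.
var SVG_NS = "http://www.w3.org/2000/svg";

// doc:   the SVG document; chart: the element that receives the bubbles
// item:  { id, r, x, y, strokeWidth, opacity, fill, stroke }
function addChartValue(doc, chart, item, onClick) {
  var circle = doc.createElementNS(SVG_NS, "circle");
  circle.setAttribute("id", item.id);
  circle.setAttribute("r", item.r);
  circle.setAttribute("cx", item.x);
  circle.setAttribute("cy", item.y);
  circle.setAttribute("style",
    "stroke-width:" + item.strokeWidth +
    ";fill-opacity:" + item.opacity +
    ";fill:" + item.fill +
    ";stroke:" + item.stroke);
  circle.addEventListener("click", onClick, false);
  chart.appendChild(circle);
  return circle;
}

// Draw the largest bubbles first so that smaller ones end up on top.
function drawBubbles(doc, chart, items, onClick) {
  var sorted = items.slice().sort(function (a, b) { return b.r - a.r; });
  for (var i = 0; i < sorted.length; i++) {
    addChartValue(doc, chart, sorted[i], onClick);
  }
}
```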

image_10s.png

10. Data sources, data storage and data access

10.1 Initialization dataset

In the HTML page all the startup defaults are hardcoded, such as the date to start with, the kind of report and the naming conventions of the datasets. They are all defined as JavaScript variables. In a user setup file, defined as an external JavaScript source, the same variables can be overwritten by new settings defined by the user.

10.2 Monthly datasets

All the datasets are defined and stored as external JavaScript sources. The storage location and naming of the datasets can be defined by the user in an external user setup file.

The layout of the dataset is kept almost the same as the monthly reports (!) and could look like this:

varx = " "
+  "Apr 2005    AANTAL   AANTAL    MELDINGS   VERZUIM GEM.VER- VERZUIM %   "
+  "Example1      PERS.  MELDINGEN FREQUENTIE DAGEN   ZUIMDUUR ACT. 12 MND "
+  "Department1    365  37 1,88 298,8  7,11 2,8 3,9                        "
+  "Department2    448  56 1,66 456,7  9,66 3,5 4,7                        "
+  "Department3    113  11 1,39  71,0  7,18 2,1 3,0                        "
+  "TOTAL          925 104 1,74 826,5  8,33 3,1 4,2                        ";

In this way the user can use the old reports to fill the datasets almost entirely by "cut-and-paste". Keep in mind that the application is designed to be used completely at the client side, using existing datasets or reports.
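Reading such a dataset with a regular expression can be sketched as follows. The pattern and the field names are assumptions for this illustration and not the SIAG source. The sketch expects a name without spaces followed by two whole numbers and five figures with a decimal comma; header lines and damaged lines simply do not match and are skipped, which is one simple way to obtain an error-tolerant interface.

```javascript
// Illustrative sketch only; parseDataset and the field names are assumptions.
function parseDataset(text) {
  // name, employees, incidences, then five figures with a decimal comma
  var rowPattern =
    /([A-Za-z]\w*)\s+(\d+)\s+(\d+)\s+(\d+,\d+)\s+(\d+,\d+)\s+(\d+,\d+)\s+(\d+,\d+)\s+(\d+,\d+)/g;

  function num(s) {
    return parseFloat(s.replace(",", "."));  // decimal comma to decimal point
  }

  var rows = [];
  var m;
  while ((m = rowPattern.exec(text)) !== null) {
    rows.push({
      name: m[1],
      employees: parseInt(m[2], 10),
      incidences: parseInt(m[3], 10),
      frequency: num(m[4]),
      days: num(m[5]),
      length: num(m[6]),
      rateMonth: num(m[7]),
      rate12Months: num(m[8])
    });
  }
  return rows;
}
```

Applied to the example dataset above, this sketch returns four rows, the first being Department1 with 365 employees and 37 incidences.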

Because the JavaScript routines have to find their way to the data sources, there are some restrictions on the naming of the datasets. The naming convention of the dataset (example: "pCA0503.js") is:

- because it is a JavaScript source, the extension ".js" is mandatory;
- the last 4 positions of the dataset name must indicate year and month (yymm);
- these are preceded by 2 positions indicating a unique reporting group;
- these are preceded by whatever the user put in the location parameter in the user setup.

The location parameter pdat (data path) is a JavaScript variable and describes the path to the stored data.

Example of the data path:

pdat  = "../data/*/p*" ; // the * is a wildcard character and will be interpreted as reporting group.

In the example the program will look for the data "CA" at

href="../data/CA/pCA" + yy + mm+ ".js";

where yy is the reporting year and mm is the reporting month.
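Composing the dataset reference from the data path can be sketched as follows. The function names datasetUrl and parseDatasetName are hypothetical; only the variable pdat, the wildcard and the naming convention come from the description above.

```javascript
// Illustrative sketch only; datasetUrl and parseDatasetName are hypothetical names.
// pdat:  data path with * as wildcard for the reporting group
// group: two-position reporting group, yy and mm: reporting year and month
function datasetUrl(pdat, group, yy, mm) {
  function two(n) {
    n = String(n);
    return n.length < 2 ? "0" + n : n;  // pad to two positions
  }
  return pdat.replace(/\*/g, group) + two(yy) + two(mm) + ".js";
}

// Reverse direction: read group, year and month from a dataset name.
// Returns null when the name does not follow the naming convention.
function parseDatasetName(url) {
  var m = /(\w{2})(\d{2})(\d{2})\.js$/.exec(url);
  return m ? { group: m[1], yy: m[2], mm: m[3] } : null;
}
```

For example, datasetUrl("../data/*/p*", "CA", 5, 3) gives "../data/CA/pCA0503.js".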

10.3 Data access

All the selections that show up in the selection options can be defined or changed by the user by changing the content of the variable seloptions in the user setup.

Example of the content of the variable seloptions:

seloptions = ""
+ "[CA| Totals      ]"   
+ "[CB| Staf        ]"   
+ "[CC| Operations  ]"
+ "[CD| Development ]";

As a result a JavaScript routine will create a dropdown menu at startup with 4 options with the description "Totals" for the first option. Also it will pass the code "CA" for the reporting group, as part of the dataset name for the selection "Totals".
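Translating the content of seloptions into a dropdown menu can be sketched with one regular expression. The functions parseSelOptions and optionsToHtml and the exact HTML that is written are assumptions for this illustration; entries that do not follow the [code| description] layout are skipped.

```javascript
// Illustrative sketch only; function names and generated HTML are assumptions.
function parseSelOptions(seloptions) {
  // [code| description ] with optional spaces around both parts
  var pattern = /\[\s*(\w+)\s*\|\s*([^\]]*?)\s*\]/g;
  var options = [];
  var m;
  while ((m = pattern.exec(seloptions)) !== null) {
    options.push({ code: m[1], label: m[2] });
  }
  return options;
}

// Write the options as HTML code to be inserted in the page.
function optionsToHtml(options) {
  var html = "<select>";
  for (var i = 0; i < options.length; i++) {
    html += '<option value="' + options[i].code + '">' +
            options[i].label + "</option>";
  }
  return html + "</select>";
}
```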

10.4 Parsing parameters

The HTML-page is designed to accept and interpret parameters defined as search attributes attached to the URL. Every time a new dataset is needed, a JavaScript routine composes a new search attribute and changes the page by setting a new location.href. Modifying window.location.href results in a reload of the current page, but with the different settings or dataset defined in the search attribute of the URL.

The resulting url could look like:

siag.html?data=CA0507&scale=1&option=A&color=Y&graph=Y&table=N&focus=O&gridx=N

A JavaScript routine looks for the content of window.location.search. In this example the program looks for the dataset of reporting group "CA" for July 2005 and starts with template option "A" for the presentation in SVG. Colors are used in the tables (color=Y), the graph is shown at startup (graph=Y) and the report table is not (table=N). The focus is set on the first selection option (focus=0), and no grid is shown on the graph (gridx=N).
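Interpreting the search attribute can be sketched with a regular expression over the string that the browser provides in window.location.search. The function names parseSearch and parseDataParam are hypothetical, and reading the two-digit year as 20yy is an assumption for this illustration.

```javascript
// Illustrative sketch only; parseSearch and parseDataParam are hypothetical names.
// search: a string such as window.location.search, starting with "?"
function parseSearch(search) {
  var params = {};
  var pattern = /[?&]([^=&]+)=([^&]*)/g;
  var m;
  while ((m = pattern.exec(search)) !== null) {
    params[decodeURIComponent(m[1])] = decodeURIComponent(m[2]);
  }
  return params;
}

// Split a data parameter such as "CA0507" into reporting group, year and month.
// Assumption: the two-digit year is read as 20yy.
function parseDataParam(data) {
  var m = /^(\w{2})(\d{2})(\d{2})$/.exec(data);
  return m
    ? { group: m[1], year: 2000 + Number(m[2]), month: Number(m[3]) }
    : null;
}
```

A parameter that is missing from the result, or a data value for which parseDataParam returns null, can then fall back on the default settings described in section 10.1.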

11. Future development

The application SIAG (Sick leave InterActive Graphics), was designed using a structured set of flat datasets. Redesigning the application and using a central database would facilitate more powerful and companywide applications.

Because of the structure of the datasets, the application only presents the data of one month at a time. Being able to present and evaluate the complete history of reported figures would enrich the functionality.

12. Conclusion

The strength of the application described in this paper is its data-driven and JavaScript-driven design with a simple, error-tolerant data interface. It is a straightforward design with a collection of small routines that make the system act user-friendly. The web-based architecture makes it easy to introduce and to use in a low-budget intranet or Internet environment. It has proven to be fast and to perform well.

However, Scalable Vector Graphics is a new technique and therefore asks for new skills. In an organization where software is developed by a select group of contractors, it might be difficult to get enough support for “innovation out of standard”.

For the user it is a new way of presenting and interpreting reports, and resistance is often the answer to innovation.

The application is built with no other tools than an ordinary text editor. The installation also asks for no more than an SVG-capable browser. I call it low-budget development and low-budget implementation. But when someone is used to driving a Rolls Royce, suggesting a cheaper means of transport with more functionality may lead you onto a side-track. It is like adding a new dimension in a world that is supposed to be flat.
