Abstract
A native SVG-in-browser application is used to visualize enrollment data from the San Jose State University (SJSU) Spring 2009 class meeting schedule and building locations. SVG renders an SJSU campus map. Each building is fully or partially gradient-filled to show its percentage occupancy. Below the map, there is a slider bar with markings from 07:00 to 22:00 representing the class meeting time in hours. When the slider moves, the gradient fill is dynamically updated according to the calculated schedule data. A click event on a building shows more detailed information about that building. In this detail view, a low-opacity layer is displayed on top of the map; it contains a pie chart and textual data related to the building's occupancy.
As with PowerPoint or any other presentation slides, people usually prefer visualization over text. Faced with overloaded reports or web pages, users do not appreciate spending more than a few seconds to spot the data they need. Some may argue that the intended users should know and understand their own data. However, when there is too much data, users cannot examine and analyze every single bit and byte of it. Instead of spending time making sense of a huge amount of data, users can save time by starting from a high-level view. Furthermore, humans tend to make more sense of pictures and graphics than of textual information, so visualization is often the best way to capture the data as a whole. For instance, a school administration may want to plan the class schedule and building allocation for a semester based on the previous year's enrollment information. Proper building assignment, classroom allocation, and scheduling can reduce the risks posed by fire hazards or natural disasters, but gathering this information from raw data is tedious because of its sheer volume, even if the data is well formatted. To help with these mundane administrative tasks, the Maptistic application is proposed. Maptistic aims to ease the administrative workload and to help make classroom and building utilization more efficient.
The native SVG-in-browser application visualizes the enrollment data from the San Jose State University (SJSU) Spring 2009 course meeting schedule and building locations. SVG renders an SJSU campus map. Below the map, there is a slider bar with markings from 07:00 to 22:00 representing the class meeting time in 24-hour format. When the user clicks on the slider, each building is fully or partially gradient-filled to show its percentage occupancy at the time period marked on the slider. When the slider moves, the gradient fill is dynamically updated according to the calculated schedule data.
A click event on a building shows detailed information about that building. A detail view, which is a low-opacity layer, is displayed on top of the map; it contains a pie chart and summary information related to the building's occupancy.
The application also uses JavaScript to accomplish interactivity:
Hiding text element whenever the slider activates
Bringing up or dismissing the detail view
Sliding the time-slider
Updating building gradient-fill
Updating pie chart
Updating detailed-view information
Preprocessing - data acquisition
Post processing - building XML data files
Pre-Rendering SVG via JavaScript onload() function
Transforming Inkscape-generated SVG to render and support interactivity via JavaScript
JavaScript Event processing
XML data is assembled by “scraping” the official SJSU Spring 2009
schedule found at
http://info.sjsu.edu/web-dbgen/soc-spring-courses/all-departments.html.
The data was pulled and saved into files by a Java program. The
Java program consists of 3 files: Main.java, SimpleHTTPGet.java, and
SJSUScheduleHTTPGet.java, where SimpleHTTPGet.java and
SJSUScheduleHTTPGet.java are in one package called
edu.sjsu.netutils. The Java program uses Java's built-in DOM
library for traversing the DOM and the com.w3c.dom library for
pretty-printed output.
Main.java: program driver
SimpleHTTPGet.java: interface file, defines the methods used to pull data
SJSUScheduleHTTPGet.java: implementation of the interface
The main content of a course web page contains key fields such as Schedule, Title, GE
Designator, Footnotes, Section Code, Units, Type, Enrollment, Days,
Time, Dates, Location, and Instructor. Instead of scraping only selected
data, the system scrapes every field except GE Designator
and Footnotes, to avoid re-scraping when the application is expanded in the future.
Fig. 1 is a screenshot of the important content within a
course web page. The Java program implements a web crawler that
fetches HTML pages. Each page presents one course, so the crawler must
visit all of the departments' pages. Through regular expressions,
the program extracts only the essential information from each page to construct
a single XML <course> node element for each course offered in that particular department.
When the crawler finishes visiting all the links in one department,
it outputs a single XML file containing all the courses offered in that department.
The output file has the following XML format
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schedule term="Spring" year="09">
  <department name="COMPUTER SCIENCE">
    …....
    …....
    <course courseURLid="c1165521">
      <title>Programming in Java</title>
      <section>01</section>
      <code>27948</code>
      <units>3</units>
      <type>SEM</type>
      <enrollment>
        <registered>21</registered>
        <totalspace>30</totalspace>
      </enrollment>
      <days>TR</days>
      <time>
        <starttime>1330</starttime>
        <endtime>1445</endtime>
      </time>
      <location>
        <building>MH</building>
        <room>225</room>
      </location>
      <instructor>J Smith</instructor>
    </course>
    …....
  </department>
</schedule>
There were 130 departments in SJSU in Spring 2009, so the crawler created 130 individual XML files.
When the Java crawler finishes saving all 130 XML files, a Perl
script concatenates all the departments' XML data into one large XML file, which contains 6192 <course>
nodes, indicating that 6192 classes were offered in the Spring 2009 semester across all departments. The resulting XML file serves as the data source for
the statistical calculations triggered by user interactions.
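The concatenation step can be sketched as follows. This is only an illustration written in JavaScript, not the Perl script used in the project; the function name mergeDepartmentXml and its parameters are hypothetical, and it assumes every department file follows the format shown above, with its <department> elements wrapped in a single <schedule> element.

```javascript
// Hypothetical helper (not the original Perl script): merges the contents of
// several per-department XML files into one <schedule> document.
function mergeDepartmentXml(files, term, year) {
  // Assumption: <department> elements are never nested inside each other,
  // so a lazy match up to the first closing tag captures one whole block.
  const departmentBlock = /<department\b[\s\S]*?<\/department>/g;
  const blocks = [];
  for (const xml of files) {
    const found = xml.match(departmentBlock);
    if (found) {
      blocks.push(...found);
    }
  }
  // Emit one XML declaration and one <schedule> wrapper around all blocks.
  return [
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?>',
    '<schedule term="' + term + '" year="' + year + '">',
    ...blocks,
    "</schedule>",
  ].join("\n");
}
```

Counting the <course> nodes in the merged output is a quick sanity check that no department was dropped during the merge.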
During the onload()
event, the system loads the XML data file into a JavaScript XML DOM object and
performs data loading. The system accesses the DOM object and creates
an associative array of schedule data. The outer hash key is the class meeting hour, and the value is an associative array
of key/value pairs. The inner hash key is the building name abbreviation (e.g. DH, MH, etc.), and the value is a
BuildingCapacity object that holds the totals for that building. Each time the system sees a course with the same building name abbreviation and time, it
adds the class space to the BuildingCapacity object's “space” field and adds the number of registered seats to the “taken” field. The system
also creates a new element and inserts it into the hash if it cannot find a hash element for the building name.
Building capacity class contains 4 fields:
name = building name as string
space = total space available for all offered classes at a given time as integer
taken = number of registered seats as integer
count = number of classes in one building at a given time as integer
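//The following is the data structure of the BuildingCapacity object
function BuildingCapacity(name, space, taken){
    this.building_name = name;   //building name abbreviation
    this.total_space = space;    //total space of all classes at a given time
    this.registered = taken;     //number of registered seats
    this.count = 0;              //number of classes; starts at 0 so it can be incremented
}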
The following process performs the data structuring:
Use XPath to retrieve all
<course> nodes in the data file
Iterate through all nodes
    Get the <registered> node and assign it to a variable reg
    Get the <totalspace> node and assign it to a variable spa
    Get the <starttime> node and assign it to a variable tim
    Get the <building> node and assign it to a variable bld
    If reg, spa, tim, and bld are not null
        Parse the hour from <starttime> and assign it to variable key
        If Hash[key] doesn't exist, create a new array and assign it to the key index
        Assign Hash[key] to subArray
        Check whether subArray[bld] is empty
            True
                create a BuildingCapacity object with bld, spa, and reg as constructor parameters
            False
                add spa to subArray[bld]'s total_space field
                add reg to subArray[bld]'s registered field
        Add 1 to subArray[bld]'s count field
The following JavaScript code is the implementation of the above process:
//Use XPath to query all the <course> nodes in the XML data file
var course = xmldoc.evaluate("/schedule/department/course", xmldoc, null, XPathResult.ANY_TYPE, null);
var result = course.iterateNext();
var reg, spa, tim, bld, key, subArray;
//iterate through the nodes
while (result)
{
    try
    {
        //get the registered field from the course
        reg = result.getElementsByTagNameNS(null, "enrollment")[0].getElementsByTagNameNS(null, "registered")[0].firstChild.nodeValue;
        //get the space from the course
        spa = result.getElementsByTagNameNS(null, "enrollment")[0].getElementsByTagNameNS(null, "totalspace")[0].firstChild.nodeValue;
        //get the start time from the course
        tim = result.getElementsByTagNameNS(null, "time")[0].getElementsByTagNameNS(null, "starttime")[0].firstChild.nodeValue;
        //get the building name
        bld = result.getElementsByTagNameNS(null, "location")[0].getElementsByTagNameNS(null, "building")[0].firstChild.nodeValue;
        if (reg && spa && tim && bld)
        {
            //the hour part of the start time (e.g. "13" from "1330") is the outer key
            key = tim.substring(0, 2);
            if (!dataH[key])
                dataH[key] = new Array();
            subArray = dataH[key];
            if (!subArray[bld])
            {
                //create a new BuildingCapacity object if no element is associated with this index
                subArray[bld] = new BuildingCapacity(bld, parseInt(spa, 10), parseInt(reg, 10));
            }
            else
            {
                //update the total space and registered fields
                subArray[bld].total_space += parseInt(spa, 10);
                subArray[bld].registered += parseInt(reg, 10);
            }
            //count this class; a missing count is treated as 0 so the first increment is not NaN
            subArray[bld].count = (subArray[bld].count || 0) + 1;
        }
    } catch (e) {
        //ignore courses with missing fields; the lookups above throw on them
    }
    result = course.iterateNext();
}
The following is an example of the hash-of-hashes data structure:
[07] → [mh] → [mh, 40, 30]
       [sci] → [sci, 50, 49]
       [eng] → [eng, 120, 120]
       …. ….
[08] → [mh] → [….]
       [sci] → [….]
       [eng] → [….]
       …. ….
[09] → [mh] → [….]
       [sci] → [….]
       …. ….
[22] → [mh] → [….]
       [eng] → [….]
       …. ….
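To make the structure concrete without an XML DOM, the aggregation can be sketched on plain records. This is a minimal sketch, not the application's code: the record shape (building, starttime, totalspace, registered), the function name buildSchedule, and the sample values are assumptions for illustration, and count is initialised to 0 in the constructor here.

```javascript
// Container for one building's totals at one hour.
// Field names follow the BuildingCapacity constructor in the text;
// initialising count to 0 is an assumption of this sketch.
function BuildingCapacity(name, space, taken) {
  this.building_name = name;
  this.total_space = space;
  this.registered = taken;
  this.count = 0;
}

// Hypothetical helper: groups course records by start hour, then by building.
function buildSchedule(courses) {
  const dataH = {};
  for (const c of courses) {
    const key = c.starttime.substring(0, 2); // "1330" -> "13"
    if (!dataH[key]) {
      dataH[key] = {};
    }
    const sub = dataH[key];
    if (!sub[c.building]) {
      sub[c.building] = new BuildingCapacity(c.building, 0, 0);
    }
    sub[c.building].total_space += c.totalspace;
    sub[c.building].registered += c.registered;
    sub[c.building].count += 1;
  }
  return dataH;
}

// Made-up sample records, for illustration only.
const sample = [
  { building: "MH", starttime: "1330", totalspace: 30, registered: 21 },
  { building: "MH", starttime: "1300", totalspace: 40, registered: 35 },
  { building: "SCI", starttime: "0730", totalspace: 50, registered: 49 },
];
const schedule = buildSchedule(sample);
console.log(schedule["13"]["MH"].total_space); // 70
console.log(schedule["13"]["MH"].count);       // 2
```

Both MH classes start within the 13:00 hour, so they fall under the same outer key and their seats are summed into one BuildingCapacity entry.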
To produce the SVG campus map, Inkscape was used. I imported the PDF map file, which can be downloaded
from http://www.sjsu.edu/about_sjsu/docs/SJSU_campus_map.pdf,
then removed unimportant data and saved the edited content as a plain SVG file.
Inkscape produces SVG <path> elements to draw the building shapes. To make DOM manipulation easier,
I inserted <g> elements liberally in the .svg file to group buildings and text, and added a unique id to each <path> and <text> element.
Fig. 2 is the SVG-rendered image of MacQuarrie Hall on the SJSU campus, generated by Inkscape.
Now the buildings are grouped and identified with <g> tags and id attributes. Clicking on the slider applies gradient fills to the buildings that are in use for class meetings. When the mouse-up event fires, the gradient disappears. The gradient contains two colors, yellow and blue: yellow represents the vacancy and blue represents the occupancy of the building.
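The slider position has to be translated into one of the hour keys ("07" to "22") used by the schedule hash. The text does not describe how the application does this, so the following is only a sketch under assumptions: the function name sliderToHourKey, the pixel-based parameters, and the evenly spaced hour markings along the track are all hypothetical.

```javascript
// Hypothetical helper: maps a horizontal slider position to an hour key.
// x          : current pointer position in user units
// trackStart : x coordinate of the 07:00 marking (assumed)
// trackWidth : distance between the 07:00 and 22:00 markings (assumed)
function sliderToHourKey(x, trackStart, trackWidth) {
  const FIRST_HOUR = 7;
  const LAST_HOUR = 22;
  // Clamp the relative position to the range [0, 1].
  const t = Math.min(1, Math.max(0, (x - trackStart) / trackWidth));
  // Snap to the nearest hour marking.
  const hour = FIRST_HOUR + Math.round(t * (LAST_HOUR - FIRST_HOUR));
  // Two-digit key, matching the first two characters of <starttime>.
  return String(hour).padStart(2, "0");
}

console.log(sliderToHourKey(0, 0, 300));   // "07"
console.log(sliderToHourKey(300, 0, 300)); // "22"
```

Padding the hour to two digits keeps the key in the same form as the one derived from <starttime>, so "07" from the slider finds the entry built from "0730".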
The following gradient tag was used to produce the gradient fill of MacQuarrie Hall shown in Fig. 3 when the slider is clicked.
<linearGradient id="gradient_model" x1="0%" y1="0%" x2="0%" y2="100%" visibility="hidden">
  <stop offset="0%" stop-color="blue" />
  <stop offset="0%" stop-color="yellow" />
</linearGradient>
The JavaScript below updates the gradient fill when the slider moves:
….....
    for (i in class_bld)
    {
        name = class_bld[i];
        key = "gradient_" + name;
        element = svgDocument.getElementById(key.toLowerCase());
        if (element){
            if (array != null && array[name.toUpperCase()] != null){
                percentage = array[name.toUpperCase()].registered
                    / array[name.toUpperCase()].total_space;
                element.getElementsByTagName("stop")[0].setAttribute("offset", percentage);
            }
            else
            {
                element.getElementsByTagName("stop")[0].setAttribute("offset", 0);
            }
        }
    }
}
….....
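The offset written to the first <stop> is the ratio of registered seats to total space. The sketch below isolates that calculation and adds two guards that the listing above does not have: a zero or missing total_space yields 0, and the result is clamped to the range 0 to 1. The function name occupancyOffset is hypothetical and the guards are assumptions of this sketch, not documented behaviour of the application.

```javascript
// Hypothetical helper: turns a BuildingCapacity-like entry into a gradient
// stop offset between 0 (empty) and 1 (full).
function occupancyOffset(entry) {
  // No data for this building and hour, or no seats on offer: draw it empty.
  if (!entry || !(entry.total_space > 0)) {
    return 0;
  }
  const ratio = entry.registered / entry.total_space;
  // Clamp so an over-enrolled class cannot push the offset past 1.
  return Math.min(1, Math.max(0, ratio));
}

console.log(occupancyOffset({ registered: 30, total_space: 40 })); // 0.75
console.log(occupancyOffset(undefined));                           // 0
```

Because both stops start at offset 0 and SVG raises a later stop's offset to at least that of the stop before it, moving only the first (blue) stop produces a hard edge between the blue and yellow regions.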
In addition to the general view of all buildings' occupancy, users can also activate the detail-view mode for more information about a building by clicking on that building on the map. When the detail-view mode is activated, a low-opacity rectangle appears on top of the building. Inside the large rectangle is a pie chart presenting the occupancy distribution, as shown in Fig. 4. To the right of the pie chart, text shows the number of classes offered at the given time, the total space available, and the number of registered seats. The following SVG and JavaScript code is used to create and update the pie chart when the slider moves.
<g transform="translate(150,300) scale(2)" id="pieChart">
<circle id="circle" cx="0" cy="0" r="50" fill="orange" stroke="purple"/>
<path d="M0 0 H 50 A 50 50 0 0 1 0 50 Z" fill="red" stroke="blue" id="wedge"/>
</g>
//javascript to modify pie chart
….
x = 50 * Math.cos(percentage*2*Math.PI);
y = 50 * Math.sin(percentage*2*Math.PI);
z = (percentage > .5 && percentage < 1.0) ? 1: 0;
svgDocument.getElementById("wedge").setAttribute("d","M0 0 H 50 A 50 50 0 " + z + " 1 "+ x + " " + y + "Z");
…..
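The wedge arithmetic above can be packaged as a small function. This is a sketch rather than the application's code: the name wedgePath and the radius parameter are hypothetical, and the coordinates are rounded to two decimals for readability. The wedge starts on the positive x axis and sweeps by percentage × 2π; the large-arc flag must be 1 once the wedge exceeds half of the circle.

```javascript
// Hypothetical helper: builds the "d" attribute for a pie wedge centred at
// the origin, starting on the positive x axis (sweep flag 1, as in the text).
function wedgePath(percentage, radius) {
  const angle = percentage * 2 * Math.PI;
  const x = radius * Math.cos(angle);
  const y = radius * Math.sin(angle);
  // The large-arc flag selects the longer of the two possible arcs.
  const largeArc = percentage > 0.5 ? 1 : 0;
  return "M0 0 H " + radius +
    " A " + radius + " " + radius + " 0 " + largeArc + " 1 " +
    x.toFixed(2) + " " + y.toFixed(2) + " Z";
}

console.log(wedgePath(0.25, 50)); // M0 0 H 50 A 50 50 0 0 1 0.00 50.00 Z
```

A quarter wedge reproduces the static path in the markup above. At a percentage of exactly 1 the arc's end point coincides with its start point, and SVG omits such an arc, so a full building would need separate handling, for example by recoloring the circle itself.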
Determining the best way to visualize the data meaningfully was the most challenging issue faced. Many visualization techniques were considered, such as the following:
Bar chart over each building, but the bar charts are too small for users
Percentage number drawn on each building (e.g. 90%), but text is not a good visualization choice with so many buildings and small font sizes
Gradient fill, with an overlaid transparent pie chart
Building clip-path over 2 colors to achieve the same effect as the gradient fill
Some combination of the above
The fragmented code produced by the Inkscape PDF import presented coding challenges. Inkscape produces many <path> elements, and the buildings end up with different groupings and viewport transformations. Other issues included determining which data should be in the XML file and which XML schema would be most suitable for retrieving the data quickly with XPath or DOM.