André M. Winter
Institut für Geographie
Austria - Europe
institute webpage: http://geowww.uibk.ac.at/
project webpage: http://tirolatlas.uibk.ac.at/
Keywords: Atlas; Web-mapping; database driven; Tyrol
The Department of Geography at the University of Innsbruck created an extensive regional atlas with about 100 sheets and 200 maps in the period from 1969 to 1999. Work on a new, up-to-date version continues within the framework of a European Union InterReg-IIIA project in which the federal state of Tyrol (Austria) and the province of South Tyrol (Italy) take part. The read-only medium CD-ROM is skipped in order to focus exclusively on a database-driven Internet variant. Printed papers on separate in-depth subjects will be derived and published. The project runs from 2002 to 2007, and the first online version will be available in November 2002.
This project combines technical, statistical, political and linguistic challenges, so prior to focusing on technical aspects we would like to present some of these as well.
The covered region overlaps four countries and a historically problematic frontier between Italy and Austria that split Tyrol into three parts after World War I. In the European context communication has become easier, and the existence of this project shows the declared intention to make borders less impermeable. As a region never ends at a borderline, the "Tirol Atlas" will have to look across the borders and thus cover parts of the neighbouring provinces of Austria (Kärnten, Salzburg, Vorarlberg), Italy (Alto Adige, Belluno, Brescia, Pordenone, Sondrio, Trento, Udine), Germany (Oberbayern, Schwaben) and Switzerland (Graubünden).
From a linguistic point of view, two national languages (German and Italian) are spoken in the covered region, and since we are dealing with an Internet atlas viewed around the world (at least we hope so), we will add English support as well. Although (carto)graphic aspects come to the fore, there will always be text to display in one of these three languages, which causes difficulties with the translation of technical terms and, in a graphical context, with the varying text length of a single notion.
Although we get a good deal of support from the responsible statistical offices and GIS departments, it is still difficult to find a proper way to manage this data. Every dataset has its own specific national definition, census dates differ, map projections do not fit together, and in some cases it is not even possible to obtain the needed data for the whole area. In addition, handling data from a wide range of sources requires strong efforts in defining and storing metadata, so that we can track down where data came from, what it contains, how it was interpreted and where it resides in our own structure.
Although the "Tirol Atlas" is mainly a geographic scientific project, most of the working processes are definitely technically driven. They can be split up into the following areas:
Bringing an atlas to the Internet requires more than displaying pictures. Maps need a vector representation, and to perform the tasks described later (and more), this format must be easy to handle, meet all graphical requirements (almost all DTP features) and allow interactivity and animation to be added to all these elements. At present Flash and SVG fulfil these requirements, and the format discussion is already documented in "Comparing SWF and SVG file format specifications" and "Standards im Internet" (German).
The thematic "Tirol Atlas" DB stores harmonised datasets for the whole area. Its primary keys are municipality IDs, and due to the number of records expected, the data is split into thematic tables such as "population", "economy", "tourism", "traffic", etc. Both primary (absolute) and computed (secondary) datasets are stored to speed up the delivery of requests. As many topics do not stick to municipality boundaries, these will have to be integrated as well (climate, geology).
In front of the "Tirol Atlas" DB there is a so-called "scratch DB" that holds most original statistical sheets as we receive them from the public authorities. Due to the differences within the original datasets (year of publication, census methods, covered areas, timeline intervals), there is no automated way to transfer columns from the scratch DB to the "Tirol Atlas" DB. This manual step also helps to fix various errors (typos, missing values, etc.).
In order to generate maps from the "Tirol Atlas" DB, a module is needed that references datasets from the "Tirol Atlas" DB and adds information about how this data should be displayed. This "cartographic module" holds information about the available themes, their internal order and the maps belonging to them. Each map is defined here with its data (links to rows in the "Tirol Atlas" DB), threshold levels, colour ranges and the needed base geometry. The information managed here may be extended to other areas such as internal linking, animation, blending themes and so on. Although it is clear what this module will have to do, its technical definition has not started yet.
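To make the idea more concrete, a single map definition in such a module might look roughly like the following Python sketch. Since the module has not been specified yet, every field name, threshold and colour here is an illustrative assumption, not the project's design:

```python
# Hypothetical record for one map in the planned "cartographic module".
map_def = {
    "theme": "population",                       # thematic table in the Tirol Atlas DB
    "title": {"de": "Bevölkerungsdichte",        # one title per supported language
              "it": "Densità di popolazione",
              "en": "Population density"},
    "data": {"table": "population",              # link to rows in the Tirol Atlas DB
             "column": "density_2001"},
    "thresholds": [50, 100, 250, 500],           # class boundaries for the choropleth
    "colours": ["#ffffcc", "#c2e699", "#78c679", "#31a354", "#006837"],
    "base_geometry": "municipalities",           # base layer delivered as SVG
}

def classify(value, thresholds):
    """Return the index of the colour class a data value falls into."""
    for i, limit in enumerate(thresholds):
        if value < limit:
            return i
    return len(thresholds)
```

A small classifier like `classify` would then map each municipality's value to one of the colour classes when the map is rendered.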
Because we cover too wide an area at too large a scale to deliver everything at once, it will be necessary to manage sequential data loading to the client (a kind of session management). Further premises for the geometric DB are that the geometry can be accessed and delivered as geographic data and translated into SVG, and that clipping, blending and other GIS operations are practicable within the DB. To keep things simple, the geometric and the thematic DB shall be the same. And last but not least, it shall be Open Source.
Looking around for a suitable DB, there are two areas of possibilities: GIS-based systems and DB-based systems. The former handle most of the geometric demands, but generally do so in their own internal format and export only to predefined formats. The biggest drawbacks here are slowness (or gigantic server requirements) and inflexibility, specifically concerning multi-way server-client communication. DB-based systems generally cannot perform geometric operations; to our current knowledge only two can: Oracle with its Spatial extension and PostgreSQL with the PostGIS extension. Due to time and money restrictions we could not start testing Oracle Spatial; its commercial character and, finally, its price were the excluding shortcomings.
PostGIS adds support for geographic objects to the PostgreSQL database. It follows the "Simple Features Specification for SQL" defined by the OpenGIS Consortium (OGC) and allows storing, indexing and querying geometric features such as points, lines, polygons, multipoints, multilines, multipolygons and geometry collections in 2D as well as 3D coordinate space. Although you can store SVG path elements in MySQL as well, there is no way to perform spatial queries on those columns. PostgreSQL in combination with PostGIS not only allows queries like "find overlapping elements" or "get elements within a distance of", but also provides functions to compute area, perimeter, length, distances, translations and transformations from one coordinate system to another.
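As an illustration (not taken from the project), the two queries named above could be written as SQL strings like the ones these helper functions build. The `ST_`-prefixed function names follow the OGC convention used by current PostGIS releases; the table and geometry column names are made up:

```python
def overlapping_sql(table, wkt):
    """Build a 'find overlapping elements' query string (illustrative)."""
    return (f"SELECT id FROM {table} "
            f"WHERE ST_Intersects(geom, ST_GeomFromText('{wkt}'))")

def within_distance_sql(table, wkt, dist):
    """Build a 'get elements within a distance of' query string (illustrative)."""
    return (f"SELECT id FROM {table} "
            f"WHERE ST_DWithin(geom, ST_GeomFromText('{wkt}'), {dist})")
```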
Most of the provincial governments in our area run ESRI-based GIS systems, so PostGIS' ability to import and export "shape" files directly to and from the database is quite important. Joining geometry with statistical data "spatially enables" the database and opens the field for many applications. To produce SVG content from the database we use Perl's DBI modules to query the database and a simple Perl module that translates the "Well-Known Text representation", like POLYGON(0 0,0 1,1 1,1 0,0 0), LINESTRING(0 0,1 1,1 2) or POINT(0 0), into the appropriate SVG elements. During this process you can add IDs, styles and event handlers to your SVG elements, or even modify the geometry by clipping line output at your viewbox or converting absolute to relative coordinates - the latter being crucial for our application because we deal with coordinates that require up to seven digits per single value (twice that per point).
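The translation step can be sketched in a few lines. The following function is not the project's Perl module, only a minimal Python reimplementation of the same idea for simple POINT, LINESTRING and single-ring POLYGON geometries (it ignores, for instance, the y-axis flip usually needed between geographic and SVG coordinates):

```python
import re

def wkt_to_svg(wkt, attrs=""):
    """Translate a Well-Known Text geometry into an SVG element string.

    Minimal sketch: POINT becomes a small circle, LINESTRING a path,
    single-ring POLYGON a closed path.  `attrs` takes extra attributes,
    e.g. ' id="mun123"', so IDs and styles can be injected on the fly.
    """
    kind, body = re.match(r"(\w+)\s*\((.*)\)", wkt.strip()).groups()
    # strip a possible extra ring parenthesis, then split into x/y pairs
    pairs = [p.split() for p in body.strip("()").split(",")]
    if kind == "POINT":
        x, y = pairs[0]
        return f'<circle cx="{x}" cy="{y}" r="1"{attrs}/>'
    d = "M" + " L".join(f"{x} {y}" for x, y in pairs)
    if kind == "POLYGON":
        d += " Z"   # close the ring
    return f'<path d="{d}"{attrs}/>'
```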
As our geometry does not change on a daily basis, performing all these calculations on every single request is quite an overhead, so we consider storing the relative SVG path elements in a parallel column, performing queries on the actual geometry and delivering the "precompiled" SVG path instead. Once it is possible to handle splines (Bézier curves), this system should work for them as well. First attempts to create splines from lines automatically showed that the major task is deciding which and how many subsequent vertices are needed to build a specific spline. Although we found a way to create splines from lines using a fixed number of vertices at a time, the results were sometimes unpredictable - including endless loops in some situations. Nevertheless, this seems to be more a question of mathematical competence, and we are confident this riddle will be solved soon.
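The absolute-to-relative conversion mentioned above can be illustrated with a short sketch: after the first point, every vertex is emitted as a delta from its predecessor, which stays short even when the absolute coordinates run to seven digits. This is a toy version for illustration, not the production code:

```python
def relative_path(points):
    """Convert a list of absolute (x, y) vertices to relative SVG path data.

    The first point is written with 'm' (treated as absolute at the start
    of a path); all following vertices become short 'l' deltas.
    """
    (x0, y0), rest = points[0], points[1:]
    parts = [f"m{x0} {y0}"]
    px, py = x0, y0
    for x, y in rest:
        parts.append(f"l{x - px} {y - py}")   # delta to the previous vertex
        px, py = x, y
    return " ".join(parts)
```

With projected coordinates of the size mentioned in the text, each delta typically needs one or two digits instead of seven, which is where the size saving comes from.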
The covered area is mapped at a scale of about 1:150.000, and part of the data is as detailed as 1:50.000, so starting with a view of 400*300 km on a 17-inch screen results in a scale of around 1:2 million. It is therefore impossible to display the 1:150.000 data at once without generalising and reducing its size. In raster-based map display you do not have to care about download sizes: a map of 800*600 pixels will always have about the same file size, no matter whether the server (with 1 GB RAM) renders one million lines into it or not. As soon as you deliver vector data, you have to define at which levels a specific geometry is visible. When zooming in (reaching a larger scale), a reload of geometry is needed to gain more detail; the same happens when panning in "zoomed" mode. Conversely, you have to remove elements as soon as a user zooms out and leaves the detailed level; at this point you must drop the existing geometry and replace it with better suited geometry to avoid RAM overload.
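The back-of-the-envelope scale computation above can be reproduced, and a detail-level lookup built on top of it. The resolution assumption (96 dpi) and the thresholds are invented for this sketch and are not the project's actual zoom intervals:

```python
def map_scale(ground_width_m, screen_width_px, dpi=96):
    """Approximate scale denominator: ground width over physical screen width.

    Assumes a 96 dpi display; 0.0254 converts inches to metres.
    """
    screen_width_m = screen_width_px / dpi * 0.0254
    return ground_width_m / screen_width_m

def detail_level(scale_denom,
                 levels=((500_000, "1:50k data"),
                         (2_500_000, "1:150k data"))):
    """Pick the most detailed geometry set suited to the current scale.

    `levels` pairs an upper scale limit with a geometry set; thresholds
    here are illustrative only.
    """
    for limit, name in levels:
        if scale_denom <= limit:
            return name
    return "generalised overview"
```

For a 400 km wide view on an 800 px viewport this yields a denominator of roughly 1.9 million, matching the "around 1:2 million" figure in the text.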
The theory of these zoom levels may be recorded as zoom intervals with area restrictions, but the technique of updating vector information inside a loaded SVG stays the same. Tests in a panning context, with tiles of topographic information loaded as the user pans around, showed that keeping track of which elements are loaded and which should be removed is the major task to accomplish. Loading geometry as tiles raises the question of how to cope with polygons that span two tiles: you never know from which direction the user reaches a specific tile, so polygons that touch two tiles have to be present in both.
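The bookkeeping task described here amounts to a set difference between the tiles a new viewport needs and the tiles already loaded. A minimal sketch, assuming square tiles addressed by (column, row) indices in map units (tile size and signatures are invented for illustration):

```python
def visible_tiles(vx, vy, vw, vh, tile=100):
    """Indices of all tiles intersecting the viewport (vx, vy, vw, vh)."""
    return {(c, r)
            for c in range(int(vx // tile), int((vx + vw) // tile) + 1)
            for r in range(int(vy // tile), int((vy + vh) // tile) + 1)}

def update_tiles(loaded, vx, vy, vw, vh, tile=100):
    """Return (tiles to load, tiles to drop) after a pan or zoom.

    `loaded` is the set of tile indices currently present in the SVG.
    """
    needed = visible_tiles(vx, vy, vw, vh, tile)
    return needed - loaded, loaded - needed
```

The polygon-duplication problem remains on top of this: a polygon touching two tiles appears in both, so the per-tile sets alone do not answer whether a given element is already in the document.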
Using Adobe's getURL implementation to load tiles dynamically according to the current viewport position led to problems with duplicated IDs. There is no way to remove elements from the returned document fragment before appending it to the main SVG; once it is appended, duplicate IDs may exist. So how do you decide whether a polygon for a certain viewport has already been loaded? How do you maintain the stack of your elements? Checking all duplicates on the client side during this process seems unreasonable, as does doing it with server-side session management. There will be separate handling for passive elements (where duplicates do not matter) and specific handling for elements involved in interactivity. An unsolved, somewhat philosophical question remains: at what point do you decide that an element is no longer needed?
There is no established literature or experience on how to design a map interface for the screen, because most previous attempts were based on interim visualisation technologies. We have defined some premises to keep the system as open as possible for improvements.
There are no working examples available at the moment of publication, but you may have a look at the project's homepage at http://tirolatlas.uibk.ac.at/ !