Accessing SVG Content Linguistically and Conceptually

SVG content consists of both the vector graphic elements that appear explicitly in an SVG picture (file) and what the picture "is about", that is, its meaning or semantics. This paper examines both kinds of content and how computer programs recognize each. The former is explicit ('in the file'); the latter may be contained in Title or Description elements of the SVG file, or may be described in metadata in one way or another by RDF, XTM, or XGMML. It is also possible that what the picture is about (its semantics or meaning) is not explicitly described in the SVG file at all; 'external' metadata is examined in this paper as well.
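The distinction can be made concrete with a small sketch. The SVG fragment and element names below are illustrative only, not taken from the paper; the point is simply that a program can separate the drawing elements that are 'in the file' from the Title and Description text that says what the picture is about.

```python
# Minimal sketch (illustrative, not the paper's code): separate the explicit
# drawing elements of an SVG file from its Title/Description content, using
# only the Python standard library.
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
DESCRIPTIVE = {SVG_NS + "title", SVG_NS + "desc", SVG_NS + "metadata"}

svg_text = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <title>A red ball</title>
  <desc>A red ball resting on a table edge.</desc>
  <circle cx="50" cy="40" r="20" fill="red"/>
  <line x1="10" y1="60" x2="90" y2="60" stroke="black"/>
</svg>"""

root = ET.fromstring(svg_text)

# Explicit content: the vector graphic elements themselves.
explicit = [el.tag.replace(SVG_NS, "") for el in root if el.tag not in DESCRIPTIVE]

# Described content: what the author says the picture 'is about'.
title = root.findtext(SVG_NS + "title")
desc = root.findtext(SVG_NS + "desc")

print("explicit elements:", explicit)  # ['circle', 'line']
print("title:", title)                 # A red ball
print("description:", desc)            # A red ball resting on a table edge.
```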

The paper covers the programming which maps SVG 'code' to XGMML (the eXtensible Graph Markup and Modeling Language), a representation of the explicit SVG code elements in semantic network (graph) form. The paper also shows how XSLT can render the XGMML graph back into an SVG picture.
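A minimal sketch of that mapping is given below, assuming one XGMML node per SVG element, one att entry per SVG attribute, and 'contains' edges for the document structure; the paper's actual mapping may differ in its details.

```python
# Sketch: map SVG elements to an XGMML-style graph (one node per element,
# one <att> per attribute, 'contains' edges for containment structure).
# The mapping rules here are an assumption made for illustration.
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

def svg_to_xgmml(svg_root):
    graph = ET.Element("graph", {"label": "svg-picture", "directed": "1"})
    ids = {el: str(i) for i, el in enumerate(svg_root.iter())}
    for el, node_id in ids.items():
        node = ET.SubElement(graph, "node",
                             {"id": node_id, "label": el.tag.replace(SVG_NS, "")})
        for name, value in el.attrib.items():
            ET.SubElement(node, "att", {"name": name, "value": value})
    for parent in svg_root.iter():
        for child in parent:
            ET.SubElement(graph, "edge", {"source": ids[parent],
                                          "target": ids[child],
                                          "label": "contains"})
    return graph

svg = ET.fromstring('<svg xmlns="http://www.w3.org/2000/svg">'
                    '<circle cx="50" cy="40" r="20" fill="red"/></svg>')
print(ET.tostring(svg_to_xgmml(svg), encoding="unicode"))
```

An XSLT stylesheet applied in the other direction can render such a graph as an SVG picture, for example by drawing each node as a shape and each edge as a connecting line.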

Linguistic accessibility of SVG picture information is demonstrated by programs which take English sentences as input and output an XGMML representation of those sentences. Standard graph matching can then determine whether the visual items described in a sentence are present in a given SVG file or picture. In this way a collection of SVG pictures can be searched by content, using an English sentence to describe that content; the search covers all SVG elements, not just Title and Description elements.
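The search step can be sketched as follows; the matching here is a deliberately simplified stand-in (label and relation containment) for the standard graph matching the paper refers to, and the node and relation names are illustrative.

```python
# Sketch: decide whether the graph derived from an English sentence is
# 'present in' the graph derived from an SVG picture. A simple containment
# test stands in for full subgraph matching.
def query_matches_picture(query_nodes, query_edges, pic_nodes, pic_edges):
    return (set(query_nodes) <= set(pic_nodes)
            and set(query_edges) <= set(pic_edges))

# "A red circle above a black line."  ->  query graph (from the parser)
query_nodes = {"circle", "line"}
query_edges = {("circle", "above", "line")}

# Graph extracted from one SVG file in the collection
pic_nodes = {"circle", "line", "rect"}
pic_edges = {("circle", "above", "line"), ("rect", "right-of", "line")}

print(query_matches_picture(query_nodes, query_edges, pic_nodes, pic_edges))  # True
```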

Linguistic accessibility is further demonstrated by using English sentences to describe higher-level graphical or visual constructs. LR grammars are discussed as picture grammars. In one example a 'happy face' is described in English and parsed into XGMML, which is then used to locate an SVG file containing only the constituent SVG elements of that face. The happy face does not explicitly exist anywhere in the file; it is a 'perception' comprising a collection of particular visual elements in a particular configuration. Such a perception is an INFERENCE. The paper shows how this is done for a 'face' and for a 'business graph'.
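The inference step can be sketched as a rule over constituent elements; the geometric predicates and thresholds below are invented for illustration and are not the paper's actual rules.

```python
# Sketch: 'perceive' a happy face as an inference over constituent elements:
# a large circle containing two small circles in its upper half and an
# upward-opening arc in its lower half (SVG y-coordinates grow downward).
def inside(inner, outer):
    """Rough containment test for circles given as (cx, cy, r)."""
    (x1, y1, r1), (x2, y2, r2) = inner, outer
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 + r1 <= r2

def is_happy_face(circles, arcs):
    """circles: list of (cx, cy, r); arcs: list of (cx, cy, opens_upward)."""
    for head in circles:
        eyes = [c for c in circles
                if c is not head and inside(c, head) and c[1] < head[1]]
        mouths = [a for a in arcs
                  if a[2] and inside((a[0], a[1], 0), head) and a[1] > head[1]]
        if len(eyes) >= 2 and mouths:
            return True
    return False

circles = [(50, 50, 40), (35, 40, 5), (65, 40, 5)]  # head and two eyes
arcs = [(50, 65, True)]                              # smiling mouth
print(is_happy_face(circles, arcs))                  # True
```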

W3C linearization is briefly covered, and a means of automating the 'discovery' of its (RDF) predicates in arbitrary SVG files is shown. The RDF produced by linearization is shown translated into English sentences by programs which are discussed.
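A sketch of the triple-to-sentence step is given below; the triples and sentence templates are illustrative and do not reproduce the paper's programs.

```python
# Sketch: verbalize linearization-style RDF triples with simple English
# sentence templates (illustrative vocabulary).
TEMPLATES = {
    "contains":  "The {s} contains a {o}.",
    "above":     "The {s} is above the {o}.",
    "hasColour": "The {s} is {o}.",
}

def triples_to_english(triples):
    return [TEMPLATES.get(p, "The {s} {p} the {o}.").format(s=s, p=p, o=o)
            for s, p, o in triples]

triples = [("picture", "contains", "circle"),
           ("circle", "hasColour", "red"),
           ("circle", "above", "line")]
for sentence in triples_to_english(triples):
    print(sentence)
# The picture contains a circle.
# The circle is red.
# The circle is above the line.
```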

XGMML graphs are shown producing English which describes the 'perceptual' content of an SVG file; a 'business graph' and an 'electronic circuit schematic' are given as examples. An SVG animation is used to show how the conceptualization of 'motion' and other visual changes can be captured in XGMML and output as English sentences. Implicit or tacit visual content of SVG pictures is thus captured programmatically and stored as an XGMML representation, which can be processed to produce English sentence output.
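How a captured change might be verbalized can be sketched briefly; the attribute names and the sentence form below are assumptions made for illustration.

```python
# Sketch: produce an English sentence from an XGMML-style record of an
# animated shape's change of position over time (illustrative attribute names).
def describe_motion(node):
    (x1, y1), (x2, y2) = node["from"], node["to"]
    direction = "to the right" if x2 > x1 else "to the left"
    return (f"The {node['label']} moves {direction}, "
            f"from ({x1}, {y1}) to ({x2}, {y2}), over {node['dur']} seconds.")

anim = {"label": "red circle", "from": (10, 40), "to": (90, 40), "dur": 3}
print(describe_motion(anim))
# The red circle moves to the right, from (10, 40) to (90, 40), over 3 seconds.
```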

Text-to-speech voice synthesis is briefly covered as a means of outputting the same text in spoken form, adding a further dimension of accessibility. Dragon Systems speech-input software is discussed as a complementary means of accessible text input to the above systems.

A knowledge base built with frames technology, represented via XGMML graph structures, is used for several semantic processing functions. These frames describe all the parts of each SVG element in the specification (as they appear in this paper). For example, a line is known to have certain attributes, such as two end points, a thickness, and a stroke colour, along with other information such as whether each attribute must be specified or is optional. Among the semantic tasks performed with these frames are 'good completion' (a term from perceptual psychology) and 'correctness' checking. The paper discusses these frames and how they are used by programs in the generation of English sentences about SVG files.
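A frame and the two checks can be sketched as follows; the slot names, which attributes are treated as required, and the defaulting rule used for 'good completion' are illustrative assumptions, not the paper's actual frame contents.

```python
# Sketch: a frame for the SVG 'line' element, a 'correctness' check for
# missing required attributes, and 'good completion' of optional ones
# (illustrative slots and defaults).
LINE_FRAME = {
    "x1": {"required": True},
    "y1": {"required": True},
    "x2": {"required": True},
    "y2": {"required": True},
    "stroke":       {"required": False, "default": "black"},
    "stroke-width": {"required": False, "default": "1"},
}

def check_and_complete(attrs, frame):
    """Return (missing required slots, attributes after filling defaults)."""
    missing = [slot for slot, spec in frame.items()
               if spec["required"] and slot not in attrs]
    completed = dict(attrs)
    for slot, spec in frame.items():
        if not spec["required"] and slot not in completed:
            completed[slot] = spec["default"]
    return missing, completed

missing, completed = check_and_complete({"x1": "10", "y1": "60", "x2": "90"},
                                        LINE_FRAME)
print("missing required attributes:", missing)   # ['y2']
print("after good completion:", completed)
```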

Presented by David Dodds, one of the original authors of SVG 1.0.