Clinical Data to SVG scientific graphs

Study analysis using XSL, SVG and XML

Keywords: pharmacokinetic SVG graph, clinical study analysis, XSLT extensions, Operational Data Model

Johanne Jean-Baptiste, M.Sc.
Statistics Canada
Ottawa
Ontario
Canada
johanne.jean-baptiste@statcan.ca

Biography

Johanne Jean-Baptiste studied health sciences and worked for several years in the pharmaceutical industry. One of her mandates was to manage submissions of clinical study reports and data to regulatory agencies for a contract research organization. Johanne has broadened her knowledge base by pursuing studies in information systems and is currently a computer scientist at Statistics Canada. She is interested in the application of XML technologies in science.


Abstract


This presentation summarizes an exploration into the use of XSL transformations and SVG applied to the domain of clinical study analysis. Clinical study data is routinely tabulated and graphed for analytical purposes and regulatory submissions. Using XSLT, tables and graphs of study results are generated from an XML instance. SVG graphs of subjects' drug concentration values are produced and embedded into a web page.

To simulate the impact of different results on the pharmacokinetic profile, it is possible to change the data in the tables and update the SVG graph interactively. Basic pharmacokinetic parameters are also calculated from the XML instance with XSLT extensions and Javascript. Other study data, such as adverse events, are presented in summary form. A study report in PDF format is also produced from the same XML instance using XSL-FO.


Table of Contents


1. Clinical trial data
     1.1 Data formats
     1.2 The Operational Data Model's XML schema
2. Clinical study analysis with ODM, XSLT and SVG
     2.1 XPath to subject data
     2.2 SVG graph: the pharmacokinetic profile
         2.2.1 Interacting with the SVG graph
     2.3 XSLT extensions: pharmacokinetic calculations
     2.4 Adverse events table
     2.5 Data simulation
3. Clinical study report
4. Conclusion
Footnotes
Acknowledgements
Bibliography

1. Clinical trial data

Clinical trials are studies of the outcome of medical treatments on selected patients or subjects. Typically the treatments include the administration of a therapeutic drug. The purpose of clinical trials is to document a treatment's safety, efficacy, or pharmacological and systemic actions. Precise and detailed documentation is crucial for the proper analysis and archival of study data. Therefore, clinical studies often produce a large quantity of technical documentation.

1.1 Data formats

The information on how to conduct a clinical study (its design) and the data gathered from its execution are saved in many different formats: text documents, databases, spreadsheets, plain text files with various layouts, and other proprietary formats. These formats are spread across clinical data management systems, LIMS (Laboratory Information Management Systems), and statistical and other data analysis software packages.

With this abundance of data formats and systems, a lot of effort (read: time and money) is expended on taking data from one format to another in order to analyze it and to exchange information between individuals, companies and the regulatory agencies responsible for the approval of treatments. To facilitate the exchange of clinical data, the CDISC (Clinical Data Interchange Standards Consortium) has been working on a schema that specifies the structure of clinical data in an XML (Extensible Markup Language) format.

1.2 The Operational Data Model's XML schema

The ODM (Operational Data Model) is the proposed CDISC schema for clinical data. According to CDISC:

"The Operational Data Model (ODM) is a vendor neutral, platform independent format for interchange and archive of data collected in clinical trials. The model represents study metadata, data and administrative data associated with a clinical trial."

The ODM's administrative data consists of information about the individuals and organizations conducting the study. The metadata represents general information about a study's design.

2. Clinical study analysis with ODM, XSLT and SVG

This research explored the possibility of using an ODM instance to fulfill common clinical trial data analysis tasks using XML technologies. A simplified version of a clinical trial was produced in ODM format. This study compares blood concentrations of two different formulations of a drug in a small sample of subjects.

The requirements for the transformation of the ODM instance were as follows:

  1. The production of SVG (Scalable Vector Graphics) graphs of drug concentrations over time.
  2. The production of summary tables of subject demographics, laboratory and physical examination findings, and adverse events data.
  3. The calculation and tabulation of pharmacokinetic parameters based on the drug concentration data.
  4. The capacity to perform simulations by updating the tables and graphs interactively from user changes to drug concentrations and parameters.
  5. The production of a report document including text, tables and graphs.

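The tools used, as described in the sections below, included the Xalan XSLT processor with its extension library, a custom Java extension class, Javascript for client-side interaction, and Apache's FOP processor for PDF output.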

2.1 XPath to subject data

According to the ODM schema, all the data collected on subjects during a study is contained within the SubjectData elements. Different types of data (also called domains) are organized into sections called forms, which are the equivalents of the CRFs (Case Report Forms). Each form is further divided into data item elements. The ODM gives users the flexibility to determine, in the metadata section, which forms will be used to capture study data and which items will be present within each form. For example, the demographic data collected can be grouped as described below. Note that the StudyEventOID, FormOID and ItemOID attributes refer to the definitions of the form and items in the metadata section. A detailed discussion of the ODM schema is available at www.cdisc.org.

   
    <SubjectData SubjectKey="1" >

        <StudyEventData StudyEventOID="event.demog" >
            <FormData FormOID="form.demog">
                <ItemGroupData ItemGroupOID="group.demog">
                    <ItemData ItemOID="item.demog.subjid"   Value="JK01" />
                    <ItemData ItemOID="item.demog.gender"   Value="F" />
                    <ItemData ItemOID="item.demog.age"      Value="30" />
                    <ItemData ItemOID="item.demog.race"     Value="Asian" />
                </ItemGroupData>
            </FormData>
        </StudyEventData>
        
        <!-- other study event data -->
        
    </SubjectData>

In the stylesheet, the nodeset of a form can be saved into a variable and then processed into the chosen output. Selecting the demography nodeset from the above example can be done with XPath (XML Path Language) as follows.

   
   <xsl:variable name="demog" 
    select="//odm:SubjectData[@SubjectKey=$subject]//odm:FormData[contains(@FormOID,'.demog')]" />

Then, the nodeset is transformed into table cells.

   
    <tr class="demogRow">
        <td><xsl:value-of select="$demog//odm:ItemData[@ItemOID='item.demog.gender']/@Value" /></td>
        <td><xsl:value-of select="$demog//odm:ItemData[@ItemOID='item.demog.race']/@Value" /></td>
        <td><xsl:value-of select="$demog//odm:ItemData[@ItemOID='item.demog.age']/@Value" /></td>
    </tr>

Laboratory and physical examination findings were summarized in the same way.

2.2 SVG graph: the pharmacokinetic profile

The XSLT stylesheet contains parameters to output either an XHTML page or an SVG graph. The generated graph is essentially a shell into which the drug concentration data (pharmacokinetic profile) of each subject is drawn dynamically upon user selection. For the report, however, full SVG graphs are written, as described in Section 3.

The web page contains an embed element pointing to the SVG graph. In that way, the SVG graph can be viewed from the page and its DOM (Document Object Model) can be manipulated with Javascript.

The SVG file contains a placeholder <g> element for the graph.

    <g id="chart"  style="filter:url(#DropShadowFilter)">
      <g id="canvas" >
        <use xlink:href="#pAxes" style="stroke:black; stroke-width:1;" />
        <rect x="0" y="0" width="98%" height="98%" style="fill-opacity:0.1; fill: rgb(135, 206, 250); " />
          
           <!-- Graph Transform Log / Linear choice -->
           
           <!-- Legend -->
           
      </g>
    </g>           

When a subject is selected on the web page, the pharmacokinetic profile is generated from the subject's drug concentration form: Javascript adds circles (the data points) and paths (the lines that link the data points) to the SVG graph's canvas element. The ODM instance is also embedded in the web page, which allows access to its DOM. The drug concentration data from the ODM instance is transformed from this

<ItemGroupData ItemGroupOID="group.drug.conc" ItemGroupRepeatKey="A">
    <ItemData ItemOID="T1"  Value="0.00" />
    <ItemData ItemOID="T2"  Value="4.40" />
    <ItemData ItemOID="T3"  Value="6.90" />
    <ItemData ItemOID="T4"  Value="8.20" />
    <ItemData ItemOID="T5"  Value="7.80" />
    <ItemData ItemOID="T6"  Value="7.50" />
    <ItemData ItemOID="T7"  Value="6.20" />
    <ItemData ItemOID="T8"  Value="5.30" />
    <ItemData ItemOID="T9"  Value="4.90" />
    <ItemData ItemOID="T10" Value="3.70" />
    <ItemData ItemOID="T11" Value="1.05" />
</ItemGroupData>

 <!-- other item group data -->

<ItemGroupData ItemGroupOID="group.drug.conc" ItemGroupRepeatKey="B">
    <ItemData ItemOID="T1"  Value="0.00" />
    <ItemData ItemOID="T2"  Value="1.89" />
    <ItemData ItemOID="T3"  Value="4.60" />
    <ItemData ItemOID="T4"  Value="8.60" />
    <ItemData ItemOID="T5"  Value="8.38" />
    <ItemData ItemOID="T6"  Value="7.54" />
    <ItemData ItemOID="T7"  Value="6.88" />
    <ItemData ItemOID="T8"  Value="5.78" />
    <ItemData ItemOID="T9"  Value="5.33" />
    <ItemData ItemOID="T10" Value="4.19" />
    <ItemData ItemOID="T11" Value="1.15" />
</ItemGroupData>

to this

PKprofile.png

Figure 1: Basic pharmacokinetic profile
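
As a rough illustration of the step described above, the following Javascript sketch scales a few concentrations onto a canvas and builds the circle markup and the path data. The function name, canvas size and sampling times are assumptions made for this example (the ODM fragment lists only the item OIDs T1 to T11, not the collection times), and the page described here manipulated the embedded SVG DOM rather than returning strings.

```javascript
// Hypothetical helper: maps (time, concentration) pairs to SVG coordinates.
// Times, width and height are illustrative assumptions, not study values.
function buildProfile(times, concs, width, height) {
  const tMax = Math.max(...times);
  const cMax = Math.max(...concs);
  // The SVG y axis points down, so larger concentrations get smaller y values.
  const points = times.map((t, i) => ({
    x: (t / tMax) * width,
    y: height - (concs[i] / cMax) * height,
  }));
  // Path data: move to the first point, then draw a line to each later point.
  const d = points
    .map((p, i) => (i === 0 ? "M" : "L") + p.x.toFixed(1) + " " + p.y.toFixed(1))
    .join(" ");
  // One circle per data point.
  const circles = points.map(
    (p) => `<circle cx="${p.x.toFixed(1)}" cy="${p.y.toFixed(1)}" r="3" />`
  );
  return { points, d, circles };
}

// First five treatment A concentrations from the ODM fragment above,
// paired with made-up sampling times in hours.
const profile = buildProfile([0, 1, 2, 3, 4], [0.0, 4.4, 6.9, 8.2, 7.8], 400, 200);
console.log(profile.d);
```

Scaling against the maximum observed concentration keeps the highest point at the top of the canvas, whatever the subject's values are.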

2.2.1 Interacting with the SVG graph

The user can interact with the SVG graph in several ways. In addition to the SVG viewer's panning and zooming functions, tool tips show the value at each data point, and the graph representation can be switched between linear and log transformations. The log representation is often used in pharmacokinetic analysis. The figure below shows the graph of Figure 1 after it has been log transformed and zoomed to display information about data points that lie very close together. The data points can also be changed, as discussed in Section 2.5.

PKprofile2.png

Figure 2: Tool tips and log transformation
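
The switch between the linear and log representations can be sketched as follows. This is only an illustration: the function name is hypothetical, and the handling of zero concentrations (which have no logarithm) is an assumption, since the text does not say how the original script treated them.

```javascript
// Hypothetical helper: returns the y values for a linear or a log10 view.
// Zero or negative concentrations have no logarithm, so they are returned
// as null and left for the caller to skip (an assumption, see above).
function transformConcs(concs, mode) {
  if (mode !== "log") return concs.slice();
  return concs.map((c) => (c > 0 ? Math.log10(c) : null));
}

// First four treatment A concentrations from Section 2.2.
const logValues = transformConcs([0.0, 4.4, 6.9, 8.2], "log");
console.log(logValues);
```

The transformed values would then be scaled onto the canvas in the same way as the untransformed ones.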

2.3 XSLT extensions: pharmacokinetic calculations

Analysis of the pharmacokinetics [1] of a drug includes the calculation of its distribution in the body along with its rates of absorption and excretion. The pharmacokinetic parameters calculated from the drug concentration values provide this information. For this research, the following basic parameters were calculated.

Parameter    Equation / Definition

Descriptive parameters

Cmax         Maximum observed concentration value

Tmax         Time point at which Cmax occurs

Calculated parameters

AUCt         Area under the concentration-time curve

             $AUC_{0-t} = \sum_{i=1}^{n-1} \frac{t_{i+1} - t_i}{2} \times (C_i + C_{i+1})$

               • the summation runs from the first time point (t = 0) to the (n-1)th, where n is the number of data points.

AUCinf       Area under the concentration-time curve with the last concentration extrapolated based on the elimination rate constant Kel

             $AUC_{inf} = AUC_{0-t} + C_n / K_{el}$

Kel          Elimination rate constant (also written λz)

             $K_{el} = -\ln(10) \times s$

               • where s is the slope of the log10-transformed concentrations between the chosen start and end points (hence the ln(10) factor).
               • In this example, by default, the last 3 time points are chosen.

Thalf        Half-life of drug elimination

             $t_{1/2} = \ln(2) / K_{el}$

Table 1: Definition of pharmacokinetic parameters
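
The descriptive parameters and AUCt in Table 1 can be sketched in Javascript as follows. The function names and the sampling times are assumptions made for illustration. This is not the Java extension class used in this research, only a minimal rendering of the same definitions.

```javascript
// Hypothetical helper names; times are assumed to be in ascending order and
// to share an index with the concentrations.
function descriptiveParams(times, concs) {
  const cmax = Math.max(...concs);
  // Tmax is taken as the time of the first occurrence of Cmax.
  const tmax = times[concs.indexOf(cmax)];
  return { cmax, tmax };
}

// Linear trapezoidal rule: sum of (t[i+1] - t[i]) / 2 * (C[i] + C[i+1]).
function aucT(times, concs) {
  let auc = 0;
  for (let i = 0; i < times.length - 1; i++) {
    auc += ((times[i + 1] - times[i]) / 2) * (concs[i] + concs[i + 1]);
  }
  return auc;
}

// Illustrative data only: the values and sampling times (hours) are made up.
const times = [0, 1, 2, 4];
const concs = [0, 4, 8, 4];
console.log(descriptiveParams(times, concs)); // { cmax: 8, tmax: 2 }
console.log(aucT(times, concs)); // 20
```

In the example, the three trapezoids contribute 2, 6 and 12 respectively, which gives the total of 20.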

Stylesheet extensions were used to select or calculate these parameters. Cmax and Tmax were selected with the EXSLT extensions library available in Xalan. The namespace is declared as xmlns:math="http://exslt.org/math", and the maximum function is invoked this way: <xsl:value-of select="math:max($concentrations//odm:ItemData/@Value)"/>.

Calculations of AUCt, AUCinf, Kel and Thalf are more complex. Therefore, the functions were implemented in a Java class and then declared as either custom extension elements or functions to be called from within the stylesheet. Since these parameters are inter-dependent, they are all calculated at once per treatment (with the extension element <pkcalc:calculateParams/>) and saved into a result tree fragment by calling the appropriate extension functions.

   <xsl:variable name="params" >
    <params>
       <xsl:for-each select="//odm:SubjectData[@SubjectKey=$subject]
                             //odm:ItemGroupData[contains(@ItemGroupOID,'drug.conc')]" >
                             
             <pkcalc:calculateParams/>
             <xsl:element name="{@ItemGroupRepeatKey}">
                <auct>   <xsl:value-of select="pkcalc:getAUCt()"/>   </auct>
                <aucinf> <xsl:value-of select="pkcalc:getAUCinf()"/> </aucinf>
                <kel>    <xsl:value-of select="pkcalc:getKel()"/>    </kel>
                <thalf>  <xsl:value-of select="pkcalc:getThalf()"/>  </thalf>
            </xsl:element> 
       </xsl:for-each>    
    </params>
   </xsl:variable>

The result tree fragment variable is converted into a node-set with Xalan's nodeset function so that it can be searched with an XPath expression. The value of each parameter is then retrieved, wherever it is required in the stylesheet, in this way: <xsl:value-of select="xalan:nodeset($params)/params/*[name()=$treatment]/auct"/>.

2.4 Adverse events table

Adverse event monitoring and documentation are very important in clinical study analysis. In this example, the adverse events are tabulated from the ODM with a color code added during XSLT processing, so that their severity can be seen at a glance. The adverse event nodes are accessed in a way similar to the demographics example presented in Section 2.1. In the following table, severe adverse events are colored pink.

PKadverse.png

Figure 3: Section of the adverse events table

2.5 Data simulation

During pharmacokinetic analysis, one often wants to find out how changes in concentration values, or in the start and end points of the elimination slope, affect the parameters and profiles. On the web page, the values in the concentration table and the Kel start and end points can be changed. When an update is requested, the placement of the data points on the SVG graph and the parameters are recalculated with Javascript. For this presentation, the updates are done client-side. It is, of course, possible to add dynamic updates server-side with server pages or scripts.

The following graphics give before and after snapshots of the effect of changing a concentration and the Kel points on the profile and parameters. Note that for the first treatment (A) the Cmax was increased, and for the second treatment (B) the start point of Kel was decreased, with an effect on Thalf. The faint lines represent the calculated drug elimination slopes.

PKupdate-b4.png

Figure 4: Graph and parameters table before update

PKupdate-after.png

Figure 5: Graph and parameters table after update
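
The effect of moving the Kel start point can be sketched as follows. The function names and data are hypothetical, and the slope is computed here as a least-squares fit of log10(concentration) against time over the chosen points. That is an assumption: the text says only that s is the slope between the chosen start and end points.

```javascript
// Hypothetical helper: elimination rate constant from the points between
// index `start` and index `end` (inclusive). Concentrations in that range
// must be greater than zero, since their log10 is taken.
function kel(times, concs, start, end) {
  const xs = times.slice(start, end + 1);
  const ys = concs.slice(start, end + 1).map((c) => Math.log10(c));
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  const slope = sxy / sxx; // least-squares slope s (assumed method)
  return -Math.LN10 * slope; // Kel = -ln(10) × s
}

// Thalf = ln(2) / Kel
function tHalf(k) {
  return Math.LN2 / k;
}

// AUCinf = AUC(0-t) + Cn / Kel
function aucInf(aucT, lastConc, k) {
  return aucT + lastConc / k;
}

// Made-up data: the last three points halve every hour.
const simTimes = [0, 1, 2, 3, 4];
const simConcs = [0, 8, 6, 3, 1.5];
const before = tHalf(kel(simTimes, simConcs, 2, 4)); // default: last 3 points
const after = tHalf(kel(simTimes, simConcs, 1, 4)); // start point moved earlier
console.log(before, after);
```

With this data, moving the start point one sample earlier flattens the fitted slope and therefore lengthens the calculated half-life, which is the kind of change shown for treatment B in the figures above.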

3. Clinical study report

A simplified clinical report was produced in PDF format using the XSL-FO (Extensible Stylesheet Language Formatting Objects) language. Using the same ODM instance, the stylesheet can output an XML file with FO (formatting objects) semantic tags. This file is then processed with Apache's FOP processor into a PDF file. This processor can also handle SVG embedded in a <fo:instream-foreign-object> element, as long as the namespace is explicit on each element.

For the pharmacokinetic profile, each data point is placed onto the SVG graph's canvas by taking into account the maximum observed concentration per subject and the chart's width and height. To simplify the calculation of each data point's placement, the nodes are not generated with XSLT but by an extension element from the Java class used for the calculation of the pharmacokinetic parameters. The node is then placed into the result tree using <xsl:copy-of/>.

   

    <svg:g id="chart" 
           xmlns:svg="http://www.w3.org/2000/svg" 
           xmlns:xlink="http://www.w3.org/1999/xlink" >

      <svg:g id="canvas" >
        <svg:use xlink:href="#pAxes" style="stroke:black; stroke-width:1;" />
            <svg:rect x="0" y="0" width="98%" height="98%"
                      style="fill-opacity:0.1; fill:rgb(135, 206, 250); " />
        
           <xsl:variable name="PKprofile">
               <xsl:for-each 
                    select="//odm:SubjectData[@SubjectKey=$subject]
                            //odm:StudyEventData[contains(@StudyEventOID,'drug.conc')]">
                   <pkcalc:PKprofile/>
               </xsl:for-each>
           </xsl:variable>
           
           <xsl:copy-of select="xalan:nodeset($PKprofile)"/>
           
           <!-- other graph code -->
           
      </svg:g>
    </svg:g>

4. Conclusion

This research demonstrated how XML languages or dialects (XSLT, SVG, XSL-FO) can be applied to solve problems of data analysis and multiple formatting specific to clinical studies. It also shows how, depending on the level of complexity or on programming preference, one can use XPath functions, the XSLT extension library, or customized XSLT extensions specific to a scientific domain. In addition, it provides a small example of how the ODM could be used as input for web-based analytical applications.

Footnotes

  1. Pharmacokinetics is the study of the absorption, distribution, metabolism and excretion of drugs from the body. A pharmacokinetic profile is the plot of concentration data points for each collection time point.

  2. This data was adapted from a [SAS/STAT] user guide data set taken from a publication by Pinheiro and Bates (1995).

Acknowledgements

The author wishes to thank members of the Systems section in Statistics Canada's Geography division for their support, and the xml.apache.org committers and contributors for providing excellent, free and customizable software.

Bibliography

[ODM]
The Operational Data Model, CDISC, Dec. 2003 (www.cdisc.org)
[SVG]
Scalable Vector Graphics (SVG) 1.1 Specification, 14 January 2003 (www.w3.org/Graphics/SVG)
[XSL / XSL-FO]
Extensible Stylesheet Language (XSL) Version 1.0, 15 October 2001 (www.w3.org/TR/xsl)
[XSLT]
XSL Transformations (XSLT) Version 1.0, 16 November 1999 (www.w3.org/TR/xslt)
[XPath]
XML Path Language (XPath) Version 1.0, 16 November 1999 (www.w3.org/TR/xpath)
[Boomer PK Manual]
Boomer PK Manual (www.boomer.org)
[SAS/STAT]
SAS/STAT User's Guide for versions V7 and V8, The SAS Institute Inc., 2004 (http://www.sas.com)
