XML Toolbox

 

Introduction

The XML Toolbox for Jython converts variables and structures from the Jython workspace into plain-text XML, and converts this XML back into Jython data. The XML format can be used to store parameter structures, variables and results from engineering applications in non-proprietary files or XML-capable databases, and to transfer data across the Grid. The toolbox provides bi-directional conversion routines implemented as four small, easy-to-use Python functions. In addition, it allows the transparent transfer of data between the Jython scripting environment and the Matlab Problem Solving Environment.

 

 

The size of the data structures that the XML Toolbox can handle is limited only by the available memory. As an indication, data structures of around 60 MB can be converted without difficulty on a PC with 256 MB of memory running Jython.

 

xml_format()    Converts Jython data to an XML string
xml_parse()     Converts an XML string into Jython data
xml_load()      Loads an XML file and returns Jython data
xml_save()      Saves Jython data into an XML file
gd_help()       Displays help for the xml_* functions

Table 5 XML Toolbox functions

 

 


Tutorial

The XML Toolbox for Jython can be used independently of the Compute and Database Toolboxes. No proxy certificate is required to make use of its functionality. Two jar archives (which are included in the lib subdirectory) are required for it to work: jdom.jar and jnumeric-0.1a3.jar (or a similar version).

 

Getting started

Before using the Geodise XML Toolbox in the Jython environment, you need to import the gdxml module. There is an example in the demo_xml.py script.

Use one of the following three import styles:

(1) >>> import gdxml

and then, for example, call xml_format() as:

>>> gdxml.xml_format(…)

(2) >>> from gdxml import xml_format

and then, for example, call xml_format() as:

>>> xml_format(…)

(3) >>> from gdxml import *

and then, for example, call xml_format() as:

>>> xml_format(…)

 

Converting Python data types to XML

All common Python built-in data types can be converted into XML, with or without data type attributes, using the command xml_format. The following examples highlight the differences in the XML output structure.

 

>>> v = {}

>>> v['a'] = 1.2345

>>> v['b'] = 'This is a string.'

>>> v['c'] = ('alpha','beta')

>>> v['d'] = 12345

>>> v['e'] = {'sub' : {'subsub' : 'subsubsub'}}

 

This first example shows the formatting of the variable v with no additional input parameters specified. The XML is formatted in such a way that any subsequent parsing of the created XML string with xml_parse reconstructs an exact copy of the original Jython data structure.

 

>>> xmlstr = xml_format(v)

>>> print xmlstr

 

<root idx="1" type="struct" size="1 1" xml_tb_version="1.0-py">

  <b idx="1" type="char" size="1 17">This is a string.</b>

  <a idx="1" type="double" size="1 1">1.2345</a>

  <e idx="1" type="struct" size="1 1">

    <sub idx="1" type="struct" size="1 1">

      <subsub idx="1" type="char" size="1 9">subsubsub</subsub>

    </sub>

  </e>

  <d idx="1" type="integer" size="1 1">12345</d>

  <c idx="1" type="cell" size="1 2">

    <item idx="1" type="char" size="1 5">alpha</item>

    <item idx="2" type="char" size="1 4">beta</item>

  </c>

</root>
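The relationship between a Python value and the type and size attributes in the output above can be sketched as follows. This is only an illustration inferred from the example output, not the gdxml implementation; the helper name describe is hypothetical, and the 'boolean' type name is taken from a later example in this section.

```python
# Illustrative sketch only: mirrors the type/size attributes seen in the
# example output. It is NOT the gdxml implementation; describe() is a
# hypothetical helper introduced for this illustration.

def describe(value):
    """Return a (type, size) pair for a few simple Python values."""
    if isinstance(value, bool):           # checked first: bool is a subclass of int
        return ('boolean', '1 1')
    if isinstance(value, int):
        return ('integer', '1 1')
    if isinstance(value, float):
        return ('double', '1 1')
    if isinstance(value, str):            # strings are 1 x length
        return ('char', '1 %d' % len(value))
    if isinstance(value, (tuple, list)):  # sequences are 1 x number of items
        return ('cell', '1 %d' % len(value))
    if isinstance(value, dict):
        return ('struct', '1 1')
    raise TypeError('unsupported type: %r' % type(value))

for sample in (1.2345, 'This is a string.', ('alpha', 'beta'), 12345, {'sub': {}}):
    print(describe(sample))
```

For the string 'This is a string.' the sketch yields the size "1 17", which matches the size attribute of field b in the output above.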

 

The XML attributes idx, type and size, which allow the exact reconstruction of the data types in Jython, can be turned off by passing 'off' as the second parameter of xml_format. This results in a more generic formatting of the structure. However, because the type and size information is lost, the XML contents are interpreted purely as strings when parsed back into Jython:

 

>>> xmlstr = xml_format(v,'off')

>>> print xmlstr

 

<root>

  <b>This is a string.</b>

  <a>1.2345</a>

  <e>

    <sub>

      <subsub>subsubsub</subsub>

    </sub>

  </e>

  <d>12345</d>

  <c>

    <item>alpha</item>

    <item>beta</item>

  </c>

</root>
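To make the attribute-free layout concrete, the following minimal sketch builds a comparable XML string from a nested dictionary using the standard library module xml.etree.ElementTree. It is a stand-in for illustration only and makes no claim about how gdxml works internally; the function name to_plain_xml is hypothetical, and the availability of xml.etree in a given Jython version is an assumption.

```python
import xml.etree.ElementTree as ET

# Illustrative sketch only: NOT the gdxml implementation.
# to_plain_xml() is a hypothetical helper introduced for this illustration.

def to_plain_xml(value, tag='root'):
    """Build an attribute-free element from nested dicts, sequences and scalars."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        # Each dictionary key becomes a child element named after the key.
        for key in value:
            elem.append(to_plain_xml(value[key], key))
    elif isinstance(value, (tuple, list)):
        # Each sequence entry becomes an <item> child element.
        for entry in value:
            elem.append(to_plain_xml(entry, 'item'))
    else:
        # Scalars are stored as plain text; the type information is lost.
        elem.text = str(value)
    return elem

v = {'a': 1.2345, 'c': ('alpha', 'beta'), 'e': {'sub': {'subsub': 'subsubsub'}}}
root = to_plain_xml(v)
xml_text = ET.tostring(root).decode('ascii')
print(xml_text)
```

Because every scalar is written with str(), nothing in the resulting XML distinguishes the number 1.2345 from the string '1.2345', which is why such output can only be read back as strings.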

 

The user can write the XML representation of a Jython variable immediately into an XML file using the command xml_save. This command uses the same XML format as the function xml_format.

 

Converting XML to Python data types

Because XML can contain arbitrary content as long as it follows the W3C XML Recommendation (www.w3.org), parsing these constructs and translating them into Jython data can be complex. The function xml_parse converts XML strings into Jython data structures in two ways, which correspond to the techniques shown above for xml_format with and without the XML attributes.

If the XML contains specific type attributes, such as those created by xml_format with attributes switched on (i.e. the idx, type and size attributes), the XML Toolbox can re-create the Python data type and content described by the XML string.

 

For example,

 

xmlstr = '<root idx="1" type="integer" size="1 1">42</root>'

 

can be parsed using the command

>>> v = xml_parse(xmlstr)

 

and returns the variable

>>> print v

 

42
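The role of the type attribute in this conversion can be illustrated with a small sketch based on the standard library module xml.etree.ElementTree. It is not the gdxml parser; the names CONVERTERS and parse_scalar are hypothetical, and only three scalar types are handled.

```python
import xml.etree.ElementTree as ET

# Illustrative sketch only: NOT the gdxml parser.
# CONVERTERS and parse_scalar() are hypothetical names for this illustration.
CONVERTERS = {'integer': int, 'double': float, 'char': str}

def parse_scalar(xml_text):
    """Convert a single typed XML element into the matching Python scalar."""
    elem = ET.fromstring(xml_text)
    convert = CONVERTERS[elem.get('type')]  # KeyError for unsupported types
    return convert(elem.text or '')

value = parse_scalar('<root idx="1" type="integer" size="1 1">42</root>')
print(value)
```

Without the type attribute there would be no basis for choosing int over str, so the text 42 would have to be returned as a string.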

 

 

As a more complex example,

 

# Paste this assignment into a Python script and run it:

 

xmlstr = """

<root xml_tb_version="3.1" idx="1" type="struct" size="1 1">

  <a idx="1" type="double" size="1 1">1.2345</a>

  <b idx="1" type="double" size="2 4">1 5 2 6 3 7 4 8</b>

  <c idx="1" type="char" size="1 17">This is a string.</c>

  <d idx="1" type="cell" size="1 2">

    <item idx="1" type="char" size="1 5">alpha</item>

    <item idx="2" type="char" size="1 4">beta</item>

  </d>

  <e idx="1" type="boolean" size="1 1">0</e>

  <f idx="1" type="struct" size="1 1">

    <sub1 idx="1" type="struct" size="1 1">

      <subsub1 idx="1" type="double" size="1 1">1</subsub1>

      <subsub2 idx="1" type="double" size="1 1">2</subsub2>

    </sub1>

  </f>

  <g idx="1" type="struct" size="1 2">

    <aa idx="1" type="cell" size="1 2">

      <item idx="1" type="char" size="1 5">g1aa1</item>

      <item idx="2" type="char" size="1 5">g1aa2</item>

    </aa>

    <aa idx="2" type="cell" size="1 1">

      <item idx="1" type="char" size="1 5">g2aa1</item>

    </aa>

  </g>

</root>

"""

 

can be parsed using the command

>>> v = xml_parse(xmlstr)

and returns the (unordered) dictionary

>>> print v

 

{'b': [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],

 'a': 1.2345,

 'g': [{'aa': ['g1aa1', 'g1aa2']}, {'aa': ['g2aa1']}],

 'f': {'sub1': {'subsub1': 1.0, 'subsub2': 2.0}},

 'e': 0,

 'd': ['alpha', 'beta'],

 'c': 'This is a string.'}

which corresponds exactly to the Jython variable used in xml_format to create the XML string.
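Field b is worth a closer look: the XML text "1 5 2 6 3 7 4 8" with size "2 4" is returned as two rows of four values. This is consistent with the values being stored column by column (as in Matlab), although that ordering is an inference from the example rather than a documented property. A minimal sketch of such a reshaping, with the hypothetical helper reshape_column_major:

```python
# Illustrative sketch only: assumes column-major storage, inferred from the
# example above. reshape_column_major() is a hypothetical helper.

def reshape_column_major(flat, rows, cols):
    """Rebuild a rows x cols nested list from values stored column by column."""
    return [[flat[c * rows + r] for c in range(cols)] for r in range(rows)]

text = '1 5 2 6 3 7 4 8'                       # element content of field b
rows, cols = [int(n) for n in '2 4'.split()]   # size attribute of field b
flat = [float(x) for x in text.split()]
matrix = reshape_column_major(flat, rows, cols)
print(matrix)
```

The result is [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]], the same value shown for field b in the parsed dictionary.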

 

If we use the same command, xml_parse, but tell the parser to ignore the attributes with the command

 

>>> v_wo_att = xml_parse(xmlstr,'off')

>>> print v_wo_att

 

we obtain a dictionary in which the types and sizes of the data are not adapted to match standard Python data types; that is, all alphanumeric content is returned as strings.

 

{'b': '1 5 2 6 3 7 4 8',

 'a': '1.2345',

 'g': [{'aa': 'g1aa2'}, {'aa': 'g2aa1'}],

 'f': {'sub1': {'subsub1': '1', 'subsub2': '2'}},

 'e': '0',

 'd': 'beta',

 'c': 'This is a string.'}

 

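The structural information (in fields f and g) is still preserved, but nested list contents, such as in field b, and numeric values, such as in fields a and e, are returned as plain strings. Note also that in the output shown, fields that contain several item elements, such as d, retain only the last item.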
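When attributes are ignored, any numeric interpretation has to be applied by the caller. The following sketch shows one way to do this by hand for a few fields of the dictionary above; the dictionary literal is copied from the output shown, and the choice of float and int for each field is the caller's assumption, since the XML no longer carries that information.

```python
# Illustrative sketch only: manual conversion of string values obtained
# when attributes are ignored. The target types are chosen by the caller.
v_wo_att = {
    'a': '1.2345',
    'b': '1 5 2 6 3 7 4 8',
    'e': '0',
    'f': {'sub1': {'subsub1': '1', 'subsub2': '2'}},
}

a = float(v_wo_att['a'])                         # single number
b = [float(x) for x in v_wo_att['b'].split()]    # flat list; shape is unknown
e = int(v_wo_att['e'])                           # 0 or 1 flag
subsub1 = float(v_wo_att['f']['sub1']['subsub1'])

print(a)
print(b)
print(e)
print(subsub1)
```

Note that the two-row shape of field b cannot be recovered here, because the size attribute was discarded together with the type.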


 



Copyright © 2005, The Geodise Project, University of Southampton