mmxml



This module provides an XML parser and generator for the manipulation of XML documents from Mosel models. To use this module, the following line must be included in the header of the Mosel model file:

 uses 'mmxml'

mmxml relies on the XML parser EXPAT by James Clark (http://www.libexpat.org) for loading documents.

Document representation in mmxml

Data model

The XML document is stored as a tree of nodes. The following node types are used to represent the document structure: element, text, comment, CDATA and processing instruction.

In addition to these usual node types, the type DATA is used for XML constructs not supported by mmxml (for instance a DOCTYPE declaration is recorded as a DATA section). Although they are not directly recorded in the document tree, attributes are also stored as nodes of a dedicated type.

Each node is characterised by a name and a value. Nodes of type text, comment, CDATA and DATA have a constant name. The name of a processing instruction is the processing instruction's target and its value is the remaining part of the statement (e.g. the name of <?proc inst?> is proc and its value is inst). The value of a comment or CDATA section is the content of the section without its delimiters, whereas the value of a DATA block includes the delimiters. Element nodes also have an ordered list of child nodes. The value of an element node corresponds to the value of its first child text node (if any).

The root node is a special element node with no name, no parent and no successor that includes the entire document as its children.

Example of an XML document with node types:

 
 <?xml version="1.0" encoding="iso-8859-1" standalone="no" ?>   XML header
 <?xml-stylesheet type="text/css" href="examplestyle.css" ?>    Processing instruction
 <!DOCTYPE exampleList SYSTEM "examples.dtd" [                  DATA
 <!ENTITY otherfile SYSTEM "anotherfile.xml">
 ]>
 <!-- List of optimization application examples -->             Comment
 <exampleList>                                                  Element node
   <!-- Example B3 -->                                          Comment
   <model id="book_B_3">                                        Element node
     <modFile date="Mar.2002">                                  Element node
       b3jobshop.mos                                            Text node
     </modFile>
     <modData file="b3jobshop.dat" />                           Element node
     <modData file="b3jobshop2.dat" />                          Element node
     <modTitle>                                                 Element node
       Job shop scheduling                                      Text node
     </modTitle>
     <modRating>                                                Element node
       3                                                        Text node
     </modRating>
     <modFeatures>                                              Element node
       <![CDATA[dynamic array, range, exists, forall-do]]>      CDATA
     </modFeatures>
   </model>
 </exampleList>
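
The node model described above can be explored with the navigation functions listed under "Procedures and functions". The following is a minimal sketch rather than an excerpt from the product documentation. It assumes that the sample document has been saved under the hypothetical file name examples.xml, that the node type constant XML_ELT is defined by mmxml, and that getfirstchild and getnext return -1 when there is no further node.

```mosel
model "mmxml data model sketch"
 uses "mmxml"

 declarations
  DB: xmldoc
  n: integer
 end-declarations

 load(DB, "examples.xml")            ! hypothetical file name

 ! Walk through the children of the root node (node number 0)
 n:= getfirstchild(DB, 0)
 while (n > -1) do                   ! assumption: -1 means "no further node"
  if gettype(DB, n) = XML_ELT then   ! assumption: XML_ELT denotes element nodes
   writeln("element node: ", getname(DB, n))
  else
   writeln("other node (type ", gettype(DB, n), "): ", getname(DB, n))
  end-if
  n:= getnext(DB, n)
 end-do
end-model
```

For the sample document, the loop should visit the processing instruction, the DATA block, the top-level comment and the exampleList element, since these are the children of the root node.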

Paths in a document

Nodes can be retrieved using a path similar to a directory path used to locate a file. An XML path consists of a list of location steps separated by the slash character ("/"): each step selects a set of nodes from the input set resulting from the preceding step (the context nodes). The initial set of the path is either the root node (absolute path) or some specified node (relative path).

A step is composed of an optional axis specifier followed by a node test, optionally completed by a predicate. The axis specifies the tree relationship between the nodes selected by the step and the context node. The node test is either an element name (to select elements of the given name) or a node type (to select nodes by their type). The predicate is a Boolean expression whose truth value decides whether a selected node is kept in the result set of the step.

Examples:

/examples/chapter
all element nodes 'chapter' under elements 'examples'
/examples/chapter/model/modRating[number()>=4]/..
all 'model' nodes under 'examples/chapter' for which element 'modRating' has a value greater than or equal to 4
//*[@attribute1 and @attribute2='value2']
all element nodes of the document having 'attribute1' defined and 'attribute2' with value 'value2'
/descendant::text()
all text sections of the document
.//mytag
all element nodes named 'mytag' among the descendants of the current node
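
Paths are passed as strings to getnode and getnodes. The sketch below shows an absolute path followed by a relative path evaluated from each resulting node. It is an illustration under stated assumptions, not the reference usage. The file name examples.xml is hypothetical and is assumed to contain the sample document from the data model section, so the paths use exampleList and model rather than the element names of the examples above. The argument order of getnodes (document, path, result list) and of getnode (document, context node, path) is also an assumption.

```mosel
model "mmxml path sketch"
 uses "mmxml"

 declarations
  DB: xmldoc
  Models: list of integer
 end-declarations

 load(DB, "examples.xml")            ! hypothetical file name

 ! Absolute path: all 'model' elements under 'exampleList'
 getnodes(DB, "/exampleList/model", Models)

 forall(m in Models) do
  ! Relative path: 'modTitle' is looked up among the children of node m
  writeln(getattr(DB, m, "id"), ": ",
          getvalue(DB, getnode(DB, m, "modTitle")))
 end-do
end-model
```

The sketch does not check whether getnode found a matching node; a robust model would test the returned node number before calling getvalue.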

Axis specifier

An axis specifier consists of an axis name followed by the symbol ::. The supported axes are:

child
children of the context node (this is the default if no axis is given)
parent
parent of the context node
self
the context node itself
attribute
the attributes of the context node
following
following node of the context node
descendant-or-self
the context node as well as all its descendants
descendant
all descendants of the context node

Node test

By default only element nodes are considered, and the node test selects them by name. The special name "*" selects all element nodes. Alternatively, the test can refer to the type of the nodes; in this case all nodes are considered and the test is one of the following expressions:

text()
to select text nodes
comment()
to select comment nodes
cdata()
to select CDATA nodes
data()
to select DATA nodes
processing-instruction()
to select processing instruction nodes
node()
to keep all nodes (independently of the type and name)

Abbreviated notation

Common combinations of axis and node test have an abbreviated notation. The supported abbreviations are:

.
is equivalent to self::node()
..
is equivalent to parent::node()
//
(used in place of /) is equivalent to /descendant-or-self::node()/

Predicate

A predicate is a Boolean expression enclosed in square brackets. The expression evaluator supports Boolean, text and numerical values (encoded as floating point numbers). Type conversions are implicit and determined by the operators: for instance, the additive operator "+" operates on numbers, so its operands are always converted to numbers. Constant strings must be quoted using either single or double quotes.

The notation @attname designates the attribute whose name is "attname". If used where a Boolean value is expected, it is true if the attribute is defined for the current node; otherwise, it evaluates to the value of the attribute.

Supported arithmetic operators include +, -, *, div (division on floating point numbers, not integral division as in Mosel!), mod (modulo on floating point numbers). Boolean expressions can be composed using and and or; the usual comparators <, <=, >=, >, =, <> (or !=) can be applied to numbers. Note that equality testing (= and <>) is defined for all types. The following predefined functions can also be used in expressions:

name()
name of the node
string()
value of the node
number()
value of the node as a number
boolean()
value of the node as a Boolean
position()
position of the current node in the selected set (first node has position 1)
not(boolexp)
true if 'boolexp' is false, false otherwise
true()
value true
false()
value false
string-length()/getsize()
length of the node value
string-length(strexp)/getsize(strexp)
length of the text passed as parameter
starts-with(strexp1,strexp2)
true if text 'strexp1' starts with text 'strexp2'
contains(strexp1,strexp2)
true if text 'strexp1' contains text 'strexp2'
round(numexp)
rounded value of 'numexp'
floor(numexp)
floor value of 'numexp'
ceiling(numexp)/ceil(numexp)
ceiling value of 'numexp'
abs(numexp)
absolute value of 'numexp'

If the predicate [expr] is not a Boolean value, the whole expression is interpreted as [position()=expr].
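
The three kinds of predicate described above (numeric comparison, attribute test and positional shorthand) can be sketched as follows. This is an illustration under assumptions: the file name examples.xml is hypothetical and is assumed to contain the sample document from the data model section, and the argument order of getnodes (document, path, result list) is assumed. A separate result list is used for each query so that the sketch does not depend on whether getnodes clears its list argument.

```mosel
model "mmxml predicate sketch"
 uses "mmxml"

 declarations
  DB: xmldoc
  Rated, ByFile, First: list of integer
 end-declarations

 load(DB, "examples.xml")            ! hypothetical file name

 ! Numeric comparison on the node value, then a step back to the parent
 getnodes(DB, "/exampleList/model/modRating[number()>=3]/..", Rated)
 forall(m in Rated) writeln("rating >= 3: ", getattr(DB, m, "id"))

 ! Attribute used as a value in an equality test
 getnodes(DB, "//modData[@file='b3jobshop2.dat']", ByFile)
 writeln("matching modData nodes: ", getsize(ByFile))

 ! Non-Boolean predicate: [1] is interpreted as [position()=1]
 getnodes(DB, "/exampleList/model/modData[1]", First)
 forall(d in First) writeln("first modData: ", getattr(DB, d, "file"))
end-model
```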

New functionality for the Mosel language

The type xmldoc

The type xmldoc represents an XML document stored in the form of a tree. Each node of the tree is identified by a node number (an integer) that is attached to the document: a node number cannot be shared between documents, and the same number in two different documents designates two different nodes. The root node of the document has number 0; the content of the document is stored as the children of this root node.

In addition to structural properties (e.g. name, value, successor, parent), nodes have two formatting properties: vertical spacing (setvspace) and horizontal spacing (sethspace). These settings control how the resulting text is laid out when the document is saved in text form (see save). The general formatting policy is defined by a set of document settings: indentation mode (setindentmode), indentation skip (setindentskip) and line length (setlinelen).

Also used when exporting the document are the XML version (setxmlversion), standalone status (setstandalone) and encoding (setencoding). mmxml outputs these three properties without making any use of the information (in particular, no character conversion is performed when the document is saved).
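
As an illustration of how these settings fit together, the sketch below builds a small document from scratch and saves it. It is a hedged example, not the reference usage. The constants XML_ELT and XML_AUTO, the argument lists used for addnode (document, parent node, node type, name, optional value) and the output file name examples_out.xml are assumptions that should be checked against the descriptions of the individual routines.

```mosel
model "mmxml build sketch"
 uses "mmxml"

 declarations
  DB: xmldoc
  Root, Mod, n: integer
 end-declarations

 ! Document settings used when saving
 setencoding(DB, "iso-8859-1")       ! written to the header; no conversion is performed
 setindentmode(DB, XML_AUTO)         ! assumption: XML_AUTO selects automatic indentation
 setindentskip(DB, 2)                ! size of one indentation step

 ! The content is added as children of the root node (node number 0)
 Root:= addnode(DB, 0, XML_ELT, "exampleList")
 Mod:= addnode(DB, Root, XML_ELT, "model")
 setattr(DB, Mod, "id", "book_B_3")

 ! Assumption: the 5-argument form creates an element with a text value
 n:= addnode(DB, Mod, XML_ELT, "modTitle", "Job shop scheduling")
 n:= addnode(DB, Mod, XML_ELT, "modRating", "3")

 save(DB, "examples_out.xml")        ! hypothetical output file name
end-model
```

Each xmldoc keeps its own node numbering, so the values held in Root, Mod and n are only meaningful together with DB.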

Procedures and functions

addnode
Add a node to a document tree.
copynode
Copy a node.
delattr
Delete an attribute of an element node.
delnode
Delete a node in a document tree.
getattr
Get the value of an attribute.
getencoding
Get the character encoding of the document.
getfirstattr
Get the first attribute of an element node.
getfirstchild
Get the first child of an element node.
gethspace
Get horizontal spacing of a node.
getindentmode
Get indent mode of the document.
getindentskip
Get the size of an indentation step.
getlastchild
Get the last child of an element node.
getlinelen
Get the length of a line.
getname
Get the name of a node.
getnext
Get the successor of a node.
getnode
Get the first node returned by a path specification.
getnodes
Get the list of nodes returned by a path specification.
getparent
Get the parent of a node.
getstandalone
Get the standalone flag of the document.
gettype
Get the type of a node.
getvalue
Get the value of a node.
getvspace
Get vertical spacing of a node.
getxmlversion
Get the XML version of the document.
load
Load an XML document.
save
Save an XML document.
setattr
Set the value of an attribute.
setencoding
Set the character encoding of the document.
sethspace
Set horizontal spacing of a node.
setindentmode
Set indent mode for the document.
setindentskip
Set the size of an indentation step.
setlinelen
Set the length of a line.
setname
Set the name of a node.
setstandalone
Set the standalone flag of the document.
setvalue
Set the value of a node.
setvspace
Set vertical spacing of a node.
setxmlversion
Set the XML version of the document.
testattr
Test existence of an attribute for a given element node.
xmldecode
Decode a text string for XML.
xmlencode
Encode a text string for XML.



© Copyright 2001-2013 Fair Isaac Corporation. All rights reserved.