StructXML

=Introduction=

structXML is a straightforward RDF serialization in XML, used for internal communication between the OSF Web Service Web services, the Flex Semantic Components and OSF-Drupal. It is the core format used to transmit information between Open Semantic Framework (OSF) components. Within OSF Web Service, all data is processed internally as structXML and is then converted into other formats (RDF+XML, RDF+N3, structJSON, irJSON, commON, etc.).

A structXML file is composed of a resultset element which aggregates a series of subjects (records) that are defined with predicates and objects. Values can be "data" values such as literals, or "object" values that are references to other subjects.

structXML is comparable to structJSON, which uses the same structure but is serialized in JSON instead of XML. Both are based on the OSF Web Service Internal Resultset Structure.

=Features=

The structXML format supports the following features:


 * 1) Description of subject records
 * 2) Each record has a unique identifier
 * 3) Each record has one or multiple types (it belongs to one or multiple classes)
 * 4) Each record can be described with an unlimited number of data or object attributes
 * 5) Reification is supported on object attributes
 * 6) The value of data attributes can be qualified with a datatype or a language tag.

= Specification =

The goal of any Web service is to return results. The root element of the document returned by any OSF Web Service Web service is the resultset element, in which all the results of a given results document are nested.

Here is an example of a resultset that has a single subject but that has all the features outlined above:
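As an illustration only, a minimal sketch of such a document is embedded below in a Python string and parsed with the standard library. The element names and the type and uri attributes follow the text of this page; the entity, value and lang attribute names, the ":" prefix separator, the ex: vocabulary and every URI are assumptions made for the example and should be checked against a real resultset.

```python
import xml.etree.ElementTree as ET

# Hypothetical structXML document. The "entity", "value" and "lang" attribute
# names, the "ex:" vocabulary and all URIs are assumptions for illustration.
SAMPLE = """<resultset>
  <prefix entity="ex" uri="http://example.org/ontology#"/>
  <prefix entity="xsd" uri="http://www.w3.org/2001/XMLSchema#"/>
  <subject type="ex:Document" uri="http://example.org/doc/1">
    <predicate type="ex:title">
      <object type="xsd:string" lang="en">language test</object>
    </predicate>
    <predicate type="ex:pageCount">
      <object type="xsd:integer">12</object>
    </predicate>
    <predicate type="ex:isAbout">
      <object type="ex:Concept" uri="http://example.org/concept/War">
        <reify type="ex:confidence" value="0.75"/>
      </object>
    </predicate>
  </subject>
</resultset>"""

root = ET.fromstring(SAMPLE)

# One (uri, type, number of predicates) entry per subject record.
summary = [
    (s.get("uri"), s.get("type"), len(s.findall("predicate")))
    for s in root.findall("subject")
]
print(summary)
```

The sample covers a typed subject with a unique identifier, data and object values, a typed literal, a language-tagged literal and one reified object.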

Prefixes are used to shorten URI references. The prefix elements within a resultset element are used to shorten all the URI references within the resultset. There is no obligation to shorten a URI reference (we refer to this action as prefixizing a URI). Prefixized URIs can appear:


 * 1) In the type or the uri attribute of a subject element
 * 2) In the type attribute of a predicate element
 * 3) In the type or the uri attribute of an object element
 * 4) In the type attribute of a reify element

Each time a structXML parser parses a structXML document, it should try to expand any prefixized values that appear in one of these attributes.

Prefixes are always introduced with a  character. If we have this prefix defined in a given resultset:

Then all the URIs that use this namespace can be shortened using that prefix: the namespace part of the URI is replaced by the prefix. The full URI and its prefixized form are equivalent in that resultset, but a prefixized URI is easier for humans to read and shorter to transmit over the web.
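The expansion and shortening described above can be sketched as follows. This is a minimal sketch, not the OSF implementation: it assumes that a prefix element carries hypothetical entity and uri attributes and that a prefixized value has the form prefix:localname. The helper names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical resultset; the "entity" attribute name and the ":" separator
# are assumptions, not confirmed by this page.
DOC = """<resultset>
  <prefix entity="owl" uri="http://www.w3.org/2002/07/owl#"/>
  <subject type="owl:Thing" uri="http://example.org/thing/1"/>
</resultset>"""


def read_prefixes(root):
    """Map each declared prefix to its namespace URI."""
    return {p.get("entity"): p.get("uri") for p in root.findall("prefix")}


def expand(value, prefixes):
    """Return the full URI of a prefixized value, or the value unchanged."""
    head, sep, tail = value.partition(":")
    if sep and head in prefixes and not tail.startswith("//"):
        return prefixes[head] + tail
    return value


def prefixize(uri, prefixes):
    """Shorten a URI with the longest matching namespace, if there is one."""
    matches = [(len(ns), name) for name, ns in prefixes.items() if uri.startswith(ns)]
    if not matches:
        return uri
    length, name = max(matches)
    return f"{name}:{uri[length:]}"


root = ET.fromstring(DOC)
prefixes = read_prefixes(root)
subject = root.find("subject")
full_type = expand(subject.get("type"), prefixes)
print(full_type, prefixize(full_type, prefixes))
```

Values that are already full URIs pass through expand unchanged, so a parser can apply it to every type and uri attribute without checking first.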

A "subject" (consistent with the understanding of subject within the standard subject-predicate-object RDF triple) is a record description returned by a web service endpoint for a given query.

A resultset is composed of one or multiple subject(s) depending on the Web service query. This means that the subject element represents the subject of a query to a Web service endpoint.

Each subject has a type and a uri attribute. The type of a subject can be seen as its kind. The URI of a subject is its unique identifier.

A predicate is what describes a subject. A predicate can be used to relate a subject to another subject (in this case, we are talking about a "subject predicate", which is equivalent to an object predicate in RDF). A predicate can also be used to describe a subject using literal strings.

Any subject has zero, one or multiple predicate relationships with other objects.

Every predicate has a type attribute. The type of a predicate can be seen as the kind of relationship between two things (a subject and an object).

There are two families of predicates:


 * 1) Data predicates (which are equivalent to datatype predicates in RDF)
 * 2) Subject predicates (which are equivalent to object predicates in RDF)

Data predicates are the ones that relate a subject to a literal value, or to any other typed textual value.

Subject predicates are the ones that relate a subject to another subject.

Here is an example of a data predicate:

Here is an example of a subject predicate:
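The two families can be told apart mechanically. The sketch below assumes, as this page describes, that an object referring to another subject carries a uri attribute while a literal object does not; the ex: vocabulary, the URIs and the predicate_family helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with one predicate of each family.
DOC = """<resultset>
  <subject type="ex:Document" uri="http://example.org/doc/1">
    <predicate type="ex:title">
      <object type="xsd:string">A short history</object>
    </predicate>
    <predicate type="ex:isAbout">
      <object type="ex:Concept" uri="http://example.org/concept/War"/>
    </predicate>
  </subject>
</resultset>"""


def predicate_family(predicate):
    """Return 'subject' if any object carries a uri, otherwise 'data'."""
    objects = predicate.findall("object")
    return "subject" if any(o.get("uri") for o in objects) else "data"


root = ET.fromstring(DOC)
families = {p.get("type"): predicate_family(p) for p in root.iter("predicate")}
print(families)
```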

Any predicate refers to one or multiple objects. An object can be a reference to another subject, or a literal.

An object has a type and a possible uri attribute. The type of an object can be seen as its kind. The URI of an object is its unique identifier. The uri attribute is omitted if the object is a literal, such as a string name or a number.

A special kind of object exists: the literal. The characteristics of this kind of object are discussed in a dedicated section below.

Here is an example of an object value which is a literal:

Here is an example of an object value which is a reference to another subject:

Sometimes it is useful to be able to assert facts about a given triple statement. This is what reification is about.

The reification example below means: we have a subject that is a document. This document has a predicate relationship with another thing, identified by its own URI, that stands for War. Basically, this triple relationship means: "I have a document that is about War".

However, we can also assert a ratio that expresses the confidence level in that statement. By using a reification property, we can assign a confidence level to the "fact" (assertion) of the initial triple statement.

This reification gets expressed in the XML data structure as:

The above example shows how an object property value is being reified.

Here is how a datatype property value is being reified in structXML:

So, basically, the reify element helps us to assert a fact about another fact (triple statement). In this sense, then, reification can be seen as a metadata assertion about the original statement.

Data consumers should thus parse the XML document in the following way:

If there is a reify element within the body of an object element, the data consumer must look at the three ancestor nodes of the reify element (the object, its predicate and its subject) to compose the triple statement that the reified fact is about.
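The parsing rule above can be sketched as follows. Because xml.etree.ElementTree elements hold no parent pointers, the sketch walks down from each subject and keeps the three ancestors in scope instead of climbing up from the reify element. The value attribute on reify, the placement of a reify element after the text of a literal object, the ex: vocabulary and the URIs are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one reified object reference and one reified literal.
DOC = """<resultset>
  <subject type="ex:Document" uri="http://example.org/doc/1">
    <predicate type="ex:isAbout">
      <object type="ex:Concept" uri="http://example.org/concept/War">
        <reify type="ex:confidence" value="0.75"/>
      </object>
    </predicate>
    <predicate type="ex:keyword">
      <object type="xsd:string">war<reify type="ex:confidence" value="0.5"/></object>
    </predicate>
  </subject>
</resultset>"""


def reified_statements(root):
    """Pair every reify element with the triple formed by its three ancestors."""
    found = []
    for subject in root.findall("subject"):
        for predicate in subject.findall("predicate"):
            for obj in predicate.findall("object"):
                # The object node of the triple is a URI or, failing that, literal text.
                target = obj.get("uri") or (obj.text or "").strip()
                for reify in obj.findall("reify"):
                    found.append({
                        "statement": (subject.get("uri"), predicate.get("type"), target),
                        "property": reify.get("type"),
                        "value": reify.get("value"),
                    })
    return found


statements = reified_statements(ET.fromstring(DOC))
for entry in statements:
    print(entry)
```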

==Unique Identifiers: URIs==
Nearly all resources and their associated subject, predicate or object have a unique identifier called a URI. (Subjects and predicates must have a URI; objects most frequently do, but sometimes may optionally be assigned a literal.)

These URIs are unique to each resource. Since these IDs are unique, if a Web service A refers to a resource X and another Web service B also refers to a resource X, then both Web services A and B refer to the same thing. This must hold true so that atomic Web services can easily interact to create compound Web services.

However, the subjects or the objects of a resultset may sometimes not have a defined URI (the uri attribute). In such a case, the consumer of this Web service data must itself define a unique identifier for that thing.
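One way a consumer might define such identifiers is sketched below. The base namespace, the use of random UUIDs and the ensure_uris helper are assumptions; the page does not prescribe how the identifier should be built. Only subjects are handled here, since a consumer first has to decide how to tell a URI-less object reference from a literal.

```python
import uuid
import xml.etree.ElementTree as ET

# Hypothetical resultset in which the first subject has no uri attribute.
DOC = """<resultset>
  <subject type="ex:Document"/>
  <subject type="ex:Document" uri="http://example.org/doc/2"/>
</resultset>"""


def ensure_uris(root, base="http://example.org/generated/"):
    """Give every subject without a uri a locally minted one; return the count."""
    minted = 0
    for subject in root.findall("subject"):
        if not subject.get("uri"):
            subject.set("uri", base + uuid.uuid4().hex)
            minted += 1
    return minted


root = ET.fromstring(DOC)
minted = ensure_uris(root)
uris = [s.get("uri") for s in root.findall("subject")]
print(minted, uris)
```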

==Literals and Datatype Values==
A literal is a special kind of object. Unlike any other object, a literal object cannot be the subject of a predicate. (Technically, a resource could describe a literal, but the literal itself cannot be described; this is outside the scope of this document.)

A literal object does not have a uri attribute.

Optionally, a literal object can have a datatype and/or a language attribute.

A literal value can be further defined using any defined XSD datatype. We can say that a literal value is not only a literal, but more precisely an integer. Here is an example of such a typed literal:

Here is the list of the most commonly used XSD datatypes:

Additionally, a literal string can be defined using a language identifier. Such a language identifier is used to specify what human language has been used to write the string value. Here is an example of such a value:

With this example, we specify that the string "language test" has been written in English. The language tags used in the language attribute are the ones suggested in RFC 4646: the two-character ISO 639-1 language codes.
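A consumer can use the datatype and the language tag to turn literal text into native values. The sketch below assumes that the datatype is carried by the type attribute as a prefixized XSD name and that the language tag is carried by a hypothetical lang attribute. The converter table is a small selection of common XSD datatypes, not the list from the original page.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Selection of common XSD datatypes mapped to Python converters (illustrative).
CONVERTERS = {
    "xsd:string": str,
    "xsd:integer": int,
    "xsd:int": int,
    "xsd:decimal": Decimal,
    "xsd:float": float,
    "xsd:double": float,
    "xsd:boolean": lambda text: text in ("true", "1"),
}


def literal_value(obj):
    """Return (converted value, language tag or None) for a literal object."""
    text = (obj.text or "").strip()
    convert = CONVERTERS.get(obj.get("type"), str)  # unknown types stay strings
    return convert(text), obj.get("lang")


typed = ET.fromstring('<object type="xsd:integer">42</object>')
tagged = ET.fromstring('<object type="xsd:string" lang="en">language test</object>')
print(literal_value(typed), literal_value(tagged))
```

Falling back to a plain string for unknown datatypes keeps the consumer working when a resultset uses a datatype it has not seen before.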

==Flexibility of this XML Data Structure==
This XML data structure is thus flexible enough to describe any relation within an RDF graph produced by an OSF Web Service.

The advantage of re-using the triple assertions of the RDF data model with types and URIs is that a data consumer can easily handle the data produced by any Web service, even without knowing the type of the subjects, predicates and objects returned by that Web service. The data consumer can always say: I have this thing that refers to this other thing with this given predicate. The data consumer can manipulate results in some ways even if it doesn't know much or anything about the types of those things.

This consistent abstraction is helpful: even if the Web services evolve and change over time, the data consumers of these Web services will still be able to handle the things they know and simply discard any newly added types they do not know, all without having to change anything in the procedures that manage the resultsets returned by these Web services.
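The generic handling described above can be sketched as a consumer that flattens any resultset into triples and keeps only the predicates it knows. The ex: vocabulary, the URIs, the KNOWN_PREDICATES set and the triples helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical resultset containing one predicate the consumer does not know.
DOC = """<resultset>
  <subject type="ex:Document" uri="http://example.org/doc/1">
    <predicate type="ex:title">
      <object type="xsd:string">A short history</object>
    </predicate>
    <predicate type="ex:addedLater">
      <object type="ex:NewKind" uri="http://example.org/new/7"/>
    </predicate>
  </subject>
</resultset>"""


def triples(root):
    """Yield (subject URI, predicate type, object URI or literal text)."""
    for subject in root.findall("subject"):
        for predicate in subject.findall("predicate"):
            for obj in predicate.findall("object"):
                value = obj.get("uri") or (obj.text or "").strip()
                yield subject.get("uri"), predicate.get("type"), value


KNOWN_PREDICATES = {"ex:title"}  # what this particular consumer understands

all_triples = list(triples(ET.fromstring(DOC)))
handled = [t for t in all_triples if t[1] in KNOWN_PREDICATES]
print(handled)
```

The traversal does not depend on any type, so a predicate added later changes which triples are kept but not the code that reads the resultset.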