Data Validator Tool: Validating Dataset Content Based On Ontology Descriptions

From OSF Wiki

Introduction

The Data Validator Tool (DVT) is a command line tool used to validate data records indexed in OSF datasets against the descriptions of the loaded ontologies. Depending on how the ontologies are described, the DVT validates the content of the datasets and reports possible issues. The DVT is a post-indexation validation mechanism: it doesn't enforce any data validation at indexation time, and only reports validation issues when it is run against OSF. Once validation errors are detected, different mechanisms have to be put in place to fix these issues.

This document explains how the DVT should be used. It also explains how the current data validation tests work and how the reported errors should be interpreted. Finally, it explains how the ontologies should be described, using the Protégé ontology editor, in order to take full advantage of the DVT validation tests.

Installation & Configuration

All the installation & configuration steps are directly available on the Data Validator Tool page.

Command Line

Using the DVT command line tool is pretty easy. Its command line options and parameters are:

Usage: dvt [OPTIONS]
Usage examples:
    Validate data: dvt -v
Options:
--output-xml="[PATH]"                 Output the validation reports in a file specified
                                      by the path in XML format.
--output-json="[PATH]"                Output the validation reports in a file specified
                                      by the path in JSON format.
--allocated-memory="M"                Specifies the number of Mb of memory allocated to the DVT
                                      The number of Mb should be specified in this parameter
-v                                    Run all the data validation tests
-s                                    Silent. Do not output anything to the shell.
-f, --fix                             Tries to automatically fix a validation test that fails
                                      Note: not all checks support this option
-h, --help                            Show this help section

Let's take a deeper look into each of these parameters. Note that any parameter can be used with any other parameter.

  1. -v
    • If you specify this parameter, the validation process starts. If you don't specify it, no validation is performed by the DVT
  2. -s
    • If you specify this parameter, nothing is output to the shell terminal. This is usually used when an external tool performs automated validation using the DVT
  3. -f, --fix
    • If you specify this parameter, you are asking each validation test to try to automatically fix the validation errors it finds. This option is not supported by all validation checks, so only the ones that support it will try to fix the validation issues.
  4. -h, --help
    • If you specify this parameter, this help section is output to the shell terminal
  5. --output-xml="[PATH]"
    • If you specify this parameter, all the tests, warnings and errors are written into an XML file, as specified by the [PATH] value. Make sure that the user that runs the DVT has write permission on the specified PATH. This is normally used to log validation tests
  6. --output-json="[PATH]"
    • If you specify this parameter, all the tests, warnings and errors are written into a JSON file, as specified by the [PATH] value. Make sure that the user that runs the DVT has write permission on the specified PATH. This is normally used to log validation tests
  7. --allocated-memory="M"
    • If you specify this parameter, the specified amount of memory (in Mb) is used by the DVT to run the tests. Depending on the size of the datasets and the tests defined within the ontologies, more memory may be required by the DVT to work normally

Here are a few command line examples for using the DVT:

# Basic DVT usage
dvt -v
 
# Basic DVT usage that specifies a certain amount of memory to use. This allocates 512 Mb of memory for DVT
dvt -v --allocated-memory="512"
 
# Specifying an XML file where to log all the errors and warnings. Things are still displayed in the terminal
dvt -v --output-xml="/tmp/dvt_validation_checks_log.xml" --allocated-memory="512"
 
# Specifying an XML and a JSON file where to log all the errors and warnings. Things are still displayed in the terminal
dvt -v --output-xml="/tmp/dvt_validation_checks_log.xml" --output-json="/tmp/dvt_validation_checks_log.json" --allocated-memory="512"
 
# Specifying that we don't want anything, anymore, displayed in the terminal
dvt -v -s --output-xml="/tmp/dvt_validation_checks_log.xml" --output-json="/tmp/dvt_validation_checks_log.json" --allocated-memory="512"
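
When the DVT is driven by an external tool, the command line can be assembled programmatically. The following is a minimal Python sketch, not part of the DVT itself; the build_dvt_command helper and its keyword names are hypothetical, and only the options listed in the help text above are used.

```python
import shlex

def build_dvt_command(output_xml=None, output_json=None,
                      allocated_memory=None, silent=False, fix=False):
    """Assemble a dvt argument list from the documented options.

    Hypothetical helper: the function and its keyword names are
    illustrative, only the command line flags come from the help text.
    """
    argv = ["dvt", "-v"]  # -v starts the validation process
    if silent:
        argv.append("-s")
    if fix:
        argv.append("-f")
    if output_xml:
        argv.append(f"--output-xml={output_xml}")
    if output_json:
        argv.append(f"--output-json={output_json}")
    if allocated_memory is not None:
        # The value is a number of Mb
        argv.append(f"--allocated-memory={int(allocated_memory)}")
    return argv

argv = build_dvt_command(
    output_xml="/tmp/dvt_validation_checks_log.xml",
    allocated_memory=512,
    silent=True,
)
# Show the equivalent shell command
print(shlex.join(argv))
```

Assuming the dvt executable is on the PATH, the resulting list could be passed to subprocess.run(argv) to start the validation.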

Automatic Validation Error Fixing

Some of the validation check procedures support the automatic error fixing command line option. If a check supports that option, then it will run an internal procedure to try to fix the validation errors itself. Be careful to read the "Automatic Validation Error Fixing" section of each test to see how its validation errors get fixed.

When a validation error gets fixed, it means that the description of the record that failed the validation test is modified such that the test doesn't fail again. All the automatic validation error fixing procedures use the CRUD: Update web service endpoint, and specify that a revision needs to be created for the updated record. This means that all the records modified by one of the validation procedures get revisioned, so all the fixes can be rolled back using the Revision: Update web service endpoint.

Finally, all fixes are recorded into the log file if the --output-xml or the --output-json command line options have been specified for the DVT command.

Data Validation Tests

Overview

The DVT includes a series of data validation tests that can be used to test the completeness and consistency of instance records indexed in OSF. If a test fails for a given record, then the error will be reported, explained and logged depending on the DVT parameters that have been specified. These validation tests cover the most common data validation use cases. A test can be used in different ways to validate different things within the dataset. Each of these ways to define a test is explained below within each test description.

In this section, the tests are introduced. Then, a description of the way the test works is provided. If some more technical background is required, then a specific section calling this out follows. A section also explains the different ways you can define ontologies and the impacts on that test. Finally an explanation of how the reported errors and warnings should be interpreted is provided.

URI Existence Validation

Introduction

The URI Existence Validation test is used to check whether the referenced URIs exist within OSF or not. If a record references an undeclared record (because of a missing URI), then an error will be reported.

How it Works

This test gets the list of all records that are referenced by other records but that are not currently defined in OSF. For each of these undefined records, an error will be returned.

It checks the values of all the triples with the exception of the rdf:type property. This means that all the triples where rdf:type is the predicate will be ignored by this test.

Technical Explanation

In RDF, everything is a triple. A triple is a 3-tuple of the form: <subject> <predicate> <object>. Every record is described by one or more of these triples. The <subject> is the record being described. The <predicate> is a property/predicate/attribute of that record. The <object> is the value of a property.

In RDF, the <object> can loosely be one of two things:

  1. a Literal value
  2. a reference to another record (URI)

What the URI Existence Validation test does is to get the complete list of all the <object> values which are references to other records (URIs). Then, once this list is compiled, the test validates that the referenced URIs are described in OSF, in the same or in another dataset/ontology. This heuristic has been implemented as a SPARQL query that is used internally.
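
The logic described above can be illustrated with a small Python sketch over in-memory triples. This is not the SPARQL query used internally by the DVT; the find_undefined_uris function, the 4-tuple layout of the triples and the sample person URI are assumptions made for illustration only.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def find_undefined_uris(triples, defined_uris):
    """Return the (subject, predicate, object) triples whose object is a
    URI reference that is not described in any dataset or ontology.

    Each input triple is (subject, predicate, object, object_is_uri).
    This layout is hypothetical; the DVT runs a SPARQL query instead.
    """
    errors = []
    for subject, predicate, obj, object_is_uri in triples:
        if not object_is_uri:
            continue  # literal values are not references
        if predicate == RDF_TYPE:
            continue  # rdf:type triples are ignored by this test
        if obj not in defined_uris:
            errors.append((subject, predicate, obj))
    return errors

doc = "http://localhost/datasets/documents/26765"
triples = [
    (doc, "http://purl.org/dc/terms/title", "Some title", False),
    (doc, "http://purl.org/dc/terms/subject",
     "http://purl.org/ontology/muni#incident_monitoring", True),
    (doc, "http://purl.org/dc/terms/creator",
     "http://localhost/datasets/people/1", True),
    (doc, RDF_TYPE, "http://purl.org/ontology/bibo/Document", True),
]
# URIs that are described somewhere in the (hypothetical) instance
defined_uris = {doc, "http://localhost/datasets/people/1"}

undefined = find_undefined_uris(triples, defined_uris)
```

In this sample, only the dcterms:subject triple is returned: the literal title and the rdf:type triple are skipped, and the creator URI is described.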

Automatic Validation Error Fixing

If the -f parameter is specified for a DVT command, then the URI Existence Validation test will try to fix all the validation errors that occurred. The fix applied is to delete, from the dataset, every triple whose value is a URI that doesn't exist in any other dataset or ontology.

However, the DVT uses the Revisioning capabilities of OSF when it does the automatic fixing of errors. This means that it will always be possible to revert changes performed by the DVT by using the revisioning web service endpoints.

Fixing Exceptions

There is one kind of triple that cannot be fixed by this check. If the predicate of a nonexistent value is rdf:type, then this triple won't be fixed. It will be reported to the user interface and in the XML or the JSON logs, but it won't be fixed.

The reason is simple: if we removed the rdf:type associated with a record, we would untype that record unnecessarily. Instead, the issue is reported so that the data maintainers can fix the type by hand, or create the class representing that type in one of the loaded ontologies.
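
As a rough illustration of this exception, the following Python sketch splits a list of dangling triples into those that would be deleted and those that would only be reported. The plan_fixes function is hypothetical; the actual fix goes through the CRUD: Update web service endpoint, which this sketch does not call.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def plan_fixes(dangling_triples):
    """Split (subject, predicate, object) triples that reference a
    nonexistent URI into (to_delete, report_only).

    Hypothetical helper: it only decides what would happen to each
    triple and performs no update on any record.
    """
    to_delete, report_only = [], []
    for triple in dangling_triples:
        _, predicate, _ = triple
        if predicate == RDF_TYPE:
            report_only.append(triple)  # never untype a record
        else:
            to_delete.append(triple)
    return to_delete, report_only

doc = "http://localhost/datasets/documents/26765"
dangling = [
    (doc, "http://purl.org/dc/terms/subject",
     "http://purl.org/ontology/muni#incident_monitoring"),
    (doc, RDF_TYPE, "http://purl.org/ontology/muni#UnknownClass"),
]
to_delete, report_only = plan_fixes(dangling)
```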

Logging Error Fixes

All the fixes are logged into the XML or JSON log files if the --output-xml and/or the --output-json options were specified in the DVT command. This section explains how to interpret the fixes reported in the logs by the URI Existence Validation check.

XML Log Files

Here is the explanation for the meaning of each element of that file:

Element Description
<fixes /> If the validation test supports the -f (fix) parameter, then the <fixes /> element will be populated with all the data that got fixed by the validation test procedure
<fix /> A particular thing that got fixed by the validation test procedure. See each check section to see how this <fix /> element is populated by a particular procedure.
<dataset /> Dataset URI where the triple is located
<subject /> Record URI that is affected by this validation test
<predicate /> Predicate URI that led to the validation error
<object /> Nonexistent record URI that is referenced by the <predicate />

Here is an example of such a (partial) XML log file that includes the fixes reports:

<?xml version="1.0"?>
<checks>
  <check>
    <name>URI Usage Existence Check</name>
    <fixes>
      <fix>
        <dataset>http://localhost/datasets/documents/</dataset>
        <subject>http://localhost/datasets/documents/26765</subject>
        <predicate>http://purl.org/dc/terms/subject</predicate>
        <object>http://purl.org/ontology/muni#incident_monitoring</object>
      </fix>
    </fixes>
  </check>
</checks>
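
Such a log can be read back with a few lines of Python. The sketch below assumes the log contains only the elements documented above (a real log may contain more), and the read_fixes function is a hypothetical helper, not part of the DVT.

```python
import xml.etree.ElementTree as ET

# Sample log restricted to the elements documented above
SAMPLE_LOG = """<?xml version="1.0"?>
<checks>
  <check>
    <name>URI Usage Existence Check</name>
    <fixes>
      <fix>
        <dataset>http://localhost/datasets/documents/</dataset>
        <subject>http://localhost/datasets/documents/26765</subject>
        <predicate>http://purl.org/dc/terms/subject</predicate>
        <object>http://purl.org/ontology/muni#incident_monitoring</object>
      </fix>
    </fixes>
  </check>
</checks>
"""

def read_fixes(xml_text):
    """Return one dict per <fix /> element found in the log."""
    root = ET.fromstring(xml_text)
    fixes = []
    for check in root.iter("check"):
        name = check.findtext("name", default="")
        for fix in check.iterfind("fixes/fix"):
            fixes.append({
                "check": name,
                "dataset": fix.findtext("dataset"),
                "subject": fix.findtext("subject"),
                "predicate": fix.findtext("predicate"),
                "object": fix.findtext("object"),
            })
    return fixes

fixes = read_fixes(SAMPLE_LOG)
for fix in fixes:
    print(fix["subject"], fix["predicate"], fix["object"])
```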

JSON Log Files

Here is the explanation for the meaning of each element of that file:

Element Description
"fixes": [{}] If the validation test supports the -f (fix) parameter, then the "fixes" array will be populated with all the data that got fixed by the validation test procedure
"dataset": "" Dataset URI where the triple is located
"subject": "" Record URI that is affected by this validation test
"predicate": "" Predicate URI that led to the validation error
"object": "" Nonexistent record URI that is referenced by the "predicate"

Here is an example of such a (partial) JSON log file that includes the fixes reports:

{
    "checks": [
        {
            "name": "URI Usage Existence Check",
            "fixes": [
                {
                    "dataset": "http://localhost/datasets/documents/",
                    "subject": "http://localhost/datasets/documents/18106",
                    "predicate": "http://purl.org/dc/terms/subject",
                    "object": "http://purl.org/ontology/muni#incident_monitoring"
                }
            ]
        }
    ]
}
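
The JSON log can be processed in the same way. The following Python sketch groups the logged fixes by dataset; it assumes the log has only the keys documented above, and the fixes_by_dataset function is hypothetical.

```python
import json
from collections import defaultdict

# Sample log restricted to the keys documented above
SAMPLE_LOG = """
{
    "checks": [
        {
            "name": "URI Usage Existence Check",
            "fixes": [
                {
                    "dataset": "http://localhost/datasets/documents/",
                    "subject": "http://localhost/datasets/documents/18106",
                    "predicate": "http://purl.org/dc/terms/subject",
                    "object": "http://purl.org/ontology/muni#incident_monitoring"
                }
            ]
        }
    ]
}
"""

def fixes_by_dataset(json_text):
    """Group the logged fixes by the dataset they were applied to."""
    log = json.loads(json_text)
    grouped = defaultdict(list)
    for check in log.get("checks", []):
        # A check that doesn't support -f may have no "fixes" key
        for fix in check.get("fixes", []):
            grouped[fix["dataset"]].append(fix)
    return dict(grouped)

grouped = fixes_by_dataset(SAMPLE_LOG)
for dataset, fixes in grouped.items():
    print(dataset, len(fixes))
```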

Errors & Warnings

Errors
URI-EXISTENCE-100
Description This error is returned when a URI is used as an <object> reference but is not currently defined in any dataset accessible by the DVT. This means that an "undefined" URI has been referenced by another record within the datasets.
Fields
Field Description
uri This element refers to the URI that is being used as an <object> reference but is not currently defined in any dataset accessible by the DVT
Warnings
URI-EXISTENCE-50
Description This warning is returned when the test couldn't check if referenced URIs exist in the OSF instance. This means that the SPARQL query failed to execute.
Fields No additional fields


URI-EXISTENCE-51
Description We couldn't get the list of affected records from the OSF instance.
Fields No additional fields


URI-EXISTENCE-52
Description We couldn't read the description of an affected record from the OSF instance.
Fields No additional fields


URI-EXISTENCE-53
Description We couldn't update the description of an affected record from the OSF instance
Fields No additional fields

Defined Field Type Validation

Introduction

The Defined Field Type Validation test is used to see if the properties used to define the content of the dataset have drupal:fieldType defined for each of them in any ontology. This check is normally used if you are operating an OSF for Drupal website. It will tell you if you are missing some descriptions that have an impact on the Field Type that will be used in Drupal when you perform the OSF Entities mapping process.

How it Works

This test lists all the properties used to define things in your datasets and then checks if the drupal:fieldType annotation property is used to describe these properties in any loaded ontology.


Errors & Warnings

Errors
FIELDTYPE-DEFINED-100
Description This error is returned when a property is used to describe records in the datasets but is not defined with drupal:fieldType in any loaded ontology.
Fields
Field Description
property This element refers to the URI of the property that is not defined
Warnings
FIELDTYPE-DEFINED-50
Description This warning is returned when the test couldn't check if properties are defined in the OSF instance. This means that the SPARQL query failed to execute.
Fields No additional fields


Defined Properties Validation

Introduction

The Defined Properties Validation test is used to see if the properties used to define the content of the dataset are defined in any ontology. If a record uses an undefined property, then an error will be reported.

How it Works

This test gets the list of all the properties that are being used, and checks if they are defined in any ontology.
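
Conceptually, this check is a set difference between the properties used in the datasets and the properties defined in the ontologies. The Python sketch below illustrates the idea over in-memory data; the undefined_properties function and the foo: URIs are hypothetical, and the real test queries the OSF instance instead.

```python
def undefined_properties(triples, ontology_properties):
    """Return one PROPERTIES-DEFINED-100 style error per property that
    is used in (subject, predicate, object) triples but is not defined
    in any ontology.

    Illustrative only: the DVT gets both lists from the OSF instance.
    """
    used = {predicate for _, predicate, _ in triples}
    missing = sorted(used - set(ontology_properties))
    return [{"id": "PROPERTIES-DEFINED-100", "property": p} for p in missing]

triples = [
    ("http://localhost/datasets/documents/1",
     "http://purl.org/dc/terms/subject", "http://example.org/topic/1"),
    ("http://localhost/datasets/documents/1",
     "http://example.org/foo#undeclared", "some value"),
]
ontology_properties = {"http://purl.org/dc/terms/subject"}

errors = undefined_properties(triples, ontology_properties)
```

The same approach applies to the Defined Classes Validation test, using the rdf:type values of the records instead of the predicates.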


Errors & Warnings

Errors
PROPERTIES-DEFINED-100
Description This error is returned when a property is used to describe records in the datasets but is not defined in any ontology.
Fields
Field Description
property This element refers to the URI of the property that is not defined
Warnings
PROPERTIES-DEFINED-50
Description This warning is returned when the test couldn't check if properties are defined in the OSF instance. This means that the SPARQL query failed to execute.
Fields No additional fields

Defined Classes Validation

Introduction

The Defined Classes Validation test is used to see if the classes used to define the content of the dataset are defined in any ontology. If a record uses an undefined class, then an error will be reported.

How it Works

This test gets the list of all the classes that are being used, and checks if they are defined in any ontology.


Errors & Warnings

Errors
CLASSES-DEFINED-100
Description This error is returned when a class is used to describe records in the datasets but is not defined in any ontology.
Fields
Field Description
class This element refers to the URI of the class that is not defined
Warnings
CLASSES-DEFINED-50
Description This warning is returned when the test couldn't check if classes are defined in the OSF instance. This means that the SPARQL query failed to execute.
Fields No additional fields

Property Validation

Properties, the middle part of an RDF triple, may be one of three kinds: 1) datatype properties, for which the object is a value that conforms to a specific datatype; 2) object properties, for which the object is another instance denoted by a URI; or 3) annotation properties, for which the object is a literal (string) value. Both datatype and object properties may be further defined using the concepts of domain and range, as described below. Annotation properties do not have domains or ranges. This section describes how the DVT validates against domain and range.

Datatype Property Datatype Validation

Introduction

The Datatype Property Datatype Validation test checks whether the datatypes defined for all used datatype properties have been respected and are valid. With this test, we make sure that all the expected value types have been respected when indexed into OSF.

This test has two modes:

  1. loose - if this mode is used for this test, then the checks only rely on the datatype defined in the ontologies to perform the validation.
  2. strict - if this mode is used, then the test will also make sure that the datatype returned by the data store is the same as the one defined in the ontology

By default, the strict mode is used by that test. If you want this test to use the loose mode, then edit the dvt.ini file and append the ?mode=loose string to the end of the test's configuration line such as:

  • checks[] = "StructuredDynamics\osf\validator\checks\CheckDatatypePropertiesDatatype?mode=loose"

How it Works

The heuristic used by this check is as follows:

  1. Get the list of all the properties that have a non-URI value and that have a range defined for them in one of the loaded ontologies
    1. For each datatype property we get the list of all the values. At this step, we will have two pieces of information about the value. We will have the actual textual value, and the datatype of that value as defined in the triple store.
      1. For each value we make sure that the datatype defined for that value in the triple store is the same as the one defined in the ontology
        1. If the value's defined datatype is the same as the one defined in the ontology, then we validate the actual value according to internal XSD and RDFS data validation procedures
          1. If the actual value is not valid according to these internal validation tests, we return a DATATYPE-PROPERTIES-DATATYPE-101 error
        2. If the value's defined datatype is not the same as the one defined in the ontology, we return a DATATYPE-PROPERTIES-DATATYPE-100 error

Notes regarding this heuristic:

  1. If no range is defined for a property, then its range is considered "rdfs:Literal", which means that no specific datatype is defined for the value, and that any value can be used as a value of this property.
  2. Even if a value is defined as xsd:token in the triple store, it doesn't mean that the value is actually a valid xsd:token, since the triple store won't validate the value against this datatype but will only tag the value as being of that type. This is why we have to perform the test at step 1.1.1.1
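
The heuristic above can be sketched in Python as follows. This is an illustration under stated assumptions, not the DVT implementation: the check_value function is hypothetical, only two datatypes get a lexical check here, and the handling of the loose mode (skipping the comparison with the triple store datatype) is one reading of the mode descriptions above.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"
RDFS_LITERAL = "http://www.w3.org/2000/01/rdf-schema#Literal"

# Tiny illustrative subset of lexical checks
VALIDATORS = {
    XSD + "boolean": lambda v: v in {"true", "false", "1", "0"},
    XSD + "integer": lambda v: re.fullmatch(r"[+-]?[0-9]+", v) is not None,
}

def check_value(prop, value, value_datatype, ranges, mode="strict"):
    """Return an error dict for one indexed value, or None if it passes.

    ranges maps a property URI to the range defined in the ontologies.
    """
    expected = ranges.get(prop, RDFS_LITERAL)
    if expected == RDFS_LITERAL:
        return None  # no specific datatype: any value is accepted
    if mode == "strict" and value_datatype != expected:
        return {"id": "DATATYPE-PROPERTIES-DATATYPE-100",
                "datatypeProperty": prop,
                "expectedDatatype": expected,
                "valueDatatype": value_datatype,
                "value": value}
    validator = VALIDATORS.get(expected)
    if validator is not None and not validator(value):
        return {"id": "DATATYPE-PROPERTIES-DATATYPE-101",
                "datatypeProperty": prop,
                "expectedDatatype": expected,
                "invalidValue": value}
    return None

COUNT = "http://example.org/foo#count"  # hypothetical property
ranges = {COUNT: XSD + "integer"}

mismatch = check_value(COUNT, "12", XSD + "string", ranges)
invalid = check_value(COUNT, "twelve", XSD + "integer", ranges)
valid = check_value(COUNT, "12", XSD + "integer", ranges)
loose = check_value(COUNT, "12", XSD + "string", ranges, mode="loose")
```

Here mismatch is a DATATYPE-PROPERTIES-DATATYPE-100 error, invalid is a DATATYPE-PROPERTIES-DATATYPE-101 error, and both valid and loose are None.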

Technical Explanation

In RDF, everything is a triple. A triple is a 3-tuple of the form: <subject> <predicate> <object>. Every record is described by one or more of these triples. The <subject> is the record being described. The <predicate> is a property/predicate/attribute of that record. The <object> is the value of a property.

OWL is a specification framework that is used to create the ontologies that are used to define the semantics of the properties/predicates/attributes and the types/classes used to describe the instance records indexed in OSF datasets.

When we define a <predicate> in an ontology, each predicate may have at least two different characteristics:

  1. It may have a domain
  2. It may have a range

The domain of a property is the left side of the property. What the domain does is to specify where the <predicate> can be used, which type/kind of <subject> it can be used to describe. That is, the domain for a given property defines valid subject types to which it applies. If a <subject> type is not in the domain of a property, then that property cannot be used to describe that type of <subject>.

The range of a property is the right side of the property. What the range does is to specify the datatype of the value (<object>) of such a <predicate>. That is, the range for a given property defines the valid object types to which it can apply. For example, if we have a foo:lastModified property where the range of that property is xsd:dateTime, then it means that all the instance records that use this foo:lastModified property need to have a value of type xsd:dateTime.

Specifying within an Ontology

For this data validation test to work, the ontologies loaded in OSF have to be properly defined. If no datatype is defined for a property, then the test will consider that its default datatype is rdfs:Literal, which is equivalent to saying that any value can be entered for that property. Otherwise, any datatype specified in any loaded ontology will have a direct impact on this test.

When you edit an ontology in Protégé, you have a series of tabs, one of which is called "Data Properties". This is the tab where all the datatype properties defined in the ontology appear. If you click on any of the datatype properties that appear on the left side of the application, you will see the property's complete description on the right side of the application.

There is one section that is highlighted on the right side section that is of interest for this test, which is the Range section. This is where the range of a property is defined in Protégé. There are 3 buttons related to such a range that interest us particularly:

  • "+" – The add button is used to add a new datatype range to the property
  • "o" – The edit button is used to edit the current datatype range assignment of the property
  • "x" – The remove button is used to remove a datatype range assignment of the property

[Image: Protege datatype property.png]

To add a new datatype to a given property, you have to click the "+" button. When clicked, a list of available datatypes will appear. From that list, you choose the datatype you want to specify for this property and click the "OK" button.

[Image: Protege datatype.png]

Once you have added, modified or removed a datatype assignment on a property, you have to reload the ontology in OSF to have the modification taken into account by the DVT.

Note: This validation test only supports unique range definitions for datatype properties. This means that you should not define more than one range for a given property within your ontology.


Supported Datatypes

This validation test performs additional internal data validation procedures to make sure that the value is valid according to the specified datatype. Here is a list of all the supported datatypes:

  1. xsd:anyURI
  2. xsd:base64Binary
  3. xsd:boolean
  4. xsd:byte
  5. xsd:dateTime
  6. xsd:dateTimeStamp
  7. xsd:decimal
  8. xsd:double
  9. xsd:float
  10. xsd:hexBinary
  11. xsd:int
  12. xsd:integer
  13. xsd:language
  14. xsd:long
  15. xsd:Name
  16. xsd:NCName
  17. xsd:negativeInteger
  18. xsd:NMTOKEN
  19. xsd:nonNegativeInteger
  20. xsd:nonPositiveInteger
  21. xsd:normalizedString
  22. xsd:positiveInteger
  23. xsd:short
  24. xsd:string
  25. xsd:token
  26. xsd:unsignedByte
  27. xsd:unsignedInt
  28. xsd:unsignedLong
  29. xsd:unsignedShort
  30. rdfs:Literal
  31. rdf:PlainLiteral
  32. rdf:XMLLiteral

Errors & Warnings

Errors
DATATYPE-PROPERTIES-DATATYPE-100
Description This error is returned when the datatype specified in the triple store and the range specified in the ontology for that property are different.
Fields
Field Description
datatypeProperty The datatype property URI that raised the validation error
expectedDatatype The expected datatype for that property. This is the expected datatype as defined in the ontology.
valueDatatype The datatype of the value as specified into the triple store
value The actual value string that is indexed into the triple store
affectedRecord A list of the URIs of all the records that are affected by this validation error (can be extensive)
DATATYPE-PROPERTIES-DATATYPE-101
Description This error is returned when the datatype specified in the triple store and the range specified in the ontology for that property are the same, but the actual indexed value is invalid according to the internal datatype validation procedures.
Fields
Field Description
datatypeProperty The datatype property URI that raised the validation error
expectedDatatype The expected datatype for that property. This is the expected datatype as defined in the ontology.
invalidValue The actual (invalid) value string that is indexed into the triple store
affectedRecord A list of the URIs of all the records that are affected by this validation error (can be extensive)
Warnings
DATATYPE-PROPERTIES-DATATYPE-50
Description This warning is returned when a datatype property is being used, but no range is defined for it in any loaded ontology. No immediate action is required when this warning is sent, but it shows areas where the ontologies may be updated/improved.
Fields
Field Description
datatypeProperty The datatype property URI that raised the validation warning


DATATYPE-PROPERTIES-DATATYPE-51
Description This warning is returned when we couldn't get the list of datatype properties from the OSF instance. The SPARQL query failed in some way.
Fields No additional fields


DATATYPE-PROPERTIES-DATATYPE-52
Description This warning is returned when we couldn't get the list of values for a specific property
Fields No additional fields

Object & Datatype Property Domain Validation

Introduction

The Object & Datatype Property Domain Validation test checks whether all the properties are used to describe the proper instance records currently indexed in OSF, as defined in the loaded ontologies. Not all properties can be used to describe all types of instance records, so this test makes sure that each property has been used to describe the proper types of instance records.

How it Works

The heuristic used by this check is as follows:

  1. Get the list of all the properties that are used to describe any record within OSF
    1. For each property we get the list of all the distinct types of all the records that use this property.
      1. For each type we make sure that the type belongs to the domain defined for this property in the loaded ontologies
        1. If the type of one of the records doesn't belong to the domain of the property as described in the ontologies, then an OBJECT-DATATYPE-PROPERTIES-DOMAIN-100 error will be returned

Notes regarding this heuristic:

  1. If no domain is defined for a property, then its domain is considered "owl:Thing", which means that any type of instance record can use this property
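
The heuristic above can be sketched in Python as follows. The check_domain function and the foo: URIs are hypothetical, and the way super types are taken into account (a type belongs to the domain if it, or one of its super types, is the domain class) is an assumption based on the typeSuperTypes field reported with the error.

```python
OWL_THING = "http://www.w3.org/2002/07/owl#Thing"

def check_domain(prop, record_types, domains, super_types):
    """Return one error per record type that is outside the domain
    of prop.

    domains maps a property URI to its single domain class.
    super_types maps a class URI to the list of its super types.
    """
    domain = domains.get(prop, OWL_THING)
    if domain == OWL_THING:
        return []  # any type of instance record can use the property
    errors = []
    for rtype in sorted(record_types):
        supers = list(super_types.get(rtype, []))
        # Assumption: a type is valid if it or a super type is the domain
        if domain != rtype and domain not in supers:
            errors.append({"id": "OBJECT-DATATYPE-PROPERTIES-DOMAIN-100",
                           "property": prop,
                           "definedDomain": domain,
                           "type": rtype,
                           "typeSuperTypes": supers})
    return errors

FOO = "http://example.org/foo#"  # hypothetical namespace
domains = {FOO + "lastModified": FOO + "Document"}
super_types = {FOO + "Report": [FOO + "Document"], FOO + "Person": []}

errors = check_domain(FOO + "lastModified",
                      {FOO + "Report", FOO + "Person"},
                      domains, super_types)
```

In this sample, foo:Report passes because foo:Document is one of its super types, while foo:Person raises an error.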

Technical Explanation

In RDF, everything is a triple. A triple is a 3-tuple of the form: <subject> <predicate> <object>. Every record is described by one or more of these triples. The <subject> is the record being described. The <predicate> is a property/predicate/attribute of that record. The <object> is the value of a property.

OWL is a specification framework that is used to create the ontologies that are used to define the semantics of the properties/predicates/attributes and the types/classes used to describe the instance records indexed in OSF datasets.

When we define a <predicate> in an ontology, each predicate may have at least two different characteristics:

  1. It may have a domain
  2. It may have a range

The domain of a property is the left side of the property. What the domain does is to specify where the <predicate> can be used, which type/kind of <subject> it can be used to describe. If a <subject> type is not in the domain of a property, then that property cannot be used to describe that type of <subject>.

The range of a property is the right side of the property. What the range does is to specify the datatype of the value (<object>) of such a <predicate>. For example, if we have a foo:lastModified property where the range of that property is xsd:dateTime, then it means that all the instance records that use this foo:lastModified property need to have a value of type xsd:dateTime.

Specifying within an Ontology

For this data validation test to work, the ontologies loaded in OSF have to be properly defined. If no domain is defined for a property, then the test will consider that its default domain is owl:Thing, which is equivalent to saying that the property can be used to describe any type of instance record. Otherwise, any domain specified in any loaded ontology will have a direct impact on this test.

When you edit an ontology in Protégé, you have a series of tabs, one of which is called "Object Properties" and another "Data Properties". These are the tabs where all the object and datatype properties defined in the ontology appear. If you click on any of the properties that appear on the left side of the application, you will see the property's complete description on the right side of the application.

Note that the following explanations are the same for the object, or the datatype properties sections. However, the current example is based on the "Object Properties" tab.

There is one section that is highlighted on the right side section that is of interest for this test, which is the Domain section. This is where the domain of a property is defined in Protégé. There are 3 buttons related to such a domain that interest us particularly:

  • "+" – The add button is used to add a new domain to the property
  • "o" – The edit button is used to edit the current domain assignment of the property
  • "x" – The remove button is used to remove a domain assignment of the property

[Image: Protege object property domain.png]

To add a new domain to a given property, you have to click the "+" button. When clicked, a list of available domain types will appear under the Class hierarchy tab. From that list, you choose the type (class) you want to specify for this property and click the "OK" button.

[Image: Protege object property domain selection.png]

Once you have added, modified or removed a domain assignment on a property, you have to reload the ontology in OSF to have the modification taken into account by the DVT.

Note: This validation test only supports unique domain definitions for properties. This means that you shouldn't define more than one domain for a given property within your ontology.


Errors & Warnings

Errors
OBJECT-DATATYPE-PROPERTIES-DOMAIN-100
Description This error is returned when the type of a record is not part of the domain of a property used to describe the record.
Fields
Field Description
property The property (datatype or object) URI that raised the validation error
definedDomain The domain currently defined for that property in the loaded ontologies
type Type of records where the property is used but that doesn't belong to the domain of the property
typeSuperTypes All the super types of that type. This additional information can be used for debugging/fixing purposes.
affectedRecord A list of the URIs of all the records that are affected by this validation error (can be extensive)
Warnings
OBJECT-DATATYPE-PROPERTIES-DOMAIN-50
Description This warning is returned when a property is being used, but no domain is defined for it in any loaded ontology. No immediate action is required when this warning is sent, but the warning shows areas where the ontologies may be updated/improved.
Fields
Field Description
property The property (datatype or object) URI that raised the validation error


OBJECT-DATATYPE-PROPERTIES-DOMAIN-51
Description This warning is returned when the test couldn't get the list of super properties of a given property from OSF. This means that the Ontology: Read query failed in some way.

Fields
Field Description
property The property (datatype or object) URI for which we failed to get its super properties.


OBJECT-DATATYPE-PROPERTIES-DOMAIN-52
Description This warning is returned when we couldn't find the ontology where a specific type has been defined
Fields No additional fields


OBJECT-DATATYPE-PROPERTIES-DOMAIN-53
Description This warning is returned when we couldn't get the list of available datatype and object properties. This means that the SPARQL query failed in some way.
Fields No additional fields


OBJECT-DATATYPE-PROPERTIES-DOMAIN-54
Description This warning is returned when we couldn't get the list of available datatype and object properties. This means that the SPARQL query failed in some way.
Fields No additional fields


OBJECT-DATATYPE-PROPERTIES-DOMAIN-55
Description This warning is returned when we couldn't get the list of affected records by an error that got raised. This means that the SPARQL query failed in some way.
Fields No additional fields

Object Property Range Validation

Introduction

The Object Property Range Validation test checks whether the ranges of all the object properties have been respected in OSF. With this test, we make sure that every time an object property is used to describe a record, it references a valid record. For example, if we have a foo:partner object property that has a range of foo:Partner, then we make sure that every time the property is used, it refers to a record of type foo:Partner. If that is not the case, errors are reported.

How it Works

The heuristic used by this check is as follows:

  1. Get the list of all the object properties that are used to describe records
    1. For each object property, we get the list of all its values (URIs of referenced records)
      1. For each URI, we make sure that the type of the referenced record complies with the range defined for that property in the loaded ontologies. This check is performed using inference
        1. If the type of the URI is not part of the range of the property, then an error OBJECT-PROPERTIES-RANGE-100 is reported

Notes regarding this heuristic:

  1. If no range is defined for an object property, then its range is considered to be "owl:Thing", which means that any type of record can be referenced by that property.
  2. The range validation check uses inference. This means that if an object property whose range is defined as foo:Document references a foo:ExternalPage record, then this triple will be valid since foo:ExternalPage is a sub-class-of foo:Document.
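
The heuristic and the two notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DVT implementation: the in-memory dictionaries stand in for what the DVT would query from OSF, and the names foo:source, ex:acme and ex:page1 are hypothetical. The keys of the error dictionary reuse the field names documented for OBJECT-PROPERTIES-RANGE-100.

```python
# Hypothetical in-memory data; a real run would query OSF instead.
SUBCLASS_OF = {
    "foo:ExternalPage": {"foo:Document"},
    "foo:Document": {"owl:Thing"},
    "foo:Partner": {"owl:Thing"},
}
RANGES = {"foo:partner": "foo:Partner", "foo:source": "foo:Document"}
RECORD_TYPES = {
    "ex:acme": {"foo:Partner"},
    "ex:page1": {"foo:ExternalPage"},
}


def super_types(cls):
    """Return cls plus all of its ancestors (the inference step)."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def check_range(triples):
    """Return one error per triple whose value falls outside the range."""
    errors = []
    for subject, prop, value in triples:
        # Note 1: a property without a declared range defaults to owl:Thing.
        expected = RANGES.get(prop, "owl:Thing")
        types = RECORD_TYPES.get(value, set())
        # Note 2: the declared types are expanded with their super types.
        inferred = set().union(*(super_types(t) for t in types))
        if expected != "owl:Thing" and expected not in inferred:
            errors.append({
                "id": "OBJECT-PROPERTIES-RANGE-100",
                "objectProperty": prop,
                "definedRange": expected,
                "value": value,
                "valueTypes": sorted(types),
                "valueSuperTypes": sorted(inferred - types),
                "affectedRecord": subject,
            })
    return errors


triples = [
    ("ex:r1", "foo:partner", "ex:acme"),   # value is a foo:Partner
    ("ex:r2", "foo:partner", "ex:page1"),  # value is not a foo:Partner
    ("ex:r3", "foo:source", "ex:page1"),   # valid through inference
]
for error in check_range(triples):
    print(error["id"], error["affectedRecord"], error["value"])
```

Only the second triple is reported: the third one is accepted because foo:ExternalPage is a sub-class-of foo:Document.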

Technical Explanation

In RDF, everything is a triple. A triple is a 3-tuple of the form: <subject> <predicate> <object>. Every record is described by one or more of these triples. The <subject> is the record being described. The <predicate> is a property/predicate/attribute of that record. The <object> is the value of a property.

OWL is a specification framework that is used to create the ontologies that are used to define the semantics of the properties/predicates/attributes and the types/classes used to describe the instance records indexed in OSF datasets.

When we define a <predicate> in an ontology, each predicate has at least two different characteristics:

  1. It may have a domain
  2. It may have a range

The domain of a property is the left side of the property. The domain specifies where the <predicate> can be used, that is, which type/kind of <subject> it can be used to describe. If a <subject> type is not in the domain of a property, then that property cannot be used to describe that type of <subject>.

The range of a property is the right side of the property. The range specifies the type of the value (<object>) of such a <predicate>: a datatype for a datatype property, or a class for an object property. For example, if we have a foo:partner object property where the range of that property is foo:Partner, then it means that all the instance records that use this foo:partner property need to reference an instance record of type foo:Partner.

Specifying within an Ontology

For this data validation test to work, the ontologies loaded in OSF have to be properly defined. If no range is defined for an object property, then the test will consider that its default range is owl:Thing, which is equivalent to saying that this object property can be used to reference any type of instance record. Otherwise, any range specified for any object property in any loaded ontology will have a direct impact on this test.

When you edit an ontology in Protégé, you have a series of tabs, one of which is called "Object Properties". This is the tab where all the object properties defined in the ontology appear. If you click on any of the object properties that appear on the left side of the application, you will see the property's complete description on the right side of the application.

The section of interest for this test, highlighted on the right side, is the Range section. This is where the range of a property is defined in Protégé. Three buttons related to the range are of particular interest:

  • "+" – The add button is used to add a new range to the object property
  • "o" – The edit button is used to edit the current range assignment of the object property
  • "x" – The remove button is used to remove a range assignment from the object property
Protege datatype property.png
To add a new class to the range of a given property, you have to click the "+" button. When clicked, a list of available classes will then appear. From that list, you choose the class you want to specify for this object property and click the "OK" button.
Protege object property range selection.png
Once you have added, modified, or removed a range assignment on an object property, you have to reload the ontology in OSF for the modification to be taken into account by the DVT.

Note: This validation test only supports a single range definition per object property. This means that you shouldn't define more than one range for a given object property within your ontology.


Errors & Warnings

Errors
OBJECT-PROPERTIES-RANGE-100
Description This error is returned when an object property references a record that has a type which doesn't belong to the range defined for that object property in the loaded ontologies

Fields
Field Description
objectProperty The object property URI that raised the validation error
definedRange The range defined for this object property within the loaded ontologies
value The URI of the referenced record which led to this error
valueTypes All the types of the referenced record
valueSuperTypes All the super types of the referenced record
affectedRecord A list of the URIs of all the records that are affected by this validation error (can be extensive)
Warnings
OBJECT-PROPERTIES-RANGE-50
Description This warning is returned when a property is being used but no range is defined for it in any loaded ontology. No immediate action is required when this warning is sent, but it shows areas where the ontologies may be updated/improved.
Fields
Field Description
property The property (datatype or object) URI that raised the validation error


OBJECT-PROPERTIES-RANGE-51
Description This warning is returned when we couldn't get the list of super classes for one of the types of the referenced record
Fields
Field Description
property The object property URI that raised the validation error


OBJECT-PROPERTIES-RANGE-52
Description This warning is returned when we couldn't find the ontology where a specific type has been defined
Fields No additional fields


OBJECT-PROPERTIES-RANGE-53
Description This warning is returned when we couldn't get the list of object properties from the OSF instance
Fields No additional fields


OBJECT-PROPERTIES-RANGE-54
Description This warning is returned when we couldn't find the range defined for an object property. This means that the Ontology: Read query failed in some way
Fields No additional fields


OBJECT-PROPERTIES-RANGE-55
Description This warning is returned when we couldn't get the list of affected records by an error that got raised. This means that the SPARQL query failed in some way.
Fields No additional fields

OWL Cardinality

The OWL language enables one to define the number of items a given property may have, known as cardinality. This section describes these DVT tests.

OWL Exact Cardinality Restriction Validation

Introduction

The OWL Exact Cardinality Restriction Validation test is used to check if a given property (datatype or object) is used the exact number of times specified to define a record of a specific type. As we will see below, this validation test can also be used to restrict the usage of a property to describe a record of a specific type.

How it Works

The heuristic used by this check is as follows:

  1. Get the list of all the datatype properties which have an exact cardinality restriction defined on them
    1. For each exact cardinality restriction, we get the list of all the records that use that property
      1. For each of them, we make sure that all the records of the type of the restriction use that property exactly the specified number of times
        1. If some record doesn't, then we report an OWL-RESTRICTION-EXACT-100 error for each of them
    2. For each exact cardinality restriction, we get the list of all the records of the type of the restriction that are not using that property
      1. If the exact cardinality is not 0, then we report an OWL-RESTRICTION-EXACT-102 error
    3. For each exact cardinality restriction, we get the list of all the values related by the class & property defined in the restriction
      1. If the value doesn't belong to the datatype specified in the restriction, then we report an OWL-RESTRICTION-EXACT-104 error
  2. Get the list of all the object properties which have an exact cardinality restriction defined on them
    1. For each exact cardinality restriction, we get the list of all the records that use that property
      1. For each of them, we make sure that all the records of the type of the restriction use that property exactly the specified number of times
        1. If some record doesn't, then we report an OWL-RESTRICTION-EXACT-101 error for each of them
    2. For each exact cardinality restriction, we get the list of all the records of the type of the restriction that are not using that property
      1. If the exact cardinality is not 0, then we report an OWL-RESTRICTION-EXACT-103 error
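
As a rough illustration of the counting part of this heuristic for a datatype property, here is a minimal Python sketch. It is not the DVT code: the records dictionary stands in for data that the DVT would read from OSF, the foo:title property and the ex: record URIs are hypothetical, and the sketch ignores class inheritance and the datatype check (error 104). The dictionary keys reuse the documented field names, including their original spelling.

```python
def check_exact_cardinality(records, record_type, prop, expected):
    """Check a datatype property against an exact cardinality restriction.

    records maps a record URI to
    {"type": str, "triples": [(property, value), ...]}.
    """
    errors = []
    for uri, record in records.items():
        if record["type"] != record_type:
            continue  # the restriction only applies to records of its class
        count = sum(1 for used, _ in record["triples"] if used == prop)
        if count == expected:
            continue
        # Property not used at all -> 102; used the wrong number of times -> 100.
        error_id = ("OWL-RESTRICTION-EXACT-102" if count == 0
                    else "OWL-RESTRICTION-EXACT-100")
        errors.append({
            "id": error_id,
            "invalidRecordURI": uri,
            "invalidPropertyURI": prop,
            "numberOfOccurrences": count,
            "exactExpectedNumberOfOccurences": expected,
        })
    return errors


# Hypothetical records: one valid, one without the property, one with two values.
records = {
    "ex:doc1": {"type": "bibo:Document", "triples": [("foo:title", "A")]},
    "ex:doc2": {"type": "bibo:Document", "triples": []},
    "ex:doc3": {"type": "bibo:Document",
                "triples": [("foo:title", "B"), ("foo:title", "C")]},
}
for error in check_exact_cardinality(records, "bibo:Document", "foo:title", 1):
    print(error["id"], error["invalidRecordURI"], error["numberOfOccurrences"])
```

Calling the same function with an expected cardinality of 0 reports every record that uses the property at all, and accepts records that don't use it.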

Technical Explanation

In OWL, there is a concept called Restriction which is used to restrict the association of properties to a particular class extension (a particular set of classes). In other words, this Restriction mechanism is used to state how the properties (object or datatype) should be used, that is, what kind of instance records they should be used to describe. A restriction has three characteristics:

  1. It is applied to a Class
  2. It specifies which object or datatype property is being restricted
  3. It specifies the expected type of values:
    1. The class of things that can be referenced by an object property
    2. The datatype that defines the values of a datatype property

Let's take an example to illustrate an exact cardinality restriction. In OWL+XML code, we would have a restriction like this:

<owl:Class rdf:about="http://purl.org/ontology/bibo/Document">
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="http://purl.org/ontology/foo#status"/>
            <owl:onClass rdf:resource="http://purl.org/ontology/foo#Status"/>
            <owl:qualifiedCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:qualifiedCardinality>
        </owl:Restriction>
    </rdfs:subClassOf>
</owl:Class>

What this restriction means is:

  • When we describe a record of type bibo:Document, we have to have a foo:status property whose value is a record of type foo:Status. We are restricted to exactly one of these triples. If we have none, then we have a validation issue, and if we have more than one, then we have a validation issue as well.

Specifying within an Ontology

For this data validation test to work, the ontologies loaded in OSF have to be properly defined. If no exact cardinality restrictions are defined for any class, then nothing will be tested related to the exact cardinality restrictions. Otherwise, any exact cardinality restriction defined in loaded ontologies will force this test to test all the affected records according to this restriction.

When you edit an ontology in Protégé, you have a series of tabs, one of which is called "Class". This is the tab where all the classes defined in the ontology appear. If you click on any of the classes that appear on the left side of the application, you will see the class's complete description on the right side of the application.

The section of interest for this test, highlighted on the right side, is the SubClass Of section. This is where an exact cardinality restriction is defined in Protégé. Three buttons in this section are of particular interest:

  • "+" – The add button is used to add a new sub-class-of of a given class
  • "o" – The edit button is used to edit a sub-class-of of a given class
  • "x" – The remove button is used to delete a sub-class-of of a given class
Exact cardinality restriction.png

To add a new exact cardinality restriction on an object property, you have to click the "+" button at the right of the SubClass Of section. When you click this button, a new contextual window will appear. From that window, you will have to select the Object restriction creator tab. This tab has all the information you need to create a new exact cardinality restriction for the selected class (in this case, it is the foo:ExternalPage class).

What you have to do once you clicked that tab is:

  1. Select the object property you want to create a restriction for from the left list called Restricted property
  2. Then select the type of records that can be referenced by that property from the Restricted filler right section
  3. Then you have to select the restriction type Exactly (exact cardinality) from the Restriction type section
  4. Finally you have to specify the cardinality number, and click the OK button to save the new restriction
Exact cardinality restriction object.png

To add a new exact cardinality restriction on a datatype property, you have to click the "+" button at the right of the SubClass Of section. When you click this button, a new contextual window will appear. From that window, you will have to select the Data restriction creator tab. This tab has all the information you need to create a new exact cardinality restriction for the selected class (in this case, it is the foo:ExternalPage class).

What you have to do once you clicked that tab is:

  1. Select the datatype property you want to create a restriction for from the left list called Restricted property
  2. Then select the datatype of that property from the Restricted filler right section
  3. Then you have to select the restriction type Exactly (exact cardinality) from the Restriction type section
  4. Finally you have to specify the cardinality number, and click the OK button to save the new restriction
Exact cardinality restriction datatype.png

One thing that is important to understand is that all the sub-classes of a class where an exact cardinality restriction is defined will inherit that same restriction. This is what you can see in the "SubClass Of (Anonymous Ancestor)" section that we highlighted in yellow in the screenshot above. All three restrictions that appear in that section come from restrictions defined in parent classes of the foo:ExternalPage class. In this specific case, the prefLabel restriction has been defined in the owl:Thing class. The language and status restrictions have been defined in the bibo:Document class.

This inheritance behavior is quite important. That way, you can define an exact cardinality restriction on a class in the upper end of an ontology's class hierarchy and have the cardinality applied to all the sub-classes of that super-class. This means that you don't have to specify that restriction on each and every class, but only on the parent one(s).
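
The inheritance behavior can be pictured as a walk up the class hierarchy that collects every restriction found along the way. The following Python sketch is only an illustration: the two dictionaries are hypothetical stand-ins for the loaded ontologies, the property labels follow the example above, and the cardinality values are made up.

```python
# Hypothetical class hierarchy and restrictions (property label, cardinality).
SUBCLASS_OF = {
    "foo:ExternalPage": ["bibo:Document"],
    "bibo:Document": ["owl:Thing"],
}
RESTRICTIONS = {
    "owl:Thing": [("prefLabel", 1)],
    "bibo:Document": [("language", 1), ("status", 1)],
}


def inherited_restrictions(cls, subclass_of, restrictions):
    """Collect the restrictions defined on cls and on all of its ancestors."""
    collected, seen, stack = [], set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a class reachable through several paths is visited once
        seen.add(current)
        collected.extend(restrictions.get(current, []))
        stack.extend(subclass_of.get(current, []))
    return collected


for prop, cardinality in inherited_restrictions(
        "foo:ExternalPage", SUBCLASS_OF, RESTRICTIONS):
    print(prop, cardinality)
```

In this sketch foo:ExternalPage defines no restriction of its own, yet it ends up with the three restrictions declared on bibo:Document and owl:Thing.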

Usage

When you define an exact cardinality restriction on a class, you are trying to accomplish one of the following things:

  1. If the exact cardinality is 0, it means that you want to specify that the usage of that property is prohibited on that class. This means that you cannot use that property to describe records of that type. If you define such a 0 cardinality, it means that you want to use the DVT to make sure that people are not using a certain property to describe a certain type of record
  2. If the exact cardinality is greater than 0, it means that every time you describe a record of that type, the property has to be used exactly that number of times to describe the record

Errors & Warnings

Errors
OWL-RESTRICTION-EXACT-100
Description This error is returned when there is a record of a certain type that is not complying with the exact cardinality restriction for a datatype property as defined in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the exact cardinality restriction
invalidPropertyURI URI of the property which participates into the exact cardinality restriction that raises this validation error
numberOfOccurrences Number of occurrences that this property has been used to describe that record
exactExpectedNumberOfOccurences Expected number of occurrences as defined in the exact cardinality restriction
dataRange The datatype defined for the exact cardinality restriction in the loaded ontologies


OWL-RESTRICTION-EXACT-101
Description This error is returned when there is a record of a certain type that is not complying with the exact cardinality restriction for an object property as defined in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the exact cardinality restriction
invalidPropertyURI URI of the property which participates into the exact cardinality restriction that raises this validation error
numberOfOccurrences Number of occurrences that this property has been used to describe that record
exactExpectedNumberOfOccurences Expected number of occurrences as defined in the exact cardinality restriction
classExpression The class expression defined for the exact cardinality restriction in the loaded ontologies


OWL-RESTRICTION-EXACT-102
Description This error is returned when a record of a certain type doesn't use a datatype property at all, and therefore doesn't comply with an exact cardinality restriction greater than 0 defined for that property in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the exact cardinality restriction
invalidPropertyURI URI of the property which participates into the exact cardinality restriction that raises this validation error
numberOfOccurrences Number of occurrences that this property has been used to describe that record (should be 0)
exactExpectedNumberOfOccurences Expected number of occurrences as defined in the exact cardinality restriction
dataRange The datatype defined for the exact cardinality restriction in the loaded ontologies


OWL-RESTRICTION-EXACT-103
Description This error is returned when a record of a certain type doesn't use an object property at all, and therefore doesn't comply with an exact cardinality restriction greater than 0 defined for that property in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the exact cardinality restriction
invalidPropertyURI URI of the property which participates into the exact cardinality restriction that raises this validation error
numberOfOccurrences Number of occurrences that this property has been used to describe that record
exactExpectedNumberOfOccurences Expected number of occurrences as defined in the exact cardinality restriction
classExpression The class expression defined for the exact cardinality restriction in the loaded ontologies


OWL-RESTRICTION-EXACT-104
Description This error is returned when a record uses a datatype property that has an exact cardinality restriction and one of its values doesn't comply with the datatype specified by the restriction.
Fields
Field Description
datatypeProperty URI of the datatype property that has the restriction
expectedDatatype URI of the expected datatype
invalidValue Value that is not complying with the datatype defined in the restriction
affectedRecord Record affected with this validation check


Warnings
OWL-RESTRICTION-EXACT-50
Description This warning is returned when we couldn't get the cardinality restriction on the datatype property from the OSF instance. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-EXACT-51
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-EXACT-52
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-EXACT-53
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-EXACT-54
Description This warning is returned when we couldn't get the cardinality restriction on the object property from the OSF instance. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-EXACT-55
Description This warning is returned when we couldn't get the list of values for a given datatype property
Fields No additional fields specified

OWL Maximum Cardinality Restriction Validation

The OWL Maximum Cardinality Restriction Validation test is used to check that a given property (datatype or object) is used no more than the specified maximum number of times to define a record of a specific type.

How it Works

The heuristic used by this check is as follows:

  1. Get the list of all the datatype properties which have a maximum cardinality restriction defined on them
    1. For each maximum cardinality restriction, we get the list of all the records that use that property
      1. For each of them, we make sure that all the records of the type of the restriction use that property no more than the maximum number of times
        1. If some record doesn't, then we report an OWL-RESTRICTION-MAX-100 error for each of them
    2. For each maximum cardinality restriction, we get the list of all the values related by the class & property defined in the restriction
      1. If the value doesn't belong to the datatype specified in the restriction, then we report an OWL-RESTRICTION-MAX-102 error
  2. Get the list of all the object properties which have a maximum cardinality restriction defined on them
    1. For each maximum cardinality restriction, we get the list of all the records that use that property
      1. For each of them, we make sure that all the records of the type of the restriction use that property no more than the maximum number of times
        1. If some record doesn't, then we report an OWL-RESTRICTION-MAX-101 error for each of them
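
A minimal Python sketch of the counting step for a datatype property is shown below. It is an illustration under assumptions rather than the DVT implementation: the records dictionary replaces the data that would be read from OSF, the ex: record URIs are hypothetical, and neither class inheritance nor the datatype check (error 102) is covered. The dictionary keys reuse the field names documented for OWL-RESTRICTION-MAX-100.

```python
def check_max_cardinality(records, record_type, prop, maximum):
    """Report records that use prop more often than the restriction allows.

    records maps a record URI to
    {"type": str, "triples": [(property, value), ...]}.
    """
    errors = []
    for uri, record in records.items():
        if record["type"] != record_type:
            continue  # the restriction only applies to records of its class
        count = sum(1 for used, _ in record["triples"] if used == prop)
        if count > maximum:
            errors.append({
                "id": "OWL-RESTRICTION-MAX-100",
                "invalidRecordURI": uri,
                "invalidPropertyURI": prop,
                "numberOfOccurrences": count,
                "maximumExpectedNumberOfOccurrences": maximum,
            })
    return errors


# Hypothetical records: not using the property at all is allowed here.
records = {
    "ex:doc1": {"type": "bibo:Document", "triples": []},
    "ex:doc2": {"type": "bibo:Document",
                "triples": [("dcterms:language", "en")]},
    "ex:doc3": {"type": "bibo:Document",
                "triples": [("dcterms:language", "en"),
                            ("dcterms:language", "fr")]},
}
for error in check_max_cardinality(
        records, "bibo:Document", "dcterms:language", 1):
    print(error["id"], error["invalidRecordURI"], error["numberOfOccurrences"])
```

Unlike the exact cardinality check, a record that never uses the property is valid, so there is no counterpart to the "property not used" errors.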

Technical Explanation

In OWL, there is a concept called Restriction which is used to restrict the association of properties to a particular class extension (a particular set of classes). In other words, this Restriction mechanism is used to state how the properties (object or datatype) should be used, that is, what kind of instance records they should be used to describe. A restriction has three characteristics:

  1. It is applied to a Class
  2. It specifies which object or datatype property is being restricted
  3. It specifies the expected type of values:
    1. The class of things that can be referenced by an object property
    2. The datatype that defines the values of a datatype property

Let's take an example to illustrate a maximum cardinality restriction. In OWL+XML code, we would have a restriction like this:

<owl:Class rdf:about="http://purl.org/ontology/bibo/Document">
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="&dcterms;language"/>
            <owl:maxQualifiedCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxQualifiedCardinality>
            <owl:onDataRange rdf:resource="&xsd;language"/>
        </owl:Restriction>
    </rdfs:subClassOf>
</owl:Class>

What this restriction means is:

  • When we describe a record of type bibo:Document, we can use the dcterms:language property with a value of datatype xsd:language. We are restricted to a maximum of one of these triples.

Specifying within an Ontology

For this data validation test to work, the ontologies loaded in OSF have to be properly defined. If no maximum cardinality restrictions are defined for any class, then nothing will be tested related to the maximum cardinality restrictions. Otherwise, any maximum cardinality restriction defined in loaded ontologies will force this test to test all the affected records according to this restriction.

When you edit an ontology in Protégé, you have a series of tabs, one of which is called "Class". This is the tab where all the classes defined in the ontology appear. If you click on any of the classes that appear on the left side of the application, you will see the class's complete description on the right side of the application.

The section of interest for this test, highlighted on the right side, is the SubClass Of section. This is where a maximum cardinality restriction is defined in Protégé. Three buttons in this section are of particular interest:

  • "+" – The add button is used to add a new sub-class-of of a given class
  • "o" – The edit button is used to edit a sub-class-of of a given class
  • "x" – The remove button is used to delete a sub-class-of of a given class
Max cardinality restriction.png

To add a new maximum cardinality restriction on an object property, you have to click the "+" button at the right of the SubClass Of section. When you click this button, a new contextual window will appear. From that window, you will have to select the Object restriction creator tab. This tab has all the information you need to create a new maximum cardinality restriction for the selected class (in this case, it is the foo:ExternalPage class).

What you have to do once you clicked that tab is:

  1. Select the object property you want to create a restriction for from the left list called Restricted property
  2. Then select the type of records that can be referenced by that property from the Restricted filler right section
  3. Then you have to select the restriction type Max (max cardinality) from the Restriction type section
  4. Finally you have to specify the cardinality number, and click the OK button to save the new restriction
Max cardinality restriction object.png

To add a new maximum cardinality restriction on a datatype property, you have to click the "+" button at the right of the SubClass Of section. When you click this button, a new contextual window will appear. From that window, you will have to select the Data restriction creator tab. This tab has all the information you need to create a new maximum cardinality restriction for the selected class (in this case, it is the foo:ExternalPage class).

What you have to do once you clicked that tab is:

  1. Select the datatype property you want to create a restriction for from the left list called Restricted property
  2. Then select the datatype of that property from the Restricted filler right section
  3. Then you have to select the restriction type Max (max cardinality) from the Restriction type section
  4. Finally you have to specify the cardinality number, and click the OK button to save the new restriction
Max cardinality restriction datatype.png

One thing that is important to understand is that all the sub-classes of a class where a maximum cardinality restriction is defined will inherit that same restriction. This is what you can see in the "SubClass Of (Anonymous Ancestor)" section that we highlighted in yellow in the screenshot above. All three restrictions that appear in that section come from restrictions defined in parent classes of the foo:ExternalPage class. In this specific case, the prefLabel restriction has been defined in the owl:Thing class. The language and status restrictions have been defined in the bibo:Document class.

This inheritance behavior is quite important. That way, you can define a maximum cardinality restriction on a class in the upper end of an ontology's class hierarchy and have the cardinality applied to all the sub-classes of that super-class. This means that you don't have to specify that restriction on each and every class, but only on the parent one(s).

Usage

When you define a maximum cardinality restriction on a class, you specify that if a property (the one related to the restriction) is used to describe a record of a certain type (the one related to the restriction), then it can be used at most the number of times specified by the maximum cardinality restriction on that property.

Errors & Warnings

Errors
OWL-RESTRICTION-MAX-100
Description This error is returned when there is a record of a certain type that is not complying with the maximum cardinality restriction for a datatype property as defined in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the maximum cardinality restriction
invalidPropertyURI URI of the property which participates into the maximum cardinality restriction that raises this validation error
numberOfOccurrences Number of occurrences that this property has been used to describe that record
maximumExpectedNumberOfOccurrences Expected number of occurrences as defined in the maximum cardinality restriction
dataRange The datatype defined for the maximum cardinality restriction in the loaded ontologies


OWL-RESTRICTION-MAX-101
Description This error is returned when there is a record of a certain type that is not complying with the maximum cardinality restriction for an object property as defined in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the maximum cardinality restriction
invalidPropertyURI URI of the property which participates into the maximum cardinality restriction that raises this validation error
numberOfOccurrences Number of occurrences that this property has been used to describe that record
maximumExpectedNumberOfOccurrences Expected number of occurrences as defined in the maximum cardinality restriction
classExpression The class expression define for the exact cardinality restriction as defined in the loaded ontologies


OWL-RESTRICTION-MAX-102
Description This error is returned when there is at least one record that uses a datatype property subject to the maximum cardinality restriction and none of its values complies with the datatype specified by the restriction.
Fields
Field Description
datatypeProperty URI of the datatype property that has the restriction
expectedDatatype URI of the expected datatype
invalidValue Value that is not complying with the datatype defined in the restriction
affectedRecord Record affected with this validation check
Warnings
OWL-RESTRICTION-MAX-50
Description This warning is returned when we couldn't get the cardinality restriction on the datatype property from the OSF instance. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-MAX-51
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-MAX-52
Description This warning is returned when we couldn't get the cardinality restriction on the object property from the OSF instance. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-MAX-53
Description This warning is returned when we couldn't get the sub-classes of the class expression from the OSF instance
Fields No additional fields specified


OWL-RESTRICTION-MAX-54
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way
Fields No additional fields specified


OWL-RESTRICTION-MAX-55
Description This warning is returned when we couldn't get the list of values for a given datatype property
Fields No additional fields specified

OWL Existential Restriction Validation

The OWL Existential Restriction Validation test checks that, if a given datatype property is used to describe a record of a certain type, then at least one of its values uses the datatype defined in the restriction. It also checks that, if a given object property is used to describe a record of a certain type, then at least one of its values references a record of the type defined in the restriction.

How it Works

The heuristic used by this check is as follows:

  1. Get the list of all the datatype properties which have an existential restriction defined for them
    1. For each existential restriction, we get the list of all the records that use that property
      1. For each of them, we make sure that every record of the type of the restriction that uses that property has at least one value of the defined datatype
        1. If some records don't, then we report an OWL-RESTRICTION-SOME-100 error for each of them
    2. For each existential restriction, we get the list of all the values related by the class & property defined in the restriction
      1. If the value doesn't belong to the Datatype specified in the restriction, then we report an OWL-RESTRICTION-SOME-102 error
  2. Get the list of all the object properties which have an existential restriction defined for them
    1. For each existential restriction, we get the list of all the records that use that property
      1. For each of them, we make sure that every record of the type of the restriction that uses that property has at least one value of the defined type
        1. If some records don't, then we report an OWL-RESTRICTION-SOME-101 error for each of them
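The object property branch of the steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the records, their types and the helper name are hypothetical, the data is held in plain dictionaries rather than in a triple store, and no sub-class inference is performed.

```python
def missing_existential(records, types, record_type, prop, filler_type):
    """Return the URIs of records of `record_type` that use `prop`
    but have no value whose type is `filler_type`."""
    invalid = []
    for uri, properties in records.items():
        if record_type not in types.get(uri, set()):
            continue  # the restriction only applies to records of that type
        values = properties.get(prop, [])
        if values and not any(filler_type in types.get(v, set()) for v in values):
            invalid.append(uri)
    return invalid

# Hypothetical type assignments and record descriptions.
types = {
    "ex:doc1": {"bibo:Document"},
    "ex:doc2": {"bibo:Document"},
    "ex:doc3": {"bibo:Document"},
    "ex:topic1": {"owl:Class"},
    "ex:person1": {"foaf:Person"},
}
records = {
    "ex:doc1": {"dcterms:subject": ["ex:person1", "ex:topic1"]},
    "ex:doc2": {"dcterms:subject": ["ex:person1"]},
    "ex:doc3": {},
}

invalid = missing_existential(records, types, "bibo:Document", "dcterms:subject", "owl:Class")
```

Here ex:doc2 is the only record reported: it uses the property but none of its values is of the expected type. ex:doc3 does not use the property at all, so this sketch does not check it.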

Technical Explanation

In OWL, there is a concept called Restriction which is used to restrict the association of properties to a particular class extension (a particular set of classes). In other words, the Restriction mechanism states how properties (object or datatype) should be used, that is, what kind of instance records they should be used to describe. A restriction has three characteristics:

  1. It is applied to a Class
  2. It specifies which object or datatype property is being restricted
  3. It specifies the expected type of values:
    1. The class of things that can be referenced by an object property
    2. The datatype that defines the values of a datatype property

Let's take an example to illustrate an existential restriction. In OWL+XML code, we would have a restriction like this:

<owl:Class rdf:about="http://purl.org/ontology/bibo/Document">
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="&dcterms;subject"/>
            <owl:someValuesFrom rdf:resource="&owl;Class"/>
        </owl:Restriction>
    </rdfs:subClassOf>
</owl:Class>

What this restriction means is:

  • When we describe a record of type bibo:Document, it has to have a dcterms:subject property whose value is a record of type owl:Class. In other words, if a record of type bibo:Document is described using the dcterms:subject property, then at least one of the values of this property has to be of type owl:Class.

Specifying within an Ontology

For this data validation test to work, the ontologies loaded in OSF have to be properly defined. If no existential restrictions are defined for any class, then nothing will be tested related to the existential restrictions. Otherwise, any existential restriction defined in loaded ontologies will force this test to test all the affected records according to this restriction.

When you edit an ontology in Protégé, you have a series of tabs, one of which is called "Class". This is the tab where all the classes defined in the ontology appear. If you click on any of the classes that appear on the left side of the application, you will see the class's complete description on the right side of the application.

One highlighted section on the right side is of interest for this test: the SubClass Of section. This is where an existential restriction is defined in Protégé. There are 3 buttons related to this section that particularly interest us:

  • "+" – The add button is used to add a new sub-class-of of a given class
  • "o" – The edit button is used to edit a sub-class-of of a given class
  • "x" – The remove button is used to delete a sub-class-of of a given class
Existential restriction.png

To add a new existential restriction on an object property, click the "+" button at the right of the SubClass Of section. A new contextual window will appear. From that window, select the Object restriction creator tab. This tab has all the information you need to create a new existential restriction for the selected class (in this case, the foo:ExternalPage class).

Once you have clicked that tab:

  1. Select the object property you want to create a restriction for from the left list called Restricted property
  2. Then select the type of records that can be referenced by that property from the Restricted filler section on the right
  3. Then select the restriction type Some (existential) from the Restriction type section
  4. Finally, click the OK button to save the new restriction
Existential restriction object.png

To add a new existential restriction on a datatype property, click the "+" button at the right of the SubClass Of section. A new contextual window will appear. From that window, select the Data restriction creator tab. This tab has all the information you need to create a new existential restriction for the selected class (in this case, the foo:ExternalPage class).

Once you have clicked that tab:

  1. Select the datatype property you want to create a restriction for from the left list called Restricted property
  2. Then select the datatype of that property from the Restricted filler section on the right
  3. Then select the restriction type Some (existential) from the Restriction type section
Existential restriction datatype.png

One important thing to understand is that all the sub-classes of a class where an existential restriction is defined will inherit that same restriction. This is what you can see in the "SubClass Of (Anonymous Ancestor)" section that we highlighted in yellow in the screenshot above. All three restrictions that appear in that section come from restrictions defined in parent classes of the foo:ExternalPage class. In this specific case, the prefLabel restriction has been defined in the owl:Thing class. The language and status restrictions have been defined in the bibo:Document class.

This inheritance behavior is quite important. It lets you define an existential restriction on a class near the top of an ontology's class hierarchy and have that restriction applied to all the sub-classes of that super-class. This means that you don't have to specify the restriction on each and every class, but only on the parent one(s).

Usage

When you define an existential restriction on a class, you are specifying that if a certain property is used to describe a record of that type, then at least one of its values has to be of the type (class or datatype) specified by the restriction.

Errors & Warnings

Errors

OWL-RESTRICTION-SOME-100
Description This error is returned when there is a record of a certain type that does not comply with the existential restriction for a datatype property as defined in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the existential restriction
invalidPropertyURI URI of the property which participates in the existential restriction that raises this validation error
dataRange The datatype defined for the existential restriction in the loaded ontologies


OWL-RESTRICTION-SOME-101
Description This error is returned when there is a record of a certain type that does not comply with the existential restriction for an object property as defined in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the existential restriction
invalidPropertyURI URI of the property which participates in the existential restriction that raises this validation error
classExpression The class expression defined for the existential restriction in the loaded ontologies


OWL-RESTRICTION-SOME-102
Description This error is returned when there is at least one record that uses a datatype property subject to the existential restriction and none of its values complies with the datatype specified by the restriction.

Fields
Field Description
datatypeProperty URI of the datatype property that has the restriction
expectedDatatype URI of the expected datatype

Warnings

OWL-RESTRICTION-SOME-50
Description This warning is returned when we couldn't get the list of existential restrictions from the OSF instance. This means that the SPARQL query failed in some way

Fields No additional fields specified


OWL-RESTRICTION-SOME-51
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way

Fields No additional fields specified


OWL-RESTRICTION-SOME-52
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way

Fields No additional fields specified


OWL-RESTRICTION-SOME-53
Description This warning is returned when we couldn't get sub-classes of class expression from the OSF instance
Fields No additional fields specified


OWL-RESTRICTION-SOME-54
Description This warning is returned when we couldn't get sub-classes of class expression from the OSF instance
Fields No additional fields specified


OWL-RESTRICTION-SOME-55
Description This warning is returned when we couldn't get the list of values for a given datatype property
Fields No additional fields specified

OWL Universal Restriction Validation

The OWL Universal Restriction Validation test checks that, if a given datatype property is used to describe a record of a certain type, then all its values use the datatype defined in the restriction. It also checks that, if a given object property is used to describe a record of a certain type, then all its values are of the type defined in the restriction.

How it Works

The heuristic used by this check is as follows:

  1. Get the list of all the datatype properties which have a universal restriction defined for them
    1. For each universal restriction, we get the list of all the records that use that property
      1. For each of them, we make sure that all the values of all the records are of the datatype defined in the restriction
        1. If some records don't, then we report an OWL-RESTRICTION-ONLY-100 error for each of them
    2. For each universal restriction, we get the list of all the values related by the class & property defined in the restriction
      1. If there is not at least one value that belongs to the Datatype specified in the restriction, then we report an OWL-RESTRICTION-ONLY-102 error
  2. Get the list of all the object properties which have a universal restriction defined for them
    1. For each universal restriction, we get the list of all the records that use that property
      1. For each of them, we make sure that all the values of all the records are of the type described in the universal restriction
        1. If some records don't, then we report an OWL-RESTRICTION-ONLY-101 error for each of them
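To illustrate the "all values" semantics, here is a minimal Python sketch of the object property branch. As an assumption for illustration only, the data lives in plain dictionaries, the URIs and the function name are hypothetical, and sub-classes of the expected type are not taken into account.

```python
def universal_violations(records, types, record_type, prop, filler_type):
    """Return (record, value) pairs where a record of `record_type`
    has a value for `prop` that is not of type `filler_type`."""
    violations = []
    for uri, properties in records.items():
        if record_type not in types.get(uri, set()):
            continue  # the restriction only applies to records of that type
        for value in properties.get(prop, []):
            if filler_type not in types.get(value, set()):
                violations.append((uri, value))
    return violations

# Hypothetical type assignments and record descriptions.
types = {
    "ex:doc1": {"bibo:Document"},
    "ex:doc2": {"bibo:Document"},
    "ex:topic1": {"owl:Class"},
    "ex:topic2": {"owl:Class"},
    "ex:person1": {"foaf:Person"},
}
records = {
    "ex:doc1": {"dcterms:subject": ["ex:topic1", "ex:topic2"]},
    "ex:doc2": {"dcterms:subject": ["ex:topic1", "ex:person1"]},
}

violations = universal_violations(records, types, "bibo:Document", "dcterms:subject", "owl:Class")
```

A single value of the wrong type is enough to make a record invalid, so ex:doc2 is reported even though one of its values is of the expected type.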

Technical Explanation

In OWL, there is a concept called Restriction which is used to restrict the association of properties to a particular class extension (a particular set of classes). In other words, the Restriction mechanism states how properties (object or datatype) should be used, that is, what kind of instance records they should be used to describe. A restriction has three characteristics:

  1. It is applied to a Class
  2. It specifies which object or datatype property is being restricted
  3. It specifies the expected type of values:
    1. The class of things that can be referenced by an object property
    2. The datatype that defines the values of a datatype property

Let's take an example to illustrate a universal restriction. In OWL+XML code, we would have a restriction like this:

<owl:Class rdf:about="http://purl.org/ontology/bibo/Document">
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="&dcterms;subject"/>
            <owl:allValuesFrom rdf:resource="&owl;Class"/>
        </owl:Restriction>
    </rdfs:subClassOf>
</owl:Class>

What this restriction means is:

  • If a record of type bibo:Document is described using the dcterms:subject property, then all the values of that property have to be records of type owl:Class.

Specifying within an Ontology

For this data validation test to work, the ontologies loaded in OSF have to be properly defined. If no universal restrictions are defined for any class, then nothing will be tested related to the universal restrictions. Otherwise, any universal restriction defined in loaded ontologies will force this test to test all the affected records according to this restriction.

When you edit an ontology in Protégé, you have a series of tabs, one of which is called "Class". This is the tab where all the classes defined in the ontology appear. If you click on any of the classes that appear on the left side of the application, you will see the class's complete description on the right side of the application.

One highlighted section on the right side is of interest for this test: the SubClass Of section. This is where a universal restriction is defined in Protégé. There are 3 buttons related to this section that particularly interest us:

  • "+" – The add button is used to add a new sub-class-of of a given class
  • "o" – The edit button is used to edit a sub-class-of of a given class
  • "x" – The remove button is used to delete a sub-class-of of a given class
Universal restriction.png

To add a new universal restriction on an object property, click the "+" button at the right of the SubClass Of section. A new contextual window will appear. From that window, select the Object restriction creator tab. This tab has all the information you need to create a new universal restriction for the selected class (in this case, the foo:ExternalPage class).

Once you have clicked that tab:

  1. Select the object property you want to create a restriction for from the left list called Restricted property
  2. Then select the type of records that can be referenced by that property from the Restricted filler section on the right
  3. Then select the restriction type Only (universal) from the Restriction type section
Universal restriction object.png

To add a new universal restriction on a datatype property, click the "+" button at the right of the SubClass Of section. A new contextual window will appear. From that window, select the Data restriction creator tab. This tab has all the information you need to create a new universal restriction for the selected class (in this case, the foo:ExternalPage class).

Once you have clicked that tab:

  1. Select the datatype property you want to create a restriction for from the left list called Restricted property
  2. Then select the datatype of that property from the Restricted filler section on the right
  3. Then select the restriction type Only (universal) from the Restriction type section
Universal restriction object.png

One important thing to understand is that all the sub-classes of a class where a universal restriction is defined will inherit that same restriction. This is what you can see in the "SubClass Of (Anonymous Ancestor)" section that we highlighted in yellow in the screenshot above. All three restrictions that appear in that section come from restrictions defined in parent classes of the foo:ExternalPage class. In this specific case, the prefLabel restriction has been defined in the owl:Thing class. The language and status restrictions have been defined in the bibo:Document class.

This inheritance behavior is quite important. It lets you define a universal restriction on a class near the top of an ontology's class hierarchy and have that restriction applied to all the sub-classes of that super-class. This means that you don't have to specify the restriction on each and every class, but only on the parent one(s).

Usage

When you define a universal restriction on a class, you are specifying that if a certain property is used to describe a record of that type, then all its values have to be of the type (class or datatype) specified by the restriction.

Errors & Warnings

Errors

OWL-RESTRICTION-ONLY-100
Description This error is returned when there is a record of a certain type that does not comply with the universal restriction for a datatype property as defined in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the universal restriction
invalidPropertyURI URI of the property which participates in the universal restriction that raises this validation error
dataRange The datatype defined for the universal restriction in the loaded ontologies


OWL-RESTRICTION-ONLY-101
Description This error is returned when there is a record of a certain type that does not comply with the universal restriction for an object property as defined in one of the loaded ontologies
Fields
Field Description
invalidRecordURI URI of the instance record which is not properly defined according to the universal restriction
invalidPropertyURI URI of the property which participates in the universal restriction that raises this validation error
classExpression The class expression defined for the universal restriction in the loaded ontologies


OWL-RESTRICTION-ONLY-102
Description This error is returned when there is at least one record that uses a datatype property subject to the universal restriction and none of its values complies with the datatype specified by the restriction.

Fields
Field Description
datatypeProperty URI of the datatype property that has the restriction
expectedDatatype URI of the expected datatype
invalidValue Value that is not complying with the datatype defined in the restriction
affectedRecord Record affected with this validation check

Warnings

OWL-RESTRICTION-ONLY-50
Description This warning is returned when we couldn't get the list of universal restrictions from the OSF instance. This means that the SPARQL query failed in some way

Fields No additional fields specified


OWL-RESTRICTION-ONLY-51
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way

Fields No additional fields specified


OWL-RESTRICTION-ONLY-52
Description This warning is returned when we couldn't get the number of properties, per record, that have been indexed in the triple store. This means that the SPARQL query failed in some way

Fields No additional fields specified


OWL-RESTRICTION-ONLY-53
Description This warning is returned when we couldn't get sub-classes of class expression from the OSF instance
Fields No additional fields specified


OWL-RESTRICTION-ONLY-54
Description This warning is returned when we couldn't get the list of values for a given datatype property
Fields No additional fields specified

Automating Data Validation

One way to use the Data Validator Tool (DVT) is to run these data validation tests automatically, and then to act upon possible issues. The DVT has been developed with a few features that make such automated testing easier. This section outlines these features and how they should be used.

Silent Mode

Silent mode should be used by any automated data validation software. This feature stops the DVT from sending test results to the shell terminal: it mutes all output to the user. To enable it, simply append -s to the DVT command, like this:

# Silence the output to the terminal
dvt -v -s --output-xml="/tmp/dvt_validation_checks_log.xml"
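An automated job could assemble and launch this command from a script. The sketch below is a hedged Python example that only uses the options documented on this page (-v, -s, --output-xml and --output-json). The function names are hypothetical, and it assumes that the dvt executable is available on the PATH. How the DVT uses its exit code is not described here, so the sketch returns it without interpreting it.

```python
import subprocess

def build_dvt_command(xml_path=None, json_path=None, silent=True):
    """Build the DVT command line as a list of arguments."""
    command = ["dvt", "-v"]
    if silent:
        command.append("-s")  # mute all output to the terminal
    if xml_path:
        command.append(f"--output-xml={xml_path}")
    if json_path:
        command.append(f"--output-json={json_path}")
    return command

def run_dvt(**options):
    """Run the DVT (assumed to be on the PATH) and return its exit code."""
    return subprocess.run(build_dvt_command(**options), check=False).returncode

command = build_dvt_command(xml_path="/tmp/dvt_validation_checks_log.xml")
```

Passing the arguments as a list avoids shell quoting issues, which is why the path is not wrapped in quotes as it is on the command line.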

Validation Tests Logs

All the test results can be logged into an XML and/or a JSON serialized log file. These log files can easily be used by automated data validation software. Once the DVT command has finished, such software can read one of the log files, analyze it as required, and take action depending on what it finds (like sending emails for review, automatically fixing issues, etc.).

XML Logs Files

The logs can easily be saved on the file system as a log file serialized in XML. Such a log file can be saved using the --output-xml="[PATH]" option:

dvt -v -s --output-xml="/tmp/dvt_validation_checks_log.xml"

This command saves all the output, in XML, into the dvt_validation_checks_log.xml file located in the /tmp/ folder. Here is an example of such a (partial) XML log file:

<?xml version="1.0"?>
<checks>
  <check>
    <name>URI Usage Existence Check</name>
    <description>Make sure that all the referenced URIs exists in one of the input dataset or ontology</description>
    <onDatasets>
      <dataset>http://localhost/datasets/documents/</dataset>
      <dataset>http://localhost/datasets/partners/</dataset>
    </onDatasets>
    <usingOntologies>
      <ontology>file://localhost/data/ontologies/files/iron.owl</ontology>
    </usingOntologies>
    <validationWarnings>
    </validationWarnings>
    <validationErrors>
      <error>
        <id>URI-EXISTANCE-100</id>
        <unexistingURI>http://www.w3.org/2002/07/owl#AnnotationProperty</unexistingURI>
      </error>
      <error>
        <id>URI-EXISTANCE-100</id>
        <unexistingURI>http://www.w3.org/2002/07/owl#Restriction</unexistingURI>
      </error>
    </validationErrors>
    <fixes>
      <fix>
        <dataset>http://localhost/datasets/documents/</dataset>
        <subject>http://localhost/datasets/documents/26765</subject>
        <predicate>http://purl.org/dc/terms/subject</predicate>
        <object>http://purl.org/ontology/foo#incident_monitoring</object>
      </fix>
    </fixes>
  </check>
</checks>
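An automated tool could read such an XML log with a few lines of Python. The following sketch uses the standard xml.etree.ElementTree module and assumes the element names shown in the example above (checks, check, name, validationErrors, error, id). The embedded sample is a shortened, hypothetical log; a real script would load the file written by --output-xml with ET.parse(path).getroot() instead.

```python
import xml.etree.ElementTree as ET

# Shortened sample that follows the structure of the log shown above.
SAMPLE_LOG = """<?xml version="1.0"?>
<checks>
  <check>
    <name>URI Usage Existence Check</name>
    <validationWarnings></validationWarnings>
    <validationErrors>
      <error>
        <id>URI-EXISTANCE-100</id>
        <unexistingURI>http://www.w3.org/2002/07/owl#Restriction</unexistingURI>
      </error>
    </validationErrors>
  </check>
</checks>"""

def collect_errors(xml_text):
    """Return a list of (check name, error id, {field: value}) tuples."""
    root = ET.fromstring(xml_text)
    errors = []
    for check in root.iter("check"):
        name = check.findtext("name", default="")
        for error in check.iterfind("validationErrors/error"):
            # Every child other than <id> is treated as an additional field.
            fields = {child.tag: (child.text or "") for child in error if child.tag != "id"}
            errors.append((name, error.findtext("id", default=""), fields))
    return errors

errors = collect_errors(SAMPLE_LOG)
```

Each tuple carries the error identifier and its fields, which is enough to route the issue to a reviewer or to an automated fix.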

JSON Logs Files

The logs can easily be saved on the file system as a log file serialized in JSON. Such a log file can be saved using the --output-json="[PATH]" option:

dvt -v -s --output-json="/tmp/dvt_validation_checks_log.json"

This command saves all the output, in JSON, into the dvt_validation_checks_log.json file located in the /tmp/ folder. Here is an example of such a (partial) JSON log file:

{
    "checks": [
        {
            "name": "URI Usage Existence Check",
            "description": "Make sure that all the referenced URIs exists in one of the input dataset or ontology",
            "onDatasets": [
                "http://localhost/datasets/documents/",
                "http://localhost/datasets/partners/"
            ],
            "usingOntologies": [
                "file://localhost/data/ontologies/files/iron.owl"
            ],
            "validationWarnings": [],
            "validationErrors": [
                {
                    "id": "URI-EXISTANCE-100",
                    "unexistingURI": "http://www.w3.org/2002/07/owl#AnnotationProperty"
                },
                {
                    "id": "URI-EXISTANCE-100",
                    "unexistingURI": "http://www.w3.org/2002/07/owl#Restriction"
                }
            ],
            "fixes": [
                {
                    "dataset": "http://localhost/datasets/documents/",
                    "subject": "http://localhost/datasets/documents/18106",
                    "predicate": "http://purl.org/dc/terms/subject",
                    "object": "http://purl.org/ontology/foo#incident_monitoring"
                }
            ]
        }
    ]
}
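The JSON log is just as easy to consume. As a minimal sketch, assuming the key names shown in the example above (checks, validationErrors, id), the following Python code counts the errors per identifier. The embedded sample is a shortened, hypothetical log; a real script would read the file written by --output-json.

```python
import json
from collections import Counter

# Shortened sample that follows the structure of the log shown above.
SAMPLE_LOG = """
{
    "checks": [
        {
            "name": "URI Usage Existence Check",
            "validationWarnings": [],
            "validationErrors": [
                {"id": "URI-EXISTANCE-100",
                 "unexistingURI": "http://www.w3.org/2002/07/owl#AnnotationProperty"},
                {"id": "URI-EXISTANCE-100",
                 "unexistingURI": "http://www.w3.org/2002/07/owl#Restriction"}
            ]
        }
    ]
}
"""

def count_errors(log_text):
    """Return {error id: number of occurrences} over all the checks of a log."""
    log = json.loads(log_text)
    counts = Counter()
    for check in log.get("checks", []):
        for error in check.get("validationErrors", []):
            counts[error["id"]] += 1
    return dict(counts)

summary = count_errors(SAMPLE_LOG)
```

Such a summary could, for example, decide whether a notification email needs to be sent after a nightly validation run.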

Creating New Datatypes

Real-world validation use cases have all kinds of different, specific requirements. This means that, most of the time, it is not enough to use the default constructs that exist in the different data models. This is no different with the Data Validator Tool, OWL and the XSD datatypes.

In this section, we will describe how we can create new XSD Datatypes, in an OWL ontology, which could be used by the DVT to validate more complex values of the datatype properties.

Overview

One of the core concepts of RDF and OWL ontologies is the Datatype. Technically speaking, a Datatype is nothing more than a set of possible values. Let's take an example to show what that means. Let's say we have a datatype property iron:prefLabel which is specified to have the datatype xsd:Label as its value space. Now, let's say that we define the xsd:Label datatype to be the datatype that includes all the possible labels that have a length of 4 characters or more. What this means is that you cannot have a value for iron:prefLabel that has fewer than 4 characters. If you do, you will end up with a validation error.

This is what the datatypes are used for: to make sure that the values of the datatype properties do comply with certain defined rules. There exists a series of core, pre-existing, XSD Datatypes which are:

  • xsd:dateTime
  • xsd:base64Binary
  • xsd:unsignedInt
  • xsd:dateTimeStamp
  • xsd:anyURI
  • xsd:boolean
  • xsd:byte
  • xsd:unsignedByte
  • xsd:decimal
  • xsd:double
  • xsd:float
  • xsd:int
  • xsd:integer
  • xsd:nonNegativeInteger
  • xsd:nonPositiveInteger
  • xsd:positiveInteger
  • xsd:negativeInteger
  • xsd:short
  • xsd:unsignedShort
  • xsd:long
  • xsd:unsignedLong
  • xsd:hexBinary
  • xsd:language
  • xsd:Name
  • xsd:NCName
  • xsd:NMTOKEN
  • xsd:string
  • xsd:token
  • xsd:normalizedString
  • rdf:XMLLiteral
  • rdf:PlainLiteral

However, as we will see below, we can easily create new custom Datatypes that will define a space of possible values.

Creating a new Datatype using Protégé

This section explains how new Datatypes can be created using Protégé. We will see how we can create custom Datatypes that will be used to specify more complex value spaces for the datatype properties that we defined in our ontology. These new Datatypes will then be taken into account by the DVT when validating the indexed content of all the specified datasets.

Introduction to the XSP Ontology

New datatypes can be created using the RDFS ontology. However, the problem is that there are no properties in the OWL ontology to describe more specific characteristics of the values included in a Datatype, such as their maximum/minimum number of characters, their maximum/minimum numeric value, whether they match a regular expression pattern, etc. For that reason, we have to rely on the semantics of an external ontology to describe these kinds of Datatype characteristics. This ontology is called the XSP ontology.

This ontology is composed of some basic properties, all of which are known and handled by the DVT:

XSP Property Description
xsp:base Specifies the base XSD datatype of this custom Datatype. The custom Datatype will inherit the characteristics of its parent Datatype.
xsp:length If the value is a string, specifies the exact length (number of characters) the values should have
xsp:maxExclusive If the value is a number, specifies the maximum, exclusive, value of that number ( < )
xsp:maxInclusive If the value is a number, specifies the maximum, inclusive, value of that number ( <= )
xsp:maxLength If the value is a string, specifies the maximum length (number of characters) the values should have (inclusive)
xsp:minExclusive If the value is a number, specifies the minimum, exclusive, value of that number ( > )
xsp:minInclusive If the value is a number, specifies the minimum, inclusive, value of that number ( >= )
xsp:minLength If the value is a string, specifies the minimum length (number of characters) the values should have (inclusive)
xsp:pattern Specifies a regular expression pattern that the value should match. Note: if you are using a backslash "\" for escaping characters of the regular expression, you have to double escape it like this: "\\" otherwise the regular expression you are defining in the ontology won't be the same as the one you are expecting.
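To make the meaning of these facets concrete, here is a minimal Python sketch that checks a value against a dictionary of facets named after the XSP properties above. It is an illustration only and not the DVT's implementation: the function name and the dictionary representation are hypothetical, xsp:base is ignored, numbers are parsed as floats, and the pattern is assumed to have to match the whole value.

```python
import re

def satisfies_facets(value, facets):
    """Return True if `value` complies with every facet in `facets`."""
    text = str(value)
    # String facets
    if "length" in facets and len(text) != facets["length"]:
        return False
    if "minLength" in facets and len(text) < facets["minLength"]:
        return False
    if "maxLength" in facets and len(text) > facets["maxLength"]:
        return False
    if "pattern" in facets and re.fullmatch(facets["pattern"], text) is None:
        return False
    # Numeric facets: a value that is not a number cannot satisfy them
    bounds = ("minInclusive", "minExclusive", "maxInclusive", "maxExclusive")
    if any(name in facets for name in bounds):
        try:
            number = float(text)
        except ValueError:
            return False
        if "minInclusive" in facets and number < facets["minInclusive"]:
            return False
        if "minExclusive" in facets and number <= facets["minExclusive"]:
            return False
        if "maxInclusive" in facets and number > facets["maxInclusive"]:
            return False
        if "maxExclusive" in facets and number >= facets["maxExclusive"]:
            return False
    return True

label_ok = satisfies_facets("Cats", {"minLength": 4})
label_too_short = satisfies_facets("Cat", {"minLength": 4})
```

The difference between the inclusive and exclusive bounds shows up at the boundary itself: the value 10 satisfies a maxInclusive of 10 but not a maxExclusive of 10.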

Creating a new Datatype

Now what we want to do is to define new Datatypes using the RDFS and the XSP ontologies. First of all, let's consider the following XML entities that will be used in the examples below:

<!DOCTYPE rdf:RDF [
    <!ENTITY owl "http://www.w3.org/2002/07/owl#" >
    <!ENTITY xsd "http://www.w3.org/2001/XMLSchema#" >
    <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#" >
    <!ENTITY xsp "http://www.owl-ontologies.com/2005/08/07/xsp.owl#" >
]>

Now let's start with a simple example:

<rdfs:Datatype rdf:about="&xsd;DocumentDescription">
  <xsp:minLength rdf:datatype="&xsd;int">20</xsp:minLength>
  <xsp:base rdf:resource="&xsd;string"/>
</rdfs:Datatype>

What we created here is a new Datatype called xsd:DocumentDescription. This new Datatype is based on the core xsd:string Datatype. However, we also specify that all the values that belong to that Datatype require a minimum of 20 characters. Any other value won't be valid for the xsd:DocumentDescription Datatype.

Important note: if you are using a backslash "\" for escaping characters of the regular expression, you have to double escape it like this: "\\" otherwise the regular expression you are defining in the ontology won't be the same as the one you are expecting.


Now let's try to define a new Datatype which includes all the Canadian postal codes that may exist (with or without spaces). To create such a complex Datatype, we define a regular expression that will validate the values for us:

<rdfs:Datatype rdf:about="&xsd;DocumentDescription">
  <xsp:pattern rdf:datatype="&xsd;string">[ABCEGHJKLMNPRSTVXY]{1}\\d{1}[A-Z]{1} *\\d{1}[A-Z]{1}\\d{1}</xsp:pattern>
  <xsp:base rdf:resource="&xsd;string"/>
</rdfs:Datatype>

As you can see, new Datatypes can easily be created as required. These same Datatypes will be used by the DVT to validate the content of your datasets.
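
Before adding a pattern such as the Canadian postal code expression to an ontology, it can be useful to try it out on a few sample values. The following Python sketch does that, and also shows the effect of the double escaping. It assumes that one level of escaping is removed when the ontology is read, so that "\\d" becomes "\d", and that the whole value has to match the pattern. Both are assumptions made for this illustration, not a description of the DVT's internals, and the names used are hypothetical.

```python
import re

# The pattern exactly as it is written in the ontology (double-escaped).
ONTOLOGY_PATTERN = r"[ABCEGHJKLMNPRSTVXY]{1}\\d{1}[A-Z]{1} *\\d{1}[A-Z]{1}\\d{1}"

# Assumption: one level of escaping is consumed, leaving single backslashes.
EFFECTIVE_PATTERN = ONTOLOGY_PATTERN.replace("\\\\", "\\")


def is_postal_code(value: str) -> bool:
    """Check a value against the effective (single-escaped) pattern."""
    return re.fullmatch(EFFECTIVE_PATTERN, value) is not None


for sample in ["K1A 0B1", "K1A0B1", "D1A 0B1", "k1a 0b1"]:
    print(sample, is_postal_code(sample))

# Used as-is, the double-escaped form looks for a literal backslash followed
# by the letter "d", so it does not match a real postal code.
print(re.fullmatch(ONTOLOGY_PATTERN, "K1A 0B1") is None)
```

The first two samples are accepted, while the last two are rejected because the pattern only allows a restricted set of upper-case first letters.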

Using a Custom Datatype

The final step is to learn how Datatypes can be created, managed and used in the Protégé OWL ontology editor.

Exposing Datatypes in the Protégé User Interface

By default Protégé doesn't show the datatypes that are defined in an ontology. What we have to do is to expose the Datatype panels required to add new, and modify existing, Datatypes in an ontology.

The first thing to do is to click the "Window -> Views -> Datatype views" top menu item:

Protege datatype views.png
What you have to expose here are two different views:
  • Datatypes
  • Annotations

You have to select these two views. Once you have clicked these menu items, Protégé will ask you to put these new panels somewhere in the user interface. Once you are done, your user interface should look like this:

Protege datatype views exposed.png
Listing all Existing Datatypes

All the Datatypes that are defined in an ontology will appear in the Datatype views that you added to your Protégé user interface. That list includes all the core Datatypes as well as all the custom Datatypes you may have created:

Protege datatype views exposed datatype.png

If you select a Datatype in that list, you will see a series of annotations to that datatype appearing in the Datatype Annotations view that you exposed in your Protégé user interface.

Creating a New Datatype

To create a new Datatype, you have to click the "+" sign in the Datatype view:

Create new datatype.png
Once you have clicked that button, a dialog will ask you to define the URI of the new Datatype:
Create new datatype dialog.png

Once you click "OK", the new Datatype is created and appears in the list of available Datatypes.

Modifying Existing Datatypes

To modify a Datatype, you first have to select it in the list of available Datatypes. Then you have to create new annotation properties to describe the Datatype.

First, click on the "+" button of the selected Datatype:

Update datatype.png
Once you have clicked that button, the Annotations dialog box will appear. From that dialog box, select the XSP property you want to define. Then enter its value and click "OK":
Update datatype dialog.png
Once you have clicked the "OK" button, the new annotation will appear for the selected Datatype. From there, you can add more properties as required. Using the same Datatype Annotations view, you can also edit or delete existing annotation statements.

Automated Tests Operations

Automated testing software that uses the DVT to validate indexed data may want to react depending on what is written into these log files. Different kinds of operations may be implemented depending on whether an error or a warning gets written into the log. Here is a list of operations that may be implemented to react to the validation reports:

  1. Depending on the warning ID, emails can be sent to system maintainers and/or developers
  2. Depending on the error ID, scripts that would try to automatically fix the issue may be executed
  3. Depending on the error ID, emails can be sent to ontologies/data maintainers so that they can validate the issue and fix it using internal procedures.

Note: these operations are NOT part of the functional capabilities of the DVT. Separate scripts would need to be applied against the logs to achieve the automated suggestions above.
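
As an illustration of what such a separate script could look like, here is a minimal Python sketch that maps report entries to the kinds of operations listed above. Everything in it is an assumption made for the example: the entry layout (a "type" and an "id" key), the IDs and the action names are hypothetical placeholders, and they have to be adapted to the actual content of the reports produced by the DVT.

```python
from typing import Dict, List

# Hypothetical routing tables. The IDs are placeholders, not real DVT IDs.
WARNING_ROUTES = {
    "WARNING-EXAMPLE-1": "email:system-maintainers",
}
ERROR_ROUTES = {
    "ERROR-EXAMPLE-1": "run:automatic-fix-script",
    "ERROR-EXAMPLE-2": "email:data-maintainers",
}


def plan_actions(entries: List[Dict[str, str]]) -> List[str]:
    """Return the distinct actions to perform, in order of first appearance.

    Each entry is assumed to look like {"type": "error", "id": "..."}.
    Entries with an unknown ID are ignored.
    """
    actions: List[str] = []
    for entry in entries:
        routes = ERROR_ROUTES if entry["type"] == "error" else WARNING_ROUTES
        action = routes.get(entry["id"])
        if action is not None and action not in actions:
            actions.append(action)
    return actions


report = [
    {"type": "warning", "id": "WARNING-EXAMPLE-1"},
    {"type": "error", "id": "ERROR-EXAMPLE-2"},
    {"type": "error", "id": "ERROR-EXAMPLE-2"},
]
print(plan_actions(report))
```

The function only decides what should be done. Sending the emails or running the fix scripts is left to whatever tooling is already in place.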