Data Validator Tool: Validating Dataset Content Based On Ontology Descriptions

Introduction
The Data Validator Tool (DVT) is a command line tool used to validate data records indexed in OSF datasets against the descriptions of the loaded ontologies. Depending on how the ontologies are described, the DVT validates the content of the datasets and reports possible issues. The DVT is a post-indexation validation mechanism: it does not enforce any data validation at indexation time, and it only reports validation issues when it is run against OSF. Once validation errors are detected, different mechanisms have to be put in place to fix these issues.

This document explains how the DVT should be used, how the current data validation tests work, and how the reported errors should be interpreted. It also explains how the ontologies should be described, using the Protégé ontology editor, so that they take full advantage of the DVT validation tests.

Installation & Configuration
All the installation & configuration steps are directly available on the Data Validator Tool page.

Command Line
Using the DVT command line tool is straightforward. Each of its command line parameters is described below. Note that any parameter can be used with any other parameter.
 * 1) If you specify this parameter, the validation process starts. If you don't specify it, the DVT performs no validation.
 * 2) If you specify this parameter, nothing is output to the shell terminal. This is usually used when an external tool performs automated validation using the DVT.
 * 3) If you specify this parameter, each validation test tries to automatically fix the issues that made it fail. Not all validation checks support this option, so only the ones that do will try to fix the validation issues.
 * 4) If you specify this parameter, this help is output to the shell terminal.
 * 5) If you specify this parameter, all the tests, warnings and errors are written to an XML file at the path given as the parameter's value. Make sure that the user that runs the DVT has write permission on the specified path. This is normally used to log validation tests.
 * 6) If you specify this parameter, all the tests, warnings and errors are written to a JSON file at the path given as the parameter's value. Make sure that the user that runs the DVT has write permission on the specified path. This is normally used to log validation tests.
 * 7) If you specify this parameter, the DVT uses the specified amount of memory to run the tests. Depending on the size of the datasets and the tests defined within the ontologies, the DVT may require more memory to work normally.
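
When the DVT is launched by an automated job, the write-permission requirement on the XML or JSON report path can be checked before the tool is started. The following is a minimal Python sketch of such a check. It is not part of the DVT; the function name `report_path_is_writable` and the example file name are hypothetical and only serve as an illustration.

```python
import os
import tempfile


def report_path_is_writable(report_path: str) -> bool:
    """Return True if the current user could create or overwrite report_path.

    Hypothetical helper, not part of the DVT: it only inspects the file
    system and never runs the validation itself.
    """
    report_path = os.path.abspath(report_path)

    if os.path.isdir(report_path):
        # A directory cannot be used as the report file itself.
        return False

    if os.path.exists(report_path):
        # Existing file: it must be writable in place.
        return os.access(report_path, os.W_OK)

    # New file: the parent directory must exist and allow creating entries.
    parent = os.path.dirname(report_path)
    return os.path.isdir(parent) and os.access(parent, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Hypothetical report location; substitute the path given to the DVT.
    candidate = os.path.join(tempfile.gettempdir(), "dvt-report.json")
    print(report_path_is_writable(candidate))
```

The check should be run as the same user that will run the DVT, since permissions are evaluated for the current process. A path that passes the check can still fail later (for example if the disk fills up), so this is a pre-flight check and not a guarantee.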