Role and Use of Ontologies in OSF

Ontologies are the central logic and controlling structures in the Open Semantic Framework (OSF).

At an introductory level, there are two sets of summary documents regarding ontologies on this wiki. One set provides an executive summary of ontologies and their basic role in knowledge systems. A second set pertains to how ontologies are used in an OSF installation. One document in that second set -- the present document -- gives an overview of the specific ways in which ontologies drive and interact with an OSF instance. A complementary document in that set describes the overall ontology modularization and architecture within an OSF installation. Both of these documents are in turn reinforced by an ontology best practices document.

Introduction
The Open Semantic Framework's core controlling structures are ontologies. There are a variety of ontologies with different purposes in an OSF instance.

Constituent Ontologies
An OSF installation may typically utilize most or all of the following internal ontologies:


 * 1) The SCO Ontology (Semantic Component Ontology)
 * 2) The WSF Ontology (Web Service Framework Ontology)
 * 3) The AGGR Ontology (Aggregation Ontology)
 * 4) The irON Ontology (Instance Record and Object Notation Ontology)
 * 5) One or more domain ontologies, to capture the concepts and relationships for the purposes of a given OSF installation, and
 * 6) Possibly UMBEL or other upper-level concept ontologies, used for linkages to external systems.

(Note: the internal wiki links to each of these ontologies also provide links to the actual ontology specifications on GitHub.)

A useful discussion of these ontologies and their interactions in an OSF instance is provided by the ontology modularization document. This current document focuses primarily on the specific properties and roles associated with them in an OSF installation.

Depending on the specific OSF installation, of course, multiple external ontologies may also be employed. Some of the common external ones used in an OSF installation are described by the external ontologies document. These external ontologies are important -- indeed essential in order to ensure linkage to the external world -- but have little to do with internal OSF control structures. That is why the rest of this discussion focuses on internal ontologies only.

Summary Ontology Roles
Ontologies play pivotal roles across all parts of the framework. In a broad sense, the internal OSF ontologies are used for annotations, guiding interactions or relating concepts and information to other information. In specific terms, OSF ontologies may play one or more of these dozen or so roles:


 * 1) Define record descriptions
 * 2) Inform interface displays
 * 3) Integrate different data sources
 * 4) Define component selections
 * 5) Define component behaviors
 * 6) Guide template selection
 * 7) Provide reasoning and inference
 * 8) Guide content filtering (with and without inference)
 * 9) Tag concepts in text documents
 * 10) Help organize and navigate Web portals
 * 11) Manage datasets and ontologies, and
 * 12) Set access permissions and registrations.

In the remainder of this document, we will see how ontologies, in each of these roles, affect numerous different parts of OSF. The sections are presented in the order above.

Define Record Descriptions
A central role of ontologies in the Open Semantic Framework is their use to describe any kind of record that gets indexed and managed by the system. Since the framework indexes everything into the RDF data model, ontologies are needed as a schema to describe these RDF resources.

The irON ontology is specifically designed for record descriptions and notations. It interacts with all of the domain ontologies and, if used, with UMBEL or other upper-level ontologies.

Inform Interface Displays
Ontologies have an impact on most of the user interfaces that display record information. The property that has the most impact is, which is used to display the label within the user interface that refers to a record or record attributes (properties). This label can be used within text, in a list control, in a tree control, or in any other kind of control that displays references to records.

Note: there are also other properties that are considered as fallbacks to  if a record has no triples using the   property. These include,  ,  , etc.
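The fallback behavior described above can be sketched as a simple ordered lookup. This is only an illustration under stated assumptions: the property names in `LABEL_PROPERTIES` are hypothetical placeholders rather than the actual OSF properties, and the dictionary-based record shape is likewise assumed for the example.

```python
# Hypothetical preference order. The actual property names come from the
# OSF ontologies and are not reproduced here.
LABEL_PROPERTIES = ["ex:preferredLabel", "ex:label", "ex:title", "ex:name"]


def display_label(record: dict, uri: str) -> str:
    """Return the first available label, trying each property in order.

    `record` maps a property name to a list of literal values.
    If no label property has a value, fall back to the record URI.
    """
    for prop in LABEL_PROPERTIES:
        values = record.get(prop) or []
        if values:
            return values[0]
    return uri


if __name__ == "__main__":
    # No preferred label is present, so the next property in the list is used.
    record = {"ex:label": ["Central Park"], "ex:name": ["park-17"]}
    print(display_label(record, "http://example.org/park/17"))
```

Keeping the preference order in one list means a user interface control only needs to call one function, regardless of which labeling property a given record happens to use.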

General User Interface Labels and Descriptions
There are a few properties that have an impact on most of the components of the OSF stack, most of which come from the irON ontology. Here is the list of these irON properties that impact other parts of the system, mainly related to different user interfaces:

User Interface 'Short' Labels
There are a few properties that impact most of the components of the OSF stack. Here is the list of SCO properties that impact other parts of the system, mainly related to different user interfaces:

Hierarchical Displays
The way ontologies define a class or a property structure also has an impact on different kinds of hierarchical displays. An example of this is the "Filter by Kinds" section of the structSearch and structBrowse modules. The possible filters that may be applied to a search query will be displayed to the user according to the hierarchy as defined in the ontologies.
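As a rough sketch of how a class hierarchy can drive such a display, the following turns a set of sub-class relationships into an indented tree. The class names and the `(child, parent)` pair representation are assumptions made for illustration; this is not the code used by structSearch or structBrowse.

```python
from collections import defaultdict

# Hypothetical (child, parent) pairs, as they might be read from an ontology.
SUBCLASS_OF = [
    ("Image", "Thing"),
    ("ParkImage", "Image"),
    ("HeritageImage", "Image"),
]


def render_tree(pairs, root):
    """Return indented lines, one per class, following the hierarchy."""
    children = defaultdict(list)
    for child, parent in pairs:
        children[parent].append(child)

    lines = []

    def walk(node, depth):
        lines.append("  " * depth + node)
        # Sort siblings so the display order is stable.
        for child in sorted(children[node]):
            walk(child, depth + 1)

    walk(root, 0)
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(SUBCLASS_OF, "Thing")))
```

Because the display is derived from the relationships alone, moving a class under a different parent in the ontology changes the tree without any change to the display code.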

Integrate Heterogeneous Data Sources
The principal reason why the Open Semantic Framework uses RDF and ontologies to describe all the data it indexes and manages is to facilitate data integration from multiple and heterogeneous data sources. The premise of using RDF and ontologies is:

The RDF framework, along with using ontologies as schema, is the most flexible means currently available to describe any kind of data. The RDF-ontology combination can be used to represent any data coming from any other source, data management system, format, or unstructured to structured basis for describing information. (See further the Advantages and Myths of RDF.)

This foundation leads to the extreme flexibility of the Open Semantic Framework. The rationale behind this flexibility, and its benefits, has been described in many locations within this wiki. You may also want to see this article on One of the Semantic Web’s Core Added Value.

Ontologies have a dramatic -- and positive -- impact on the data integration and presentation tasks within an OSF instance.

Define Component Selections
A key aspect of the SCO Ontology is its use as the means to define what semantic components (or widgets) display what types of information within data records.

These assignments are done via the sControl component. The properties for this component define what components may display what type (class) of data records. Here is the list of SCO properties that impact the sControl's behaviors:

Define Component Behaviors
In the Open Semantic Framework, one of the most important roles of ontologies is to define the interaction between different pieces of the system. Because of the extent of these interactions, this section is the longest and most detailed amongst all of the dozen or so ontology roles.

The SCO ontology can have multiple effects on multiple parts of an OSF instance. This section describes those interactions.

sMap Component
The sMap component has different behaviors depending on how its input record is described. Here is the list of SCO properties that will have an impact on the sMap's behaviors:

sWebMap Component
The sWebMap component has different behaviors depending on how its input record is described. Here is the list of SCO and WGS84 properties that impact the behavior of an sWebMap:

sStory Component
The sStory component has different behaviors depending on how its input record is described. Here is the list of SCO properties that impact an sStory component:

sBarChart and sLinearChart Components
The sBarChart and the sLinearChart components exhibit different behaviors depending on how the input records that are enabled for these component types are described. Here is the list of the SCO properties that impact this behavior:

sRelationBrowser
The sRelationBrowser component exhibits different behaviors depending on how its input record is described. Here is the list of SCO properties that impact the sRelationBrowser component:

sDashboard
The sDashboard component exhibits different behaviors depending on how its input record is described. Here is the list of SCO properties that impact the sDashboard component:

Guide Visualization Template Selection
One of the core features of the OSF-Drupal set of Drupal modules is the ability to use different display templates depending on the types of records available. The selection of these templates is based on the types of those records and the type hierarchies described by the OSF ontologies. This section describes how these ontologies guide template selections.

As a refresher on templates and their use, see the Building OSF-Drupal Templates document. It describes how the templating engine works and how to create various templates.

Template Selection
Template selection is the action of binding an instance record to a display template based on its type. Three things are required to make this happen:


 * 1) Instance records have to be typed
 * 2) An ontological structure of type relationships (via  ) has to exist in one or more OSF ontology(ies), and
 * 3) A template has to exist for the type of the instance record.

(Note: a specific template by type is not strictly required, since lacking a specific template for the target type, the system will invoke the nearest template up the parental chain in the governing ontology structure, eventually getting to the most generic template available, that for "thing".)

Impact of Ontologies on Template Selection
OSF-Drupal's templating engine selects record display templates based on the class hierarchy loaded on an OSF instance. It also uses inference on types to select the proper template for a given record.

Let's say that we try to display information about a  instance record. What the system attempts to do is to find a template that displays information about this kind of instance record. First, the  type (class) has to be defined in the ontological structure of the OSF instance; if it is not, then no specific template will be selected and the system will default to using the   template (see below). If the type (class) is found, the system will next check to see if a template exists for that specific type. If one exists, it will use the matching template. If one does not, it will next select the parent class of the type and try the match again. If it again fails, it will continue its test up the parental chain. If all tests fail, it will use the default  template. Whichever template is selected then becomes the basis for formatting and presenting the visual record display.

We can use a simple class hierarchy, matched to a simple set of available template files, to illustrate how ontologies impact the OSF-Drupal templating system.

Now, let's say that our OSF portal is about to display information about a  record. As we can notice, there is no  template available for a. However, because of the ontology structure, the system next attempts to select a template from a parent class of that.

What the system would do is to check if there is a template available for a record of type. Since there is none, it would try to find one for a parent type, so in this case the  class. In our example, there is now a match. The templating engine thus uses the  template to display information about the   record.

Were the  not to exist, then the templating engine would fall back to the   template, which is considered to be the "generic record display template", or the template of last resort.

This design means that if:


 * 1) the ontological structure changes over time, or
 * 2) new templates get added to the system

then there may be an impact on how the record gets displayed.

The major advantage of this design is that more and more specific formatting templates may be added to an OSF installation over time, both improving the tailored look of results displays and accommodating more structure and relationships as they evolve.
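The lookup described in this section can be sketched as a walk up the class hierarchy. This is a minimal illustration, not the OSF-Drupal implementation: the class names, the single-parent `PARENT` mapping, and the representation of templates as a set of type names are all assumptions. Only the generic "thing" template of last resort comes from the description above.

```python
# Hypothetical single-parent class hierarchy (child -> parent).
PARENT = {"ParkImage": "Image", "Image": "Thing"}

# The generic template of last resort.
GENERIC = "Thing"


def select_template(record_type, templates):
    """Return the type whose template should render a record.

    `templates` is the set of types for which a template exists.
    """
    # A type that is not defined in the ontology gets the generic template.
    if record_type not in PARENT and record_type != GENERIC:
        return GENERIC

    current = record_type
    seen = set()  # guards against an accidental cycle in the hierarchy
    while current is not None and current not in seen:
        if current in templates:
            return current
        seen.add(current)
        current = PARENT.get(current)  # move to the parent class
    return GENERIC


if __name__ == "__main__":
    # No ParkImage template exists, so the parent's template is chosen.
    print(select_template("ParkImage", {"Image", "Thing"}))
    # Adding a more specific template later changes the outcome.
    print(select_template("ParkImage", {"ParkImage", "Image", "Thing"}))
```

The second call shows the point made above: adding a more specific template changes which one is selected, without touching the records themselves.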

Provide Reasoning and Inference
A standard use of ontologies is for reasoning and inference, and those used by OSF are no exception.

By extension, however, we can also use these same capabilities to check on data consistency and coherence. This is an important feature of the system since the system can detect if there are logical inconsistencies or logical incoherencies that have been developed by the system administrator during ontology growth and development. Having coherent and consistent ontologies means that we have the proper foundations to create consistent and coherent datasets of instance records.

See further the discussion on reasoning using Protégé.

Guide Content Filtering
Filtering data is the action of getting a subset of records from a complete dataset based on some selection criteria. In OSF, the predominant share of filtering is done using the structWSF Search Web service endpoint. A minority of filtering is done using the SPARQL endpoint. It is also possible to filter via the AGGR aggregation ontology.

Possible filtering criteria for the Search endpoint are:


 * 1) Filtering by type(s)
 * 2) Filtering by attribute(s)
 * 3) Filtering by attribute(s)/value(s)
 * 4) Filtering by geo-localization (within a given geographical area)

These filtering activities are performed by different tools of the stack, such as:


 * structSearch
 * structBrowse
 * sWebMap

These tools are impacted by the definition of the loaded ontologies. The filtering of the values by types, attributes and attributes/values requires an ontology class or an ontology property as filtering criteria.

Filtering with Inference
Also, any Search query can be performed with  enabled. As with the template selection described above, inference can have a big impact on the number and nature of returned results. Let's consider this example class structure:

This class structure shows a hierarchy of images where the leaf classes are topical image classes (so classes where their individuals are considered images representing one of the topic: Heritage, Neighborhood and Park). Now let's see how this class structure impacts Search queries, and returned results, by different tools (structSearch, structBrowse, sWebMap and others).

Here is a series of Search queries sent to a structWSF instance that has this class hierarchy loaded, using the sample specification noted above. This table shows the results potentially returned by the Search endpoint with and without inferencing turned on:

In the Use Case #1, the user requests all of the  without inferencing. This means that the Search endpoint will return all of the records that have been typed as. In this case, the records  got returned.

Use Case #2 is a variant of Use Case #1, only now with inferencing enabled. In this use case, the Search endpoint will return all the  and all the records that are typed with one of its subtypes (in this case,  ). For this query, records  got returned. This case shows where ontologies can have a dramatic impact on the system. If we modify that class hierarchy and put the  as being a sub-class-of , then the same results would be returned for Use Case #2 as we got with Use Case #1.

With Use Case #3, the endpoint does not return any results because inferencing is disabled, and because there is no record typed as.

Use Case #4 is a variant of Use Case #3 where inferencing is enabled. The endpoint returns all the image records because all of them are  by inference on type.
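The difference between the two modes can be sketched with a small in-memory filter. This is an illustration only and does not call the structWSF Search endpoint: the class names, record identifiers, and the `inference` flag are hypothetical, chosen to echo the image hierarchy discussed above.

```python
# Hypothetical hierarchy: topical image classes under a common parent.
PARENT = {
    "HeritageImage": "Image",
    "NeighborhoodImage": "Image",
    "ParkImage": "Image",
}

# Hypothetical records, each typed with exactly one leaf class.
RECORDS = {"r1": "HeritageImage", "r2": "ParkImage", "r3": "ParkImage"}


def ancestors(cls):
    """Return every super-class of `cls`, nearest first."""
    found = []
    while cls in PARENT:
        cls = PARENT[cls]
        found.append(cls)
    return found


def filter_by_type(records, wanted, inference=False):
    """Return the sorted ids of the records matching the `wanted` type."""
    hits = []
    for record_id, record_type in records.items():
        types = {record_type}
        if inference:
            # With inference, a record also counts as each of its super-types.
            types.update(ancestors(record_type))
        if wanted in types:
            hits.append(record_id)
    return sorted(hits)


if __name__ == "__main__":
    print(filter_by_type(RECORDS, "Image"))                  # no direct match
    print(filter_by_type(RECORDS, "Image", inference=True))  # all image records
```

As in the use cases above, a query on the parent class returns nothing without inference when no record is typed directly with it, and returns every image record once inference is enabled.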

Filtering via the AGGR Ontology
The AGGR Ontology also has an impact on anything that displays facets of filtered searches. Amongst others, it impacts the structSearch and structBrowse OSF-Drupal modules. It also impacts different user interfaces that use the Search Web service endpoint to perform auto-completion tasks.

Tag Concepts in Text Documents
In the Open Semantic Framework, the Scones Web service endpoint analyzes unstructured text documents and turns them into semi-structured text documents by automatically tagging concepts. The concept tagging takes place using ontology-based information extraction, or OBIE. Named entity dictionaries are the basis for entity tagging.

These concepts used for the tagging come from selected ontologies loaded on the system. The way these ontologies have been created, and the way the classes and named individuals have been defined, has a dramatic impact on the quality of the documents tagged by Scones.

Scones uses two things from ontologies:


 * their classes
 * their named individuals

Depending on settings, one or both of these sources may be used for Scones tagging.
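To make the idea of dictionary-driven tagging concrete, here is a deliberately simplified sketch. It is not how Scones is implemented: it assumes plain label matching with regular expressions, and the labels, URIs, and the `use_concepts` / `use_entities` switches are hypothetical stand-ins for the class and named-individual sources mentioned above.

```python
import re

# Hypothetical labels drawn from ontology classes (concepts)...
CONCEPT_LABELS = {"park": "ex:Park", "heritage building": "ex:HeritageBuilding"}
# ...and from named individuals (entities).
ENTITY_LABELS = {"central park": "ex:CentralPark"}


def tag_text(text, use_concepts=True, use_entities=True):
    """Return sorted (offset, matched text, URI) tuples found in `text`."""
    dictionary = {}
    if use_concepts:
        dictionary.update(CONCEPT_LABELS)
    if use_entities:
        dictionary.update(ENTITY_LABELS)

    tags = []
    for label, uri in dictionary.items():
        # Match whole words only, ignoring case.
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            tags.append((match.start(), match.group(0), uri))
    return sorted(tags)


if __name__ == "__main__":
    for tag in tag_text("Central Park is a park."):
        print(tag)
```

Even in this toy form, the quality of the output depends entirely on the labels the ontologies provide, which is why the way classes and named individuals are defined matters so much for tagging.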

There are a few properties intimately related to the Scones Web service endpoint:

Help Navigate and Organize Web Portals
In OSF, ontologies also have an impact on the general organization of a Web portal and how it is navigated.

Portal Navigation
In an OSF portal, its domain ontologies use the  for general navigation. The relation browser is a tool that lets users dynamically navigate a graph (that is, nodes with arcs that link these nodes). The most widespread usage of the relation browser is to let users navigate the links between ontology concepts. These concepts are the anchor points of what other content is available on the Web portal. By navigating the concepts (classes) structure, users are able to explore an OSF portal's entire content.

Each node in the  semantic component is linked to whatever other kinds of related records exist in the system. Depending on the types of those records, other semantic components can then be invoked to display this tightly related content for each node.

Ontologies thus impact navigation and discovery on an OSF portal in two ways:


 * 1) They impact the navigation of the structure by defining which concepts are linked to other concepts and with what property
 * 2) They define what related records may get displayed to the user based on their classes and properties.

Layouts Organization
OSF Web portals are mainly organized by Layouts. A layout is a specific page presentation format with specific design, components and ordering and sizing of those components. This page presentation is highly influenced by the kind of things indexed in the system. Generally, layouts present records of a certain type (or family of types), along with specialized functions that users are able to use to perform different actions on that set of records.

Here are a few examples of such layouts:


 * Sample Chart Template
 * Sample Image Template
 * Sample Topic Template
 * Sample WebMap Template

These layouts aggregate all of the records of a certain type (like images of all kinds), display them using different kinds of tools (like an Images Gallery), and filter them depending on different filtering criteria (like mapping the position where each image was captured within a specific neighborhood area).

The ontologies impact the general organization of the Web portal because of the kind of things that are indexed in the system interacting with the available layouts.

Manage Datasets and Ontologies
Basic settings for managing datasets and ontologies are provided by the WSF Ontology. It presently does so via two mechanisms.

Datasets Syncing Framework
The Datasets Syncing Framework behaves differently depending on the value of the  property for each input record.

structOntology
The structOntology OSF-Drupal module exhibits different behavior depending on the value of the  property for each input ontology description.

Set Access Permissions and Registrations
A principal purpose of the WSF Ontology is also to describe the internal state of a structWSF instance, such as the internal access control records, the dataset descriptions, the registered web service endpoints, etc. As a result, this ontology can have multiple effects on other parts of an OSF instance.

The WSF Ontology is used to describe three main areas of a structWSF installation:


 * 1) datasets registry
 * 2) access definition registry
 * 3) registered web services endpoints registry

These registries are hosted in some specialized datasets in the triple store (Virtuoso for most OSF installations). The information indexed in these different registries is defined using the WSF ontology.

All structWSF Web services are affected by these registries.