Metamodeling in Domain Ontologies

It is not unusual to want to treat something as either a class or an instance in an ontology, depending on context. Among other aspects, this is known as metamodeling, and it can be accomplished in a number of ways. However, the newest version of the Web Ontology Language, OWL 2, provides a neat trick for doing this called "punning". Why one would want to metamodel, how to specify it in an ontology, and why the OWL 2 approach is helpful are described in this piece.

Why Metamodel?
Lightweight, domain ontologies have been the focus of this ontology series. Domain ontologies are the "world views" by which organizations, communities or enterprises describe the concepts in their domain, the relationships between those concepts, and the instances or individuals that are the actual things that populate that structure. Thus, domain ontologies are the basic bread-and-butter descriptive structures for real-world applications of ontologies.

These lightweight, domain ontologies often have a hierarchical structure for which SKOS (Simple Knowledge Organization System) is a recommended starting ontology (see best practices recommendations). A subject concept reference ontology such as UMBEL (Upper Mapping and Binding Exchange Layer), which we also recommend, has a similar structure and relies heavily on SKOS in its vocabulary. Because of these structural similarities, ontologies that use SKOS or UMBEL are good candidates for metamodeling techniques.

To better understand why we should metamodel, let's look at a couple of examples, both of which combine organizing categories of things and then describing or characterizing those things. This dual need is common to most domains.

For the first example, let's take a categorization of apes as a kind of mammal, which is then a kind of animal. In these cases, ape is a class, which relates to other classes, and apes may also have members, be they particular kinds of apes or individual apes. Yet, at the same time, we want to assert some characteristics of apes, such as being hairy, having two legs and two arms, having no tail, being capable of walking bipedally, having grasping hands, and including some endangered species. These characteristics apply to the notion of apes as an instance.

As another example, we may have the category of trucks, which may further be split into truck types, brands of trucks, types of engine, and so forth. Yet, again, we may want to state that a truck is designed primarily for the transport of cargo (as opposed to automobiles for people transport), or that trucks may have different driver's license requirements or different license fees than autos. These descriptive properties refer to trucks as an instance.

These mixed cases combine both the organization of concepts in relation to one another and with respect to their set members, with the description and characterization of these concepts as things unto themselves. This is a natural and common way to express most any domain of interest.

The practice has been to express these mixed uses in RDFS or OWL Full, which makes them easy to write and create since almost "anything goes" (a loose way of saying that the structures are not decidable). A good explanation of this can be found in Rinke J. Hoekstra, 2009. Ontology Representation: Design Patterns and Ontologies that Make Sense, thesis for Faculty of Law, University of Amsterdam, SIKS Dissertation Series No. 2009-15, 9/18/2009. 241 pp. See http://dare.uva.nl/document/144859. In it, Hoekstra states (pp. 49-50):

RDFS has a non-fixed meta modelling architecture; it can have an infinite number of class layers because rdfs:Resource is both an instance and a super class of rdfs:Class, which makes rdfs:Resource a member of its own subset (Nejdl et al., 2000). All classes (including rdfs:Class itself) are instances of rdfs:Class, and every class is the set of its instances. There is no restriction on defining sub classes of rdfs:Class itself, nor on defining sub classes of instances of instances of rdfs:Class and so on. This is problematic as it leaves the door open to class definitions that lead to Russell’s paradox (Pan and Horrocks, 2002). The Russell paradox follows from a comprehension principle built in early versions of set theory (Horrocks et al., 2003). This principle stated that a set can be constructed of the things that satisfy a formula with one free variable. In fact, it introduces the possibility of a set of all things that do not belong to itself. . ..

In RDFS, the reserved properties rdfs:subClassOf, rdf:type , rdfs:domain and rdfs:range are used to define both the other RDFS modelling primitives themselves and the models expressed using these primitives. In other words, there is no distinction between the meta-level and the domain. Use of sub-class relationships also enables tree-like hierarchies to be constructed and some minor inferencing (such as one concept is broader than another concept, one of the contributions of SKOS).

But such mixed uses do not allow more capable OWL reasoners to be applied, nor the full power of query or search abstraction to be used, nor the ontology to be checked for consistency. These limits may be fine in many circumstances, but the lack of such checks does allow structures to evolve that may become incoherent or illogical. If data interoperability is a goal, as it is in our enterprise use cases, incoherent ontologies cannot contribute or participate as structures for linking datasets. At most -- and this is the case for much linked data practice -- all that can be done is to make explicit pairwise connections between different dataset objects. This is not efficient and defeats the whole purpose of leveraging schema. OWL 2 has been designed to fix that (in addition to other benefits).

The approach taken by OWL 2 to overcome some of these metamodeling limitations is through "punning". Recall that objects are named in RDF with URIs (IRIs in OWL 2). The trick with "punning" is to evaluate the object based on how it is used contextually; the IRI is shared but its referent may be viewed as either a class or instance based on context. Thus, objects used both as concepts (classes) and individuals (instances) are allowed and standard OWL 2 reasoners may be used against them.
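The idea of one IRI carrying two views can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not how any OWL 2 reasoner is implemented: triples are plain tuples of strings, the `ex:` names are hypothetical, and every predicate other than `rdf:type` and `rdfs:subClassOf` is assumed to be an object or data property (annotation properties are ignored here).

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
OWL_CLASS = "owl:Class"


def views_of(iri, triples):
    """Return the views ("class", "individual") an IRI takes in the triples.

    The view is decided purely by the position in which the IRI is used.
    """
    views = set()
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            # Either side of a subclass axiom is a class.
            if iri in (s, o):
                views.add("class")
        elif p == RDF_TYPE:
            if o == iri or (s == iri and o == OWL_CLASS):
                # Something is a member of it, or it is declared a class.
                views.add("class")
            elif s == iri:
                # It is itself a member of some other class.
                views.add("individual")
        elif iri in (s, o):
            # It takes part in a property assertion.
            views.add("individual")
    return views


# Hypothetical truck example: one IRI, two views.
TRIPLES = [
    ("ex:Truck", SUBCLASS_OF, "ex:Vehicle"),
    ("ex:F150", RDF_TYPE, "ex:Truck"),
    ("ex:Truck", "ex:designedFor", "ex:CargoTransport"),
]

print(sorted(views_of("ex:Truck", TRIPLES)))
```

Here `ex:Truck` is reported under both views because it appears in a subclass axiom and also carries a property assertion of its own, while `ex:Vehicle` is only ever a class and `ex:F150` only ever an individual.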

It should be noted, however, that this "punning" technique does not support the full range of possible metamodeling aspects. Like any language, there is a trade-off in OWL 2 between expressivity and reasoning efficiency. But, for lightweight, domain ontologies where the objective is interoperability across heterogeneous sources -- that is, the main objective of the semantic Web or semantic enterprise -- this trade-off in OWL 2 now appears to be well balanced. Moreover, its automatic detection by tools such as Protégé 4 that use the OWL API also means it is comparatively easy to use and implement.

Recommended Best Practices
A fundamental aspect of best practices for ontology building and maintenance is the desirability of keeping instance data (ABox) separate from the conceptual structure (TBox) that provides the schema of relationships for those concepts. Fortunately, this approach also integrates well with the metamodeling capabilities in OWL 2.

This diagram shows how metamodeling and the ABox-TBox split are accommodated, using trucks as an example:



The right-hand side of the diagram shows the two views possible via OWL 2 metamodeling in the TBox. In some cases, we may speak of trucks as a class of vehicle, to which individual members may belong; this is the class view. In other contexts, we may want to characterize or make assertions about trucks in our ontology, such as asserting cargo transport or engine type, in which case truck is now represented as an instance (individual) under the individual view. These two views in the TBox represent our structural and conceptual description (the "world view") regarding this domain of which vehicles and trucks are a part.

Then, when we begin to populate our knowledge base with specific data, we do so via the ABox. In this example, as we add data about the specific brand of Ford trucks and their attributes, we link the Ford instance to the TBox via the Truck class. (Best practice also requires that we model this new attribute structure into the TBox as well, but that is a different topic.)

How Punning is Triggered in OWL 2
Punning is not triggered by annotation properties. Annotation properties applied to a class merely act as additional description or metadata about that class; the annotation property by definition does not participate in any inferencing or reasoning. You should also know that in OWL 2, certain predicates (properties) such as label, comment or description (among others) are reserved as annotation properties.

You can invoke the OWL 2 punning process directly or via context when your ontologies are processed with the OWL API. The basic rule to follow is:

Any entity declared as a class and with an asserted object or data property is punned (metamodeled).

This test is done directly by the OWL API. You can go ahead and test this out with an OWL 2-compliant editor, such as Protégé 4. Here is an example test (in N3 notation):

First, begin with some initial declarations:

foo:Car a owl:Class.

foo:Animal a owl:Class ; owl:disjointWith foo:Car.

Then, let's describe an object property:

foo:isEndangered a owl:ObjectProperty ; rdfs:domain foo:Animal ; rdfs:range bar:SomeSpecies.

And define and make an assertion about Apes:

foo:Ape a owl:Class ; foo:isEndangered bar:SomeSpecies.

Now, the system begins by testing for punning and other checks, such as:


1) Is isEndangered an annotation property? No.
2) What is its domain? foo:Animal
3) From this, the system detects and infers:

foo:Ape a owl:Class ; a foo:Animal ; foo:isEndangered bar:SomeSpecies.

4) Punning is triggered because a non-annotation property has been applied to a class.
5) The non-annotation properties are now assigned to the named individual (which captures the individual view part of the TBox above).
6) The system can then check for inconsistencies, depending on the restriction(s) applied to the foo:Animal class.

In this case, no inconsistencies were found.

But, let's now add another object (non-annotation) property:

foo:hasBrand a owl:ObjectProperty ; rdfs:domain foo:Car ; rdfs:range bar:SomeBrand.

And use it to expand our assertions about Apes:

foo:Ape a owl:Class ; foo:isEndangered bar:SomeSpecies ; foo:hasBrand bar:Ford.

And repeat #3:

foo:Ape a owl:Class ; a foo:Animal ; a foo:Car ; foo:isEndangered bar:SomeSpecies ; foo:hasBrand bar:Ford.

Now, inconsistencies are raised in #6.

The consistency check fails because Ape cannot be both an Animal and a Car; the two classes were declared disjoint in the initial declarations.

While this is clearly a silly example, such checks are quite important as the number of objects and assertions grows in an ontology.
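The six steps above can be sketched in a few lines of Python. This is a simplified illustration, not the OWL API's actual logic: the function name `check_punned_class`, the dictionary of property domains, and the set of disjoint pairs are all hypothetical structures introduced here, and the only annotation properties recognized are `rdfs:label` and `rdfs:comment`.

```python
ANNOTATION_PROPERTIES = {"rdfs:label", "rdfs:comment"}


def check_punned_class(assertions, domains, disjoint_pairs):
    """Apply the punning rule and a disjointness check to one class.

    assertions:     list of (property, value) pairs asserted on the class IRI
    domains:        dict mapping each property to its declared domain class
    disjoint_pairs: set of frozensets, each holding two disjoint classes
    Returns (is_punned, inferred_types, conflicts).
    """
    # Steps 1 and 4: only non-annotation properties trigger punning.
    non_annotation = [p for p, _ in assertions if p not in ANNOTATION_PROPERTIES]
    is_punned = bool(non_annotation)

    # Steps 2, 3 and 5: each property's domain becomes an inferred type
    # of the individual view.
    inferred_types = []
    for prop in non_annotation:
        domain = domains.get(prop)
        if domain is not None and domain not in inferred_types:
            inferred_types.append(domain)

    # Step 6: two inferred types that are declared disjoint are a conflict.
    conflicts = [
        (a, b)
        for i, a in enumerate(inferred_types)
        for b in inferred_types[i + 1:]
        if frozenset((a, b)) in disjoint_pairs
    ]
    return is_punned, inferred_types, conflicts


DOMAINS = {"foo:isEndangered": "foo:Animal", "foo:hasBrand": "foo:Car"}
DISJOINT = {frozenset(("foo:Animal", "foo:Car"))}

first_pass = check_punned_class(
    [("foo:isEndangered", "bar:SomeSpecies")], DOMAINS, DISJOINT
)
second_pass = check_punned_class(
    [("foo:isEndangered", "bar:SomeSpecies"), ("foo:hasBrand", "bar:Ford")],
    DOMAINS,
    DISJOINT,
)
print(first_pass)
print(second_pass)
```

The first pass reports a punned class with one inferred type and no conflicts; the second reports the Animal/Car conflict. A real reasoner considers far more than domains and disjointness, so this should be read only as a way to follow the walkthrough.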

What Does Punning Look Like?
The punning technique works because the IRI for the object ends up being treated as both a concept (class) and an instance (individual). Thus, while the object shares the same IRI, depending on its context, it is evaluated by an OWL reasoner as a different thing (class or individual). The OWL API achieves this by actually writing out the object in both its class view and individual view. Here is an example (in RDF/XML serialization):

Input OWL:

<owl:Class rdf:about="http://purl.org/ontology/Ape">  <isEndangered>Ape</isEndangered> </owl:Class>

Output from Protégé with punning:

<owl:Class rdf:about="http://purl.org/ontology/Ape"/>

<owl:NamedIndividual rdf:about="http://purl.org/ontology/Ape"> <isEndangered>Ape</isEndangered> </owl:NamedIndividual>

Notice the duplicate definition (in RDF/XML) for the NamedIndividual. When writing out the ontology, all punned objects are duplicated in a similar manner.
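The duplication on output can be mimicked with a short Python sketch. It assumes, for illustration only, that class axioms and annotations stay with the class view while every other assertion moves to the individual view; the function `split_views` and the tuple representation are hypothetical and do not reflect the OWL API's internals.

```python
CLASS_AXIOM_PREDICATES = {"rdfs:subClassOf", "owl:disjointWith"}
ANNOTATION_PROPERTIES = {"rdfs:label", "rdfs:comment"}


def split_views(iri, triples):
    """Split the triples about one IRI into a class view and an individual view."""
    class_view, individual_view = [], []
    for s, p, o in triples:
        if s != iri:
            continue
        is_class_declaration = p == "rdf:type" and o == "owl:Class"
        if (is_class_declaration
                or p in CLASS_AXIOM_PREDICATES
                or p in ANNOTATION_PROPERTIES):
            class_view.append((s, p, o))
        else:
            individual_view.append((s, p, o))
    if individual_view:
        # The punned IRI is declared a second time, as a named individual.
        individual_view.insert(0, (iri, "rdf:type", "owl:NamedIndividual"))
    return class_view, individual_view


APE = "http://purl.org/ontology/Ape"
class_view, individual_view = split_views(
    APE,
    [
        (APE, "rdf:type", "owl:Class"),
        (APE, "isEndangered", "Ape"),
    ],
)
print(class_view)
print(individual_view)
```

The class view keeps only the class declaration, while the individual view carries the `isEndangered` assertion under a new `owl:NamedIndividual` declaration, mirroring the two blocks in the output above.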

The Beginning of the Transition
OWL 2 and its other general changes have arrived in the nick of time. Not only were we seeing some of the weaknesses in OWL 1 that warranted updating, but we are also now being challenged with regard to how to make linked data and the many datasets in RDF effectively interoperate. Perhaps undecidability and throwing triples to the wind worked OK in the early days of our semantic Web Wild West. But now it is time for the new sheriff to bring order to the emerging chaos.

Of course only time will tell, but the design decisions made by the OWL 2 working group appear judicious and balanced to find that sweet spot between expressiveness and reasoning efficiency. It also appears that many domain vocabularies based on SKOS would benefit from embracing the OWL 2 metamodeling techniques.

But two criticisms still remain. First, tooling support for OWL 2 and the OWL API is weak, as discussed in the normative tools landscape. Second, as the best practices survey discusses, not enough practitioners have yet taken up OWL 2, which means that best practice guidance and exemplars are still limited.

Lightweight domain ontologies can greatly benefit from these OWL 2 metamodeling techniques and the OWL RL alternative that also emerged as one of the OWL 2 profile enhancements. The growing scale of, and learning around, linked data and RDF datasets are now pointing the way to a necessary transition. And OWL 2 metamodeling should be one of the key components in making our semantic technologies more responsive and effective.