Adding an Ontology Concept using Protégé

From OSF Wiki

This guide provides basic steps for adding a new concept to an ontology using the Protégé ontology editor. See also the document structOntology v Protégé?

Why Add a New Concept?

Let's assume, possibly because of adding new data to your system, that you have decided you need to add a new concept and some attributes to your ontology. You are doing so because you observe that, to properly include your new data in your system, you are missing a "bridging concept" between an existing concept ("parent") already in the ontology and the new data, as well as some attributes (data) that describe that concept. This basic gap can be shown as follows:


We will use the Winnipeg ("Peg") community indicator system as our testbed, with the same concepts and attributes as presented in the accompanying Adding a New Dataset example. In this exercise, we will be adding:

  • A new class (Component) of Single Family Dwellings that is a child concept of the existing Housing Component
  • A parallel instance (Individual) of Single Family Dwellings (because we are following the metamodeling recipe)
  • Because our design specifies adding "standard" data (and not unique indicator data as is mostly contained in the current model), we will also add:
    • A new Standard Data class, with accompanying individual instance
    • New predicates (properties) for relating this standard data to the existing structure (components and themes)
  • Then, we add the actual data attribute of single family dwelling housing starts ('SFDHousingStarts') to the system.

The exercise concludes with some consistency testing of the new additions and some general comments on ontology editing and expansion practice.

Starting Protégé

See the basic Protégé guide for how to start up and use Protégé.

Adding a Class

In most cases, a domain concept is the same as a class in the OWL ontology language. Thus, after firing up the system, we begin by adding our new "bridging" component, the concept of Single Family Dwellings. This concept is also a sub-concept under an existing component in the system, Housing.

Classes are the major building blocks ("nouns") within your ontology. (Depending on whether you are also metamodeling with your ontology, what you enter in the Classes tab also gets reflected in the Individuals tab; see below.)

So we begin by going to the Classes tab:

[Figure: P classes tab.png]

(Note: for some of the steps below, you may first want to work through basic editing with Protégé if you have not already done so.)

There are two methods available to create a new class in Protégé: do it all by hand, or duplicate ("clone") an existing class. We'll start with the latter.

The Cloning Method

First, try to find an existing class very closely related to the new one you wish to create. Since in our example we are planning a new sub-class under Housing, that concept is a good candidate.

First highlight your source class in the left-hand tree, and then pick the 'Duplicate selected class ...' option from the Edit main menu:

[Figure: P duplicate class.png]

A popup asks you to name the new class. Since our new concept is for Single Family Dwellings, we follow our best naming practices, using CamelCase style for classes, and name it 'SingleFamilyDwellings':

[Figure: P duplicate class popup.png]

That will cause the new class to appear at the top of the listing (under Components in this case):

[Figure: P duplicate class move1.png]

And, because we want this to be a sub-concept under Housing, we drag-and-drop the new name to that position:

[Figure: P duplicate class move2.png]

What we now see in the remaining sub-panes is an exact duplicate of the source Housing specifications. So, we begin by updating all of these entries to make them specific to Single Family Dwellings.

We first update all annotations, including labels, definitions, comments, external link references (seeAlso), etc.:

[Figure: P duplicate class annotate.png]

Then, we carefully review all duplicated class relationships and make sure each one actually applies to Single Family Dwellings (changing any that differ from Housing), again using our standard Protégé editing techniques:

[Figure: P duplicate class description.png]

We are now temporarily done with the Classes tab, until we create the parallel instance (Individual) for this concept.
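
What the steps above produce in the ontology can be sketched as a handful of subject-predicate-object statements. The following is only an illustration in plain Python, not how Protégé stores anything internally: the `peg:` prefix, the helper name `new_subclass`, and the label text are assumptions made for this example.

```python
# Minimal sketch: the new class expressed as subject-predicate-object triples.
# The "peg:" prefix and the helper name are hypothetical, chosen for illustration.

def new_subclass(name, parent, label):
    """Return the triples that declare `name` as an OWL class under `parent`."""
    subject = f"peg:{name}"
    return [
        (subject, "rdf:type", "owl:Class"),              # it is a class
        (subject, "rdfs:subClassOf", f"peg:{parent}"),   # placed under its parent
        (subject, "rdfs:label", f'"{label}"'),           # human-readable annotation
    ]

class_triples = new_subclass("SingleFamilyDwellings", "Housing", "Single Family Dwellings")

for s, p, o in class_triples:
    print(s, p, o)
```

Cloning a class in Protégé copies statements like these from the source class, which is why every copied annotation and relationship has to be reviewed afterwards.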

The 'From Scratch' Method

Alternatively, we could enter all of these values from scratch. The same areas that needed to be modified in the previous method must now be entered by hand (following the same screens as above). The disadvantage of the from-scratch method is the greater chance of introducing typos and other mismatch errors.

On the other hand, there may not be suitable existing entities from which to clone. In those cases, all information as outlined above should be entered directly.

'Punning' the Individual

Again, because we are following the metamodeling recipe, we also need to create a parallel instance (Individual) for the class.

We go to the Individuals tab and pick the 'Add Individuals' icon in the header of the left-hand Individuals panel. This causes a popup to appear.

In the popup, you need to enter the EXACT name as the class used in the previous step. In this example, it is 'SingleFamilyDwellings':

[Figure: P individual.png]

As you finish typing, you will see a warning message that you are "punning" an existing name. That is exactly what we want to do, so proceed and then 'OK' your new entry.

Now, when you pick the newly entered SingleFamilyDwellings item from the Individuals list, you will see that all of your previous class annotations have also been duplicated for this new individual listing (see top highlight):

[Figure: P individual entries.png]

This "punning" saves much time in duplicating this metadata.

However, we must still enter other specific entries for this item, as shown by the two highlighted areas to the lower right of the screen above.

The first category of information that we need to provide regards the 'type' classification for this entry. ('Types' are the same thing as classes.) The idea of the instance of SingleFamilyDwellings is that it is a type of Component (as was its parent, Housing) and is itself an instance of the concept of SingleFamilyDwellings.

To make these assignments, we click on the green plus icon for Types, which brings up the standard object editor popup. We pick the 'class hierarchy' tab (which is chosen by default) and expand our tree until we see the items of Component or SingleFamilyDwellings that we want to assign:

[Figure: P individual types.png]

After selecting the desired assignment in the tree, we press OK.

We repeat this process until all Type assignments have been made (two for this example, Components and SingleFamilyDwellings).
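
As a rough illustration of what punning amounts to, the same name ends up carrying both a class declaration and an individual declaration, plus the Type assignments just made. The sketch below uses plain Python tuples; the `peg:` prefix and the spelling `peg:Component` for the component class are assumptions for this example.

```python
# Sketch of "punning": one name is used both as a class and as an individual.
# The "peg:" prefix and the class name peg:Component are hypothetical placeholders.

NAME = "peg:SingleFamilyDwellings"

punned_triples = [
    (NAME, "rdf:type", "owl:Class"),            # the class created in the Classes tab
    (NAME, "rdf:type", "owl:NamedIndividual"),  # the parallel individual with the same name
    (NAME, "rdf:type", "peg:Component"),        # Type assignment 1
    (NAME, "rdf:type", NAME),                   # Type assignment 2: an instance of its own class
]

def types_of(subject, triples):
    """Collect every rdf:type value stated for `subject`."""
    return {o for s, p, o in triples if s == subject and p == "rdf:type"}

print(sorted(types_of(NAME, punned_triples)))
```

Because the class and the individual share one name, annotations attached to that name show up in both the Classes and Individuals tabs.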

We are now ready to assign properties to this individual.

We follow a similar process to what we did for Types, except the popup and specific assignment steps differ.

We begin by bringing up the property assignment editor by choosing the green plus icon for the Object property assertions. That presents us with a two-panel popup.

Recall that object properties act to connect two things via the standard triple of subject-predicate-object. In this case, our subject is what we are characterizing, the idea of the individual instance of SingleFamilyDwellings. What we are assigning in this popup window, then, are the predicate (that is, the property, which is shown in the left-hand panel) and the target (or object, which is shown in the right-hand panel). This is why two choices need to be made.

You may want to look at other individual instances, especially closely related ones, for what types of assignments they have.

Do not assign the subject itself as an object!

[Figure: P individual properties.png]

You need to repeat this predicate-object assignment process for all desired connections.
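
The predicate-object assignment, including the warning against pointing an individual at itself, can be sketched as a small helper. This is an illustration only: the function name, the `peg:` prefix, and the predicate `peg:relatedTo` are hypothetical and do not come from the Peg ontology.

```python
# Sketch: building object property assertions while refusing self-references.
# assert_object_property and peg:relatedTo are hypothetical names for illustration.

def assert_object_property(subject, predicate, target):
    """Return one subject-predicate-object triple, rejecting subject == target."""
    if subject == target:
        raise ValueError("do not assign the subject itself as an object")
    return (subject, predicate, target)

SUBJECT = "peg:SingleFamilyDwellings"

assertions = [
    # Left-hand panel choice (predicate), right-hand panel choice (target).
    assert_object_property(SUBJECT, "peg:relatedTo", "peg:Housing"),
]

print(assertions)
```

Each pass through the popup corresponds to one call like the one above: the subject stays fixed, and only the predicate and target change.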

This completes our work with the Individuals tab. To finish adding a new class, we finally need to assign this individual to its parallel class.

So, returning to the Classes tab, pick the green Member icon and, via its popup, assign the identically named individual:

[Figure: P duplicate class member.png]

We have now completed adding a class with a punned individual according to our metamodeling recipe.

Creating the Standard Data Class

As you may recall from the introduction, we also decided that we wanted our new data attribute of single-family housing starts to be handled differently than the indicator-type data already in the system.

To accommodate this difference, we define a new category of data sources, which we call 'Standard Data' (as opposed to Indicators). To do so, we first define a new class in the Classes tab:

[Figure: P new data class.png]

We then define a new class under it as the home for the housing starts data, which we name 'SFDHousingStarts':

[Figure: P new data class class.png]

Now, to tie this data into the existing structure, we also need to create new properties. (We suspect we may want to handle this data differently than an Indicator, so we decide to create a new, distinct relationship.) These should be object properties, since they are designed to relate a data attribute to a component and vice versa.

We thus decide to name our predicates as a pair of inverse properties, so that the relationship can be described from either direction, as follows:

  <data attribute>    isCharacteristicOf    <Component>
  <Component>    hasCharacteristic    <data attribute>

And, of course, in our specific example, the <data attribute> refers to SFDHousingStarts.
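
The effect of declaring the two predicates as inverses can be sketched as follows: stating the relationship in one direction implies the statement in the other. This is a plain-Python illustration, not a reasoner; the `peg:` prefix and the helper name `with_inverse` are assumptions for this example.

```python
# Sketch: a pair of inverse object properties.
# The "peg:" prefix and the helper name are hypothetical, chosen for illustration.

INVERSES = {
    "peg:hasCharacteristic": "peg:isCharacteristicOf",
    "peg:isCharacteristicOf": "peg:hasCharacteristic",
}

def with_inverse(subject, predicate, target):
    """Return the stated triple plus the triple implied by the inverse property."""
    stated = (subject, predicate, target)
    implied = (target, INVERSES[predicate], subject)
    return [stated, implied]

pair = with_inverse(
    "peg:SingleFamilyDwellings",   # the <Component>
    "peg:hasCharacteristic",
    "peg:SFDHousingStarts",        # the <data attribute>
)

for triple in pair:
    print(triple)
```

In an OWL ontology the inverse relationship is declared once on the properties themselves, and a reasoner then supplies the implied direction.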

To make these assignments, we go to the Object Properties tab, which has a similar layout and format as the other tabs we are now familiar with:

[Figure: P new data class properties.png]

And, we provide any other desired descriptive information and other specifications of our properties as appropriate.[1]

OK. So, we now have set up our new classes and properties that have expanded the language of our current ontology.

Creating the New Data Attribute

Our last set-up step is to add the actual SFDHousingStarts data attribute.

We had earlier declared its class name, and had located it in the class hierarchy.

Now we need to complete the various class descriptions and annotations (these screens are not shown; follow the process above).

Then, as we described under "punning" the individual, we need to create an individual with the identical name so that it picks up the annotations, and proceed to describe the types and object properties for the SFDHousingStarts data attribute. Following the steps described above, the resulting screen shows the inherited annotations (upper highlight) and the entered information for types and object properties (two lower highlights):

[Figure: P new data class individual.png]

We have now completed entering all new specifications.

Testing the Structure

Before accepting this structure as final, it is important to test it for consistency.

To do so, please follow the basic steps outlined for using reasoners with Protégé.
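
Before running a reasoner, a simple bookkeeping check can catch one easy mistake under the metamodeling recipe: a class that was added without its punned individual. The sketch below is only a rough pre-check over illustrative triples, not a substitute for reasoner-based consistency testing; the `peg:` prefix and the function name are assumptions for this example.

```python
# Sketch: find classes that lack a same-named (punned) individual.
# This is a bookkeeping check only; it does not test logical consistency.

def unpunned_classes(triples):
    """Return names declared as owl:Class but not as owl:NamedIndividual."""
    classes = {s for s, p, o in triples if p == "rdf:type" and o == "owl:Class"}
    individuals = {s for s, p, o in triples
                   if p == "rdf:type" and o == "owl:NamedIndividual"}
    return sorted(classes - individuals)

sample = [
    ("peg:SingleFamilyDwellings", "rdf:type", "owl:Class"),
    ("peg:SingleFamilyDwellings", "rdf:type", "owl:NamedIndividual"),
    ("peg:SFDHousingStarts", "rdf:type", "owl:Class"),  # individual not yet added
]

print(unpunned_classes(sample))
```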

Some Pitfalls

Protégé is an open source tool that has occasional quirks and crashes. It is also not very intuitive and imposes too many steps for certain activities. Steps taken out of sequence can create intermediate entities that later need to be cleaned up. More specifics and material in this guide may also be useful.

Protégé is thus best used for incremental updates and maintenance, adding classes and instances a few at a time, all accompanied by consistency testing and inferencing with reasoners.

Clearly, better tools for non-ontologists and specific to particular tasks at hand are warranted. Elsewhere a better normative landscape for ontology tools is presented and discussed.

Nonetheless, Protégé is a capable tool that, with some learning and familiarization, can be pressed into service for general ontology maintenance.

Working Directly with the Ontology File

Though more technical, large-scale changes or additions to an ontology are probably best effected today by direct editing of the ontology files and scripts. These types of changes, however, require a much greater familiarity with ontology languages and serializations.

Ontology maintainers who experience frustration using Protégé are encouraged to find a preferred ontology format (RDF/XML, Manchester syntax, or N3/Turtle) and to study their own and other exemplar ontologies whenever possible. Then, when larger-scale changes are needed, they can be made directly in the ontology file itself.
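
To give a flavor of direct editing, the sketch below generates a Turtle fragment for the new class from a short script, the kind of approach that scales when many classes must be added at once. The `peg:` namespace URI and the function names are hypothetical placeholders; the `owl:` and `rdfs:` namespace URIs are the standard ones.

```python
# Sketch: generating a Turtle fragment for a new class with a script.
# The peg: namespace URI below is a hypothetical placeholder, not the real one.

PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "peg": "http://example.org/peg#",
}

def turtle_prefixes(prefixes):
    """Render the @prefix declarations that open a Turtle file."""
    return "".join(f"@prefix {name}: <{uri}> .\n" for name, uri in prefixes.items())

def turtle_class(name, parent, label):
    """Render one class declaration with its parent and label."""
    return (
        f"peg:{name} a owl:Class ;\n"
        f"    rdfs:subClassOf peg:{parent} ;\n"
        f'    rdfs:label "{label}" .\n'
    )

document = turtle_prefixes(PREFIXES) + "\n" + turtle_class(
    "SingleFamilyDwellings", "Housing", "Single Family Dwellings"
)
print(document)
```

Any file edited or generated this way should still be opened in Protégé and run through a reasoner before it is accepted.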

For More Information

With this basic introduction, you are advised to check out the more advanced Protégé manual, which comprehensively works through a full ontology example in the course of the manual. Another useful source is the general intro page to Protégé 4 user documentation.

You may also want to check out the general Protégé category for other use guides on this wiki.

  1. For more information about the role and use of properties, see Sections 4.4 to 4.7 in the Protégé manual. The manual also has many sections providing guidance on property restrictions and interactions, which are also essential reading.