Ontology Development Methodologies

The development of ontologies goes by the names of ontology engineering or ontology building, and can also be investigated under the rubric of ontology learning. This document summarizes key papers and links on this topic.

Over the last twenty years, many methods have been put forward for developing ontologies. This methodological activity has diminished somewhat in recent years.

The main thrust of the papers listed herein is on domain ontologies, which model particular domains or topic areas, as opposed to reference, upper or theoretical ontologies, which are more general or encompassing. Also, little commentary is offered on any of the individual methodologies; please see the referenced papers for more details.

General Surveys
One of the first comprehensive surveys was done by Jones et al. in 1998. This study began to elucidate common stages and noted there are typically separate stages to produce first an informal description of the ontology and then its formal embodiment in an ontology language. The existence of these two descriptions is an important characteristic of many ontologies, with the informal description often carrying through to the formal description.
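As a rough illustration of the two descriptions, the sketch below starts from an informal, natural-language term list and renders it in a formal ontology language, carrying each informal definition through as an annotation. It is only a minimal sketch under stated assumptions: the choice of OWL classes in Turtle syntax, the `http://example.org/onto#` namespace, the `InformalTerm` type, the `to_turtle` helper, and the sample terms are all hypothetical and are not taken from the Jones et al. survey.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical namespace, used only for illustration.
BASE = "http://example.org/onto#"


@dataclass
class InformalTerm:
    """One entry of the informal, natural-language description."""
    name: str                     # assumed to be a simple identifier, e.g. "Car"
    definition: str               # free-text definition written by domain experts
    parent: Optional[str] = None  # name of a broader term, if any


def escape(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(terms: List[InformalTerm]) -> str:
    """Render informal terms as OWL classes in Turtle syntax.

    The informal definition is carried into the formal description
    as an rdfs:comment on each class.
    """
    lines = [
        f"@prefix : <{BASE}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for term in terms:
        parts = [
            f":{term.name} a owl:Class",
            f'    rdfs:label "{escape(term.name)}"',
            f'    rdfs:comment "{escape(term.definition)}"',
        ]
        if term.parent:
            parts.append(f"    rdfs:subClassOf :{term.parent}")
        lines.append(" ;\n".join(parts) + " .")
    return "\n".join(lines)


# Stage 1: the informal description (sample terms are made up).
informal = [
    InformalTerm("Vehicle", "A machine used to transport people or goods."),
    InformalTerm("Car", "A four-wheeled road vehicle.", parent="Vehicle"),
]

# Stage 2: the formal embodiment, still carrying the informal text.
print(to_turtle(informal))
```

Keeping the informal definition attached to the formal class is one simple way the first description can carry through to the second; real methodologies typically involve considerably more review and refinement between the two stages.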

The next major survey was done by Corcho et al. in 2003. This built on the earlier Jones survey and added more recent methods. The survey also characterized the methods by tools and tool readiness.

More recently the work of Simperl and her colleagues has focused on empirical results of ontology costing and related topics. This series has been the richest source of methodology insight in recent years. More on this work is described below.

Though not a survey of methods, one of the more accessible descriptions of ontology building is Noy and McGuinness' well-known Ontology Development 101. Alan Rector's various lecture slides on ontology building are also helpful.

However, one general observation is that the pace of new methodology development seems to have waned in the past five years or so. This does not appear to be the result of an accepted methodology having emerged.

Some Specific Methodologies
Some of the leading methodologies, presented in rough order from the oldest to newest, are as follows:


 * Cyc - this oldest of knowledge bases and ontologies has been mapped to many separate ontologies. See the separate document on the Cyc mapping methodology for an overview of this approach
 * TOVE (Toronto Virtual Enterprise) - a first-order logic approach to representing activities, states, time, resources, and cost in an enterprise integration architecture
 * IDEF5 (Integrated Definition for Ontology Description Capture Method) - part of a broader set of methodologies developed by Knowledge Based Systems, Inc.
 * ONIONS (ONtologic Integration Of Naive Sources) - a set of methods especially geared to integrating multiple information sources, with a particular emphasis on domain ontologies
 * COINS (COntext INterchange System) - a long-running series of efforts from MIT's Sloan School of Management
 * METHONTOLOGY - one of the better-known ontology building methodologies, though it has few known uses
 * OTK (On-To-Knowledge) - a methodology that came from the major EU effort at the beginning of last decade; it is a commonsense approach reflected in many ways in other methodologies
 * UPON (Unified Process for ONtologies) - a UML-based approach that is built on use cases and is incremental and iterative.

Please note that many individual projects also describe their specific methodologies; these are purposefully not included. In addition, Ensan and Du look at some specific ontology frameworks (e.g., PROMPT, OntoLearn, etc.) from a domain-specific perspective.

Some Flowcharts
Here is the general methodology as presented in the various Simperl et al. papers [cf. Fig. 1 in ]:



The Corcho et al. survey also presented a general view of the tools plus framework necessary for a complete ontology engineering environment [Fig. 4 from ]:

Other examples show ontology development workflows. Here is another from the Simperl et al. efforts [Fig. 2 in ]:

However, what is most striking about the review of the literature is the paucity of methodology figures and the generality of those that do exist. On this basis, it is unclear how widely real, actionable methods are actually used.

Best Practices Observations
The Simperl and Tempich paper, besides being a rich source of references, also provides some recommended best practices based on their comparative survey. These are:

General Recommendations

 * Enforce dissemination, e.g., publish more best practices
 * Define selection criteria for methodologies
 * Define a unified methodology following a method engineering approach
 * Support decisions on the appropriate formality level given a specific use case

Process Recommendations

 * Define selection criteria for different knowledge acquisition (KA) techniques
 * Introduce process description for the application of different KA techniques
 * Improve documentation of existing ontologies
 * Improve ontology location facilities
 * Build robust translators between formalisms
 * Build modular ontologies
 * Define metrics for ontology evaluation
 * Offer user-oriented process descriptions for ontology evaluation

Organizational Recommendations

 * Provide ontology engineering activity descriptions using domain-specific terminology
 * Improve consensus-making process support

Technological Recommendations

 * Provide tools to extract ontologies from structured data sources
 * Build lightweight ontology engineering environments
 * Improve the quality of tools for domain analysis, ontology evaluation, documentation
 * Include methodological support in ontology editors
 * Build tools supporting collaborative ontology engineering.

Summary of Observations
This review has not set out to characterize specific methodologies, nor their strengths and weaknesses. Yet the research seems to indicate this state of methodology development in the field:


 * Very few discrete methods exist, and those that do are relatively old
 * The methods tend to cluster into either incremental, iterative approaches or more comprehensive ones
 * Most methodologies share a general logical sequence of steps, from assessment through deployment, testing and refinement
 * Actual specifics and flowcharts are quite limited; with the exception of the UML-based systems, most appear not to meet enterprise standards
 * The supporting toolsets are not discussed much, and most of the examples are based solely on a governing tool. Tool integration and interoperability are almost absent from the narratives
 * This does not appear to be a very active area of current research.