Workflow Perspectives on OSF
Other materials describe the overall architecture or layered approach of the Open Semantic Framework (OSF). These views are useful, but they do not convey a practical understanding of how the pieces fit together or how they are developed and maintained.
This document presents seven workflows covering various aspects of developing and maintaining an OSF instance (based on Drupal). Each workflow section also cross-references other key documentation on this zWiki and points to possible tools for conducting that workflow.
The seven workflows are shown in the diagram below. Each workflow is color-coded and related to the others. The basic interaction with an OSF instance tends to occur from left to right in the diagram, though the individual parts are not strictly sequential. As each of the seven workflows is described below, it is keyed to the same color-coded portion of the overall workflow.
Each of the component workflows is itself described as a series of inter-relating activities or components.
Installation Workflow
Installation is mostly a one-time effort and proceeds on a more-or-less sequential basis. As the various components of the stack are installed, each is configured and tested for proper installation.
The installation guide is the governing document for this process, with quite detailed scripts and configuration tests to follow. The blue bubbles in the diagram represent the major software components of Virtuoso (RDF triple store), Solr (full-text search) and Drupal (content management system).
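The install-then-test sequence can be pictured as an ordered checklist that stops at the first component failing its check. The following is a minimal sketch only: the function name `run_install_checks`, the component order and the placeholder checks are assumptions for illustration, not part of the installation guide or its scripts.

```python
from typing import Callable, List, Tuple

# Each step pairs a component name with a callable that returns True
# when the component passes its post-install check.
Step = Tuple[str, Callable[[], bool]]


def run_install_checks(steps: List[Step]) -> List[str]:
    """Run the checks in order and stop at the first failure."""
    passed: List[str] = []
    for name, check in steps:
        if not check():
            raise RuntimeError(
                f"{name} failed its check; components passed so far: {passed}"
            )
        passed.append(name)
    return passed


# Placeholder checks (assumptions). A real check might query a service
# port or run one of the configuration tests from the installation guide.
steps: List[Step] = [
    ("Virtuoso", lambda: True),
    ("Solr", lambda: True),
    ("Drupal", lambda: True),
]

verified = run_install_checks(steps)
```

Stopping at the first failure mirrors the sequential nature of the installation: there is little value in testing a later component until the earlier ones are known to work.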
Another portion of this workflow is to set up the tools for the backoffice access and management, such as PuTTY and WinSCP (among others).
Further details on the steps in this portion of the workflow are provided by the Installation Workflow document and its associated subsidiary documents.
Configure & Presentation Workflow
The three major clusters of effort in this workflow are the design of the portal, including a determination of its intended functionality; the setting of the content structure (stubbing of the site map) for the portal; and determining user groups and access rights. Each of these, in turn, is dependent on one or more plug-in modules to the Drupal system.
Some of these modules are part of the OSF-Drupal series of OSF modules, and others are evaluated and drawn from the more than 8000 third-party plug-in modules to Drupal.
The Design aspect involves picking and then modifying a theme for the portal. The starting point may be one of the existing open source Drupal themes, including those more specifically recommended for OSF. If so, it will likely be necessary to make some minor layout modifications to the PHP code and some CSS (styling) changes. Theming (skinning) of the various OSF widgets (see below) also occurs as part of this workflow.
The Content Structure aspect involves defining and then stubbing out placeholders for eventual content. Think of this step as creating a site map structure for the OSF site, including major Drupal definitions for blocks, Views and menus. Some of the entity types are derived from the named entity dictionaries used by a given project.
More complicated User assignments and groups are best handled through a module such as Drupal's Organic Groups. In any event, determination of user groups (such as anonymous, admins, curators, editors, etc.) is a necessary early determination, though these may be changed or modified over time.
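As a way of thinking through groups and access rights before configuring them in Drupal, the assignments can be written down as a simple role-to-permission table. This is an illustrative sketch only: the role names come from the text above, while the permission names and the `can` helper are hypothetical and do not correspond to Drupal's or Organic Groups' actual permission system.

```python
# Hypothetical permission sets for the user groups named above.
ROLE_PERMISSIONS = {
    "anonymous": {"view"},
    "editors": {"view", "edit"},
    "curators": {"view", "edit", "publish"},
    "admins": {"view", "edit", "publish", "administer"},
}


def can(role: str, action: str) -> bool:
    """Return True if the role is allowed to perform the action."""
    return action in ROLE_PERMISSIONS.get(role, set())


# Unknown roles get no permissions at all.
editor_can_publish = can("editors", "publish")
curator_can_publish = can("curators", "publish")
```

Because the table is plain data, it is easy to revise as the groups change over time.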
For site functionality, Modules must be evaluated and chosen to add to the core system. Some of these steps and their configuration settings are provided in the guidelines for setting up Drupal document.
None of the initial decisions "lock in" eventual design and functionality. These may be modified at any time moving forward.
Further details on the steps in this portion of the workflow are provided by the Configuration Workflow document and its associated subsidiary documents.
Structured Data Workflow
Of course, a key aspect of any OSF instance is the access and management of structured data.
There are basically two paths for getting structured data into the system. The first, which generally involves smaller datasets, is the manual conversion of the source data to one of the pre-configured OSF import formats: RDF, JSON, XML or CSV. The JSON, XML and CSV formats are based on the irON notation; a good case study for using spreadsheets is also available.
The second path (bottom branch) is the conversion of internal structured data, often from a relational data store. Various converters and templates are available for these transformations. One excellent tool is FME from Safe Software (the example shown uses it with a spatial data infrastructure (SDI) data store), though a very large number of options exist for extract, transform and load (ETL).
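For the first path, a small dataset held as simple records can be written out as CSV with a few lines of code. The sketch below is generic: the column names and sample records are hypothetical, and it does not reproduce the specific conventions of the irON notation, which should be taken from the irON documentation.

```python
import csv
import io
from typing import Dict, List


def records_to_csv(records: List[Dict[str, str]], columns: List[str]) -> str:
    """Serialize records to CSV text, leaving missing values empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Only the requested columns are written; absent keys become "".
        writer.writerow({column: record.get(column, "") for column in columns})
    return buffer.getvalue()


# Hypothetical source records, for example rows copied from a spreadsheet.
records = [
    {"id": "rec-1", "type": "Park", "label": "Central Park"},
    {"id": "rec-2", "type": "School"},
]

csv_text = records_to_csv(records, ["id", "type", "label"])
```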
In the latter case, procedures for polling for updates, triggering notice of updates, and only extracting the deltas for the specific information changed can help reduce network traffic and upload/conversion/indexing times.
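One common way to extract only the deltas is to keep a fingerprint of each record from the previous run and compare it against the current extract. The following is a minimal sketch under that assumption; the function names and record layout are hypothetical and are not taken from FME or from any OSF tool.

```python
import hashlib
import json
from typing import Dict, List, Tuple


def fingerprint(record: dict) -> str:
    """Hash a record so that any change to its content changes the hash."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def compute_delta(
    previous: Dict[str, str], current: Dict[str, dict]
) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Compare stored fingerprints with the current records.

    previous maps record id -> fingerprint saved by the last run.
    current maps record id -> record as extracted now.
    Returns the delta plus the fingerprints to store for the next run.
    """
    now = {rid: fingerprint(rec) for rid, rec in current.items()}
    delta = {
        "added": sorted(rid for rid in now if rid not in previous),
        "changed": sorted(
            rid for rid in now if rid in previous and previous[rid] != now[rid]
        ),
        "removed": sorted(rid for rid in previous if rid not in now),
    }
    return delta, now


# First run: everything is new. Second run: one change, one add, one removal.
first = {"a": {"name": "Park"}, "b": {"name": "School"}}
_, stored = compute_delta({}, first)
second = {"a": {"name": "Park"}, "b": {"name": "High School"}, "c": {"name": "Library"}}
del second["a"]
delta, _ = compute_delta(stored, second)
```

Only the records listed in the delta then need to be converted, uploaded and re-indexed.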
Further details on the steps in this portion of the workflow (including some portions of updating ontologies) are provided by the Datasets Workflow document and its associated subsidiary documents.
Content Workflow
The structured data from the prior workflow process is then matched with the remaining necessary content for the site. This content may be of any form and media (since all are supported by various Drupal modules), but, in general, the major emphasis is on text content.
Existing text content may be imported to the portal, or new content can be added via various WYSIWYG graphical editors for use within Drupal. (The excellent WYSIWYG Drupal module provides an access point to a variety of off-the-shelf, free WYSIWYG editors; we generally use TinyMCE, but multiple editors can also be installed simultaneously.)
The intent of this workflow component is to complete content entry for the stubs earlier created during the configuration phase.
The OSF tagger tags content based on the concepts in the domain ontology (see below) and the named entities (as contained in "dictionaries") used by a given project. Once tagged, this information can also be related to the other structured data in the system.
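The general idea of dictionary-based tagging can be shown with a simple label lookup. This sketch is not the OSF tagger's implementation: the `tag_text` function, the matching rule (case-insensitive whole-word matches) and the `ex:` identifiers are assumptions made purely for illustration.

```python
import re
from typing import Dict, List, Tuple


def tag_text(text: str, labels_to_uri: Dict[str, str]) -> List[Tuple[int, str, str]]:
    """Find each dictionary label in the text.

    Returns (start offset, matched text, identifier) tuples in text order.
    """
    tags: List[Tuple[int, str, str]] = []
    for label, uri in labels_to_uri.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            tags.append((match.start(), match.group(0), uri))
    return sorted(tags)


# Hypothetical dictionary mixing an ontology concept and a named entity.
dictionary = {
    "park": "ex:Park",
    "Lincoln School": "ex:LincolnSchool",
}

tags = tag_text("The city park is next to Lincoln School.", dictionary)
```

Because each tag carries an identifier rather than just a string, the tagged passage can be linked to any structured record that uses the same identifier.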
Once all of this various content is entered into the system, it is then available for access and manipulation by the various OSF-Drupal modules (see figure above) and OSF widgets (see below).
Further details on the steps in this portion of the workflow are provided by the Content Workflow document and its associated subsidiary documents.
Ontologies Workflow
Though the flowchart below appears rather complicated, there are really only three tasks that most OSF administrators need worry about with respect to ontologies:
- Adding a concept to the domain ontology (a class) and setting its relationships to other concepts
- Adding a dataset attribute (data characteristic) for various dataset records, or
- Adding or changing an annotation for either of these things, such as the labels or descriptions of the thing.
In actuality, of course, editing, modifying or deleting existing information is also important, but these are simpler variants of the basic add ("create") functions and their user interfaces.
The OSF interface provides three clean user interfaces to these three basic activities.
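In RDF terms, each of the three basic activities amounts to adding a small number of statements. The sketch below represents an ontology as a plain set of triples to show what those statements might look like. It does not use the OWLAPI or any OSF tool; the helper names and the `http://example.org/ontology#` namespace are hypothetical, while the RDF, RDFS and OWL terms are the standard W3C ones.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/ontology#"  # hypothetical domain namespace


def add_concept(graph: Set[Triple], concept: str, parent: str) -> None:
    """Task 1: declare a class and relate it to an existing concept."""
    graph.add((concept, RDF_TYPE, OWL + "Class"))
    graph.add((concept, RDFS + "subClassOf", parent))


def add_attribute(graph: Set[Triple], attribute: str, domain: str) -> None:
    """Task 2: declare a data attribute that applies to a kind of record."""
    graph.add((attribute, RDF_TYPE, OWL + "DatatypeProperty"))
    graph.add((attribute, RDFS + "domain", domain))


def add_annotation(graph: Set[Triple], subject: str, label: str) -> None:
    """Task 3: attach a human-readable label to a concept or attribute."""
    graph.add((subject, RDFS + "label", label))


ontology: Set[Triple] = set()
add_concept(ontology, EX + "Playground", EX + "Park")
add_attribute(ontology, EX + "openingYear", EX + "Park")
add_annotation(ontology, EX + "Playground", "Playground")
```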
These basic activities may be applied to the three major governing ontologies in any OSF installation:
- The domain ontology, which captures the conceptual description of the instance's domain space
- The semantic components ontology (SCO), which sets what widgets may display what kinds of data, and
- irON for the instance record attributes and metadata (annotations).
All of the OSF ontology tools work off of the OWLAPI as the intermediary access point. The ontologies themselves are indexed as structured data (RDF with Virtuoso) or full text (Solr) for various search, retrieval and reasoning activities.
Because of the central use of the OWLAPI, it is also possible to use the Protégé editor/IDE environment against the ontologies, which also provides reasoners and consistency checking.
Further details on the steps in this portion of the workflow are provided by the Ontologies Workflow document and its associated subsidiary documents.
Filter & Select Workflow
The filter and select activities are driven by user interaction, with no additional admin tools required. This workflow is actually the culmination of all of the previous sequences in that it exposes the structured data to users, enables them to slice-and-dice it, and then to view it with a choice of relevant OSF widgets.
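The filter-then-display sequence can be sketched as two steps: narrow the records by the user's criteria, then offer only the widgets whose data requirements the remaining records satisfy. This is an illustration under stated assumptions: the widget names, their required attributes and both function names are hypothetical, and in an actual OSF instance the widget-to-data mapping is governed by the semantic components ontology rather than by a hard-coded table.

```python
from typing import Dict, List, Set


def filter_records(records: List[dict], **criteria) -> List[dict]:
    """Keep the records whose attributes equal every given criterion."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


def relevant_widgets(records: List[dict], requires: Dict[str, Set[str]]) -> List[str]:
    """Return the widgets whose required attributes every record has."""
    if not records:
        return []
    return sorted(
        name
        for name, needed in requires.items()
        if all(needed.issubset(record.keys()) for record in records)
    )


# Hypothetical widget requirements and sample records.
WIDGET_REQUIRES = {
    "map": {"lat", "long"},
    "bar_chart": {"population"},
}

records = [
    {"type": "Park", "name": "Central Park", "lat": 40.78, "long": -73.97},
    {"type": "Park", "name": "Riverside", "lat": 40.80, "long": -73.97},
    {"type": "Town", "name": "Smallville", "population": 1200},
]

parks = filter_records(records, type="Park")
widgets_for_parks = relevant_widgets(parks, WIDGET_REQUIRES)
```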
With respect to the OSF widgets, here is their workflow, shown as an animation:
Considerably more detail and explanation is available for these OSF widgets.
Because this portion of the workflow is largely self-contained and user-driven, the associated Filter & Select Workflow document contains no new steps, but only links to some useful, subsidiary documents.
Maintenance Workflow
The ongoing maintenance of an OSF instance is mostly a standard Drupal activity. Major activities that may occur include moderating comments; rotating or adding new content; managing users; and continued documentation of the site for internal tech transfer and training. If the portal embraces other aspects of community engagement (social media), these need to be handled as part of this workflow as well.
All aspects of the site and its constituent data may be changed or added to at any time.
Further details on the steps in this portion of the workflow are provided by the Maintenance Workflow document and its associated subsidiary documents.
- The current release of OSF does not yet have these components included; they will be released to the open source SVN by early summer.