Adding a New Dataset
This guide walks through the basic steps for adding a new dataset to your local instance and then integrating it, using various Open Semantic Framework tools.
The Example Case
Our example adds a new dataset of single-family dwelling housing starts for the year 2006, by hypothetical neighborhood. The data values are submitted under the "foo" namespace. As the end of the example shows, the attributes introduced by this data are new and also need to be accommodated in the governing ontology of the instance.
Preparing the Dataset
The dataset is prepared in the instance record and Object Notation (irON), using its comma-delimited (spreadsheet) CSV format, called commON. For more information on commON and how to define datasets in it, see the separate commON case study.
The example dataset can be downloaded (SFD_housing_starts.csv) for local inspection; it appears as follows:
The basic layout begins with a definition of the dataset and its metadata (&&dataset), and then presents the actual data records (&&recordList). For more detail, see the commON case study.
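The two-part layout described above can be illustrated with a minimal Python sketch. The sample text is a hypothetical, simplified stand-in and not the contents of SFD_housing_starts.csv; the record identifiers, the foo:Neighborhood type, the foo:sfdStarts2006 attribute, and the parse_common helper are all assumptions made for illustration. The sketch assumes only that a cell starting with && opens a section and that the next non-empty row names that section's attributes.

```python
import csv
import io

# Hypothetical, simplified commON-style text (NOT the real example file).
SAMPLE = """
&&dataset,,
&id,&prefLabel,&description
sfd_starts,SFD Housing Starts,Single-family dwelling housing starts for 2006
,,
&&recordList,,
&id,&type,&foo:sfdStarts2006
neighborhood_a,foo:Neighborhood,12
neighborhood_b,foo:Neighborhood,7
"""


def parse_common(text):
    """Split commON-style CSV text into sections keyed by their && marker."""
    sections = {}
    current = None  # name of the section being read
    header = None   # attribute names for the current section
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip blank spacer rows
        if cells[0].startswith("&&"):
            # A new section marker such as &&dataset or &&recordList.
            current = cells[0][2:]
            sections[current] = []
            header = None
        elif current is None:
            continue  # ignore anything before the first section marker
        elif header is None:
            # First row after a marker lists the attributes (&id, &type, ...).
            header = [cell.lstrip("&") for cell in cells]
        else:
            sections[current].append(dict(zip(header, cells)))
    return sections


sections = parse_common(SAMPLE)
for record in sections["recordList"]:
    print(record["id"], record["foo:sfdStarts2006"])
```

A real commON file can carry more than this sketch handles, so treat it as a way to inspect the layout rather than as a replacement for the OSF import tools.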
Import a Dataset
Click the top Configuration menu item, then click Configure OSF for Drupal modules.
To import a dataset, click the + Import Dataset link on the DATASETS & NETWORKS tab.
The Import Dataset page will let you import a dataset serialized in one of the following formats:
To import a new dataset, specify the following:
Dataset file to import
- Select the RDF file you want to import from your local computer
- Select the type of RDF file you are trying to import
- Define the name of the Dataset you are importing
Dataset description (Optional)
- Optionally define the description of that dataset
Custom Dataset URI (Optional)
- Define the URI of the dataset. If you don't provide a URI, OSF for Drupal will create one for you
Save dataset on this network
- Choose the OSF Web Services endpoint into which you want to import the dataset
Which role should have full permissions on this dataset
- Determine which user role should have full CRUD permissions on this dataset
Then click the Import button to start the import process.
At this point, the dataset has been created in the OSF instance, and all the content of the file you imported has been indexed in that newly created dataset.
Once the dataset is imported, you are redirected to a new page. If you checked the Check attributes and types existence option, any warnings appear on that page. If you did not, the page prompts you to click the Expose Imported Dataset button, which takes you to the form you fill in to expose the dataset to Drupal.
The last step is to expose the dataset you just imported into the OSF instance to Drupal. If you skip this step, the dataset will exist on the OSF instance, but no OSF for Drupal module will be able to use it.
- This is the name you want to give to this dataset. This name is local to this Drupal instance. It will be used to refer to the dataset within the user interface of this Drupal portal
Dataset is searchable
- This specifies whether the dataset is searchable by the OSF SearchAPI module. If this option is unchecked, the content of this dataset is excluded from searches performed by the OSF SearchAPI module
Once you are done, click the Save button to expose the newly imported dataset to Drupal.
Now you can see the newly imported dataset in the list of accessible datasets.
Conceptual Implications of the Dataset
Now that the dataset has been added, we need to make sure that it is properly modeled and linked into the domain ontology guiding your specific instance. (If you are not already familiar with them, you may want to see the other background material regarding ontologies on this wiki.)
You may find, for example, that properly including your new data in your system requires a "bridging concept" that is currently missing beneath an existing ("parent") concept already in the ontology, as well as some attributes (data) that describe that concept.
Let's say, for example, that our existing ontology has the concept of housing, but not the concept of single-family dwellings or the specific data attributes captured by our 'SFD Housing Starts' data. The basic conceptual gap this represents appears as follows, with housing representing the "parent" concept and single-family dwellings the "child":
Integrating the Dataset with the Ontology
Because of the conceptual implications noted above, some changes to the existing ontology need to be made in order to effect this integration. Please see the following guide on Adding an Ontology Concept using Protégé for the next steps in this process.
Dealing with Missing Attributes and Types
If you are using structImport to import a dataset and the option "Check for missing attributes and types in the imported dataset." is enabled, or if your import script supports that functionality, then each time you import a new dataset the system will tell you which attributes or types used in the dataset are missing from the ontology structure currently used by the OSF instance.
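The idea behind such a check can be sketched in a few lines of Python. This is only an illustration of the comparison involved and not structImport's actual implementation; the find_missing helper, the sample records, and the sets of known attributes and types are all hypothetical.

```python
def find_missing(records, known_attributes, known_types):
    """Return (missing_attributes, missing_types) used by the records."""
    used_attributes = set()
    used_types = set()
    for record in records:
        for name, value in record.items():
            if name == "type":
                used_types.add(value)
            elif name != "id":
                # Every key other than id and type is treated as an attribute.
                used_attributes.add(name)
    missing_attributes = sorted(used_attributes - known_attributes)
    missing_types = sorted(used_types - known_types)
    return missing_attributes, missing_types


# Hypothetical records, as they might look after reading a dataset file.
records = [
    {"id": "neighborhood_a", "type": "foo:Neighborhood",
     "prefLabel": "Neighborhood A", "foo:sfdStarts2006": "12"},
    {"id": "neighborhood_b", "type": "foo:Neighborhood",
     "prefLabel": "Neighborhood B", "foo:sfdStarts2006": "7"},
]

# Hypothetical vocabulary already defined in the instance's ontologies.
known_attributes = {"prefLabel"}
known_types = {"foo:Neighborhood"}

missing_attributes, missing_types = find_missing(
    records, known_attributes, known_types
)
print("Missing attributes:", missing_attributes)
print("Missing types:", missing_types)
```

In this made-up case, the housing-starts attribute is reported as missing, which is the kind of gap the ontology changes described above are meant to close.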