Archive 1.x:OSF-Drupal and structWSF Interaction

The OSF-Drupal modules for Drupal rely on the structWSF Web service framework to manage and publish structured data content. Every action performed using any OSF-Drupal module ends up being a series of queries sent to various structWSF Web service endpoints. This article describes how an OSF-Drupal node interacts with one or multiple structWSF instances, and it shows all of the internal registries used to manage that interaction.

OSF-Drupal Is A Proxy
An OSF-Drupal node can be understood as being a proxy between a user and a structWSF instance. More precisely, an OSF-Drupal module should be understood as a User Interface Proxy: a proxy with a user interface that generates queries that are sent to one or multiple structWSF instances.

In every case, queries sent by OSF-Drupal are sent on behalf of the user. This means that, with a few rare exceptions, every query sent by an OSF-Drupal node is authenticated twice:


 * 1) One time to make sure that the OSF-Drupal node (as a proxy) has the rights to perform the requested action
 * 2) Another time to make sure that the user of the OSF-Drupal node has the rights to perform the requested action.

This means that even if a user has permission to perform a certain action, if that user goes through a proxy (in this case, an OSF-Drupal node) that doesn't have the rights to perform that action, then the structWSF instance will return an unauthorized error to the user.

This double authentication prevents security breaches where a user, or a proxy, pretends to be someone else.

OSF-Drupal Permissions on structWSF
An OSF-Drupal node normally has full rights on a structWSF instance. It is possible that it doesn't, but most of the time it does. These permissions are defined by the administrator of the structWSF instance (see below).

Datasets Registries in OSF-Drupal
An OSF-Drupal node has two kinds of datasets:


 * 1) The ones that it creates on a structWSF instance, and
 * 2) The ones that are already created on a structWSF instance, but that get linked to the OSF-Drupal node.

We refer to the first kind as datasets and to the second kind as linked datasets.

Linked datasets are just a way to aggregate datasets from different structWSF instances into the same OSF-Drupal Web site portal. That way, the data managed by different people and organizations can live in the same OSF-Drupal Web site. More information about linked datasets can be read here.

Now let's discuss how an OSF-Drupal instance manages all of the structWSF instances registered to it, and how it manages the datasets created in, or linked from, these structWSF instances.

structWSF Instances Registry
With respect to access, the first thing OSF-Drupal does is maintain a registry of all the structWSF instances registered to the OSF-Drupal instance. A dedicated module is used to administer these instances. This module lets OSF-Drupal node administrators either subscribe remote structWSF instances to the node or unsubscribe them from it.

The registry is saved in a Drupal variable. That registry is a simple array of structWSF base URLs, and it can easily be read through Drupal's variable API.

Datasets Registry
Then we have a registry of datasets that have been created from, or linked to, the OSF-Drupal instance. This registry of datasets is composed of multiple Drupal variables that share the following pattern.

First, each dataset has two variables. The name of each variable includes the ID of the Organic Group that is attached to this dataset.

The first variable holds the provenance of the dataset: that is, the URL of the structWSF instance where it is hosted.

The second variable holds the URI of the dataset: that is, its unique identifier.
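
The two-variables-per-dataset layout can be sketched as follows. This is only an illustration in Python, not Drupal's PHP: the variable store is modeled as a plain dictionary, and the key patterns `dataset-provenance-<gid>` and `dataset-uri-<gid>` are hypothetical stand-ins, because the actual variable names are not given here. The example URLs are likewise invented.

```python
# Hypothetical model of the Drupal variable store as a plain dictionary.
# The key patterns below are illustrative assumptions, not the real names.
variables = {}


def register_dataset(gid, wsf_url, dataset_uri):
    """Record where a dataset is hosted and its URI, keyed by Organic Group ID."""
    variables[f"dataset-provenance-{gid}"] = wsf_url
    variables[f"dataset-uri-{gid}"] = dataset_uri


def lookup_dataset(gid):
    """Return (provenance, uri) for the dataset attached to an Organic Group.

    Both values are None when no dataset is registered for that group.
    """
    return (
        variables.get(f"dataset-provenance-{gid}"),
        variables.get(f"dataset-uri-{gid}"),
    )


# Example: the dataset attached to Organic Group 12.
register_dataset(
    12,
    "http://wsf-1.example.org/ws/",
    "http://wsf-1.example.org/datasets/books/",
)
```

Keeping the provenance next to the URI is what lets a single node mix datasets hosted on different structWSF instances.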

Datasets Permissions in structWSF
Each time a user interacts with OSF-Drupal, a series of queries is sent to one of the structWSF instances registered on the OSF-Drupal instance. These queries get validated by structWSF, and the outcome is displayed in the OSF-Drupal user interface (results of the queries, access error messages, processing errors, etc.).

The entire validation workflow on structWSF's side is described in the Datasets and Access Rights (structWSF) page.

OSF-Drupal Users On structWSF
This section describes how OSF-Drupal users are handled on a structWSF instance. The basic process is that OSF-Drupal users (proxied users) get authenticated, and then various OSF-Drupal modules interact with structWSF, including specific use rights authentication.

In the Datasets and Access Rights (structWSF) page, we discuss how structWSF authenticates queries sent directly by a user. In this section, we extend that basic behavior to show how it authenticates proxied queries.

structWSF Requester(s)
Most of the structWSF Web service endpoints accept a parameter that identifies the requester. Proxy systems (such as OSF-Drupal) use this parameter to tell the structWSF endpoint that the query is being issued on behalf of the IP address of a given user.

When a Web service endpoint receives such a query, it may authenticate the query based on two things (the two points listed earlier):


 * 1) The IP address of the user given in that parameter, to make sure that the user has the permissions to perform that action on the structWSF instance
 * 2) The IP address of the proxy system that sent the query, to make sure that the proxy system also has the permissions to perform that action on the structWSF instance.

If either of these two IP addresses doesn't have the permissions to perform that action, then the endpoint will return an unauthenticated message.
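
The double check can be sketched as below. This is a minimal Python illustration, not structWSF's actual implementation: the access table, the IP addresses and the action names are all hypothetical.

```python
# Hypothetical access table: requester IP -> set of permitted actions.
# The addresses and action names are illustrative assumptions.
ACCESS = {
    "10.0.0.5": {"read", "create", "update", "delete"},  # the proxy (OSF-Drupal node)
    "192.0.2.44": {"read"},                              # an end user
}


def is_authorized(proxy_ip, user_ip, action):
    """Accept a proxied query only if BOTH the proxy and the user may do the action."""
    proxy_ok = action in ACCESS.get(proxy_ip, set())
    user_ok = action in ACCESS.get(user_ip, set())
    return proxy_ok and user_ok
```

With this table, `is_authorized("10.0.0.5", "192.0.2.44", "read")` passes, while the same call with `"delete"` is refused because the user lacks that right, even though the proxy has it.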

Remember that OSF-Drupal instances normally have full rights on the registered structWSF networks. These rights are granted by the structWSF administrator(s). When this is the case, it implies strong trust between the OSF-Drupal administrator(s) and the structWSF administrator(s), if they are different people or organizations.

Overloaded IPs
For structWSF, a user has a single IP address at any given time, although that address may vary between home, work, or other locations. An OSF-Drupal proxy gives its users access to one or multiple structWSF instances. However, all users of a given OSF-Drupal instance share the same IP address: that of the OSF-Drupal node's Web server. So, under default conditions, all users of the proxy have the same privileges.

This is why the concept of overloading IP addresses has been introduced in structWSF. Overloading an IP address is nothing other than appending some value to it. This overloaded IP then gets used as the requester parameter value of any structWSF Web service. This additional information is normally the ID of a user managed by the requesting proxy service.

An IP address is overloaded by using a separator symbol. This symbol is appended at the end of the IP address, and what comes after it is the ID. The burden of managing these overloaded IP addresses is put on the shoulders of the proxy service. It is the proxy (in this case OSF-Drupal) that has to manage the linkage between the overloaded IP addresses that get defined on the different structWSF instances and its own internal user IDs.
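
A minimal Python sketch of building and splitting an overloaded IP address follows. The `::` separator used here is an assumption made for illustration only; the symbol that structWSF actually uses is not given in this text. The function names are hypothetical as well.

```python
# The separator is an assumption for illustration; the symbol actually used
# by structWSF is not specified here. Note that "::" would clash with IPv6
# notation, so this sketch only makes sense for IPv4 addresses.
SEPARATOR = "::"


def overload_ip(ip, user_id, sep=SEPARATOR):
    """Append a proxy-managed user ID to the proxy's IP address."""
    return f"{ip}{sep}{user_id}"


def split_overloaded_ip(value, sep=SEPARATOR):
    """Split a value into (ip, user_id); user_id is None if not overloaded."""
    ip, found, user_id = value.partition(sep)
    return (ip, user_id) if found else (ip, None)
```

The proxy would call `overload_ip` with its own server address and its internal user ID before each query, which gives every one of its users a distinct requester identity on the structWSF side.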

Such an overloaded IP address can then be used in the access definitions described in the Datasets and Access Rights (structWSF) page.

The Network Effect
A single OSF-Drupal instance can interact with multiple structWSF instances at the same time. If an OSF-Drupal node manages datasets hosted on, say, 3 different structWSF instances, then each time a Browse, Search, etc., page gets loaded, each of these structWSF instances will be queried to check what data is accessible by the requesting user. This is where the network effect of the structWSF design kicks in (see further the Distributed Networks with structWSF document).
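
The fan-out described above can be sketched in Python as follows. This is an illustration only: `query_instance` is a hypothetical callable standing in for a real structWSF endpoint query, and the function name `browse_all` is invented.

```python
def browse_all(instances, requester, query_instance):
    """Query every registered structWSF instance and merge the results.

    `instances` is a list of structWSF base URLs; `query_instance` is a
    hypothetical callable (base_url, requester) -> list of records that the
    requester is allowed to see on that instance.
    """
    results = []
    for base_url in instances:
        results.extend(query_instance(base_url, requester))
    return results
```

Each instance applies its own access rules, so the merged list contains only what the requesting user is allowed to see across the whole network.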

This distributed ability is possible because all capabilities of the network are funneled through the various Web service endpoint queries. This characteristic is quite powerful. With this design, for example, big datasets may be managed on their own structWSF instances, a structWSF instance could host datasets specific to a certain domain, etc.

SID (server ID)
Each structWSF instance has its own SID (server ID). This server ID is used by OSF-Drupal to make sure that the same structWSF instance is not queried multiple times with the same query. Theoretically, the same structWSF instance could answer queries that are sent to different domains such as localhost, my-domain-1.com, my-domain-2.com, etc.

This use case can happen if a user creates new datasets, or links to existing datasets, by using different domains that all refer to the same structWSF instance.

SID on structWSF
The SID is created by the structWSF instance. The SID file is created by the root index.php structWSF file. A SID is really just a unique identifier string created from the MD5 hash of the current microtime. The SID directory is specified by a variable of the index.php root file.
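
The idea of hashing the current time into an identifier can be sketched in Python as below. This is an approximation and not the structWSF code: `time.time_ns()` stands in for PHP's microtime, and writing the SID to a file in the SID directory is not shown.

```python
import hashlib
import time


def generate_sid():
    """Return an MD5 hex digest of the current time as a server ID.

    time.time_ns() is used here as a stand-in for PHP's microtime; this is
    an assumption of the sketch, not the structWSF implementation.
    """
    now = str(time.time_ns())
    return hashlib.md5(now.encode("utf-8")).hexdigest()
```

The result is a 32-character hexadecimal string. It is meant to identify a server, not to protect anything, so the weaknesses of MD5 as a cryptographic hash do not matter for this use.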

SID on OSF-Drupal
The SID registry is saved in a Drupal variable called SID-Registry, which can be read within Drupal through the variable API.

Reading it returns an array with all of the URLs from which you can access a given unique structWSF instance (SID).
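
A minimal Python sketch of how such a registry can prevent duplicate queries follows. It assumes the registry maps each SID to the list of URLs that reach that instance; this shape, the function name and the example values are illustrative assumptions rather than the actual OSF-Drupal data structure.

```python
def unique_instances(sid_registry):
    """Pick one URL per SID so that each structWSF instance is queried once.

    `sid_registry` is assumed to map a SID to the list of URLs that reach
    that instance; SIDs with no URL are skipped.
    """
    return {sid: urls[0] for sid, urls in sid_registry.items() if urls}


# Example: two domains point at the same instance, a third at another one.
registry = {
    "sid-aaa": ["http://localhost/ws/", "http://my-domain-1.com/ws/"],
    "sid-bbb": ["http://my-domain-2.com/ws/"],
}
targets = unique_instances(registry)
```

Here `targets` holds one URL for each of the two SIDs, so a Browse or Search page would send two queries instead of three.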