Modifying Search Results via Terms Boosting

The Search endpoint supports a series of parameters that let users boost the terms of their search query to modify the relevancy of the returned results. The main characteristic of OSF Web Service's Search endpoint is that the structure of the data can be used to get more relevant results from a full text search query.

There are three main areas that can be influenced to change the scoring of returned search results:


 * 1) Record type boosting
 * 2) Record dataset provenance boosting
 * 3) Record attribute and/or attribute/value boosting

Boosting the weight of a type, a dataset or an attribute/value only affects the score of each result. It does not determine whether a result is returned. What determines whether a result is added to the returned results is the full text search query and the filters defined for that query.
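The difference between matching and scoring can be illustrated with a small, self-contained model. The sketch below is not the OSF Web Service API: the record fields, the `matches`, `score` and `search` functions, and the multiplicative boost scheme are all assumptions made for illustration. It shows that changing the boosts reorders the results without changing which records are returned.

```python
# Toy model (an assumption, not the OSF Web Service implementation):
# the query and filters decide membership, boosts only scale the score.

RECORDS = [
    {"id": "doc-1", "type": "Article", "dataset": "documents", "text": "cancer screening guide"},
    {"id": "doc-2", "type": "Person", "dataset": "people", "text": "cancer researcher"},
    {"id": "doc-3", "type": "Article", "dataset": "documents", "text": "nutrition basics"},
]


def matches(record, query, dataset_filter=None):
    """Return True if the record passes the dataset filter and contains the query word."""
    if dataset_filter is not None and record["dataset"] not in dataset_filter:
        return False
    return query in record["text"].split()


def score(record, type_boosts, dataset_boosts):
    """Multiply a fixed base score by the boost of the record's type and dataset."""
    base = 1.0
    return base * type_boosts.get(record["type"], 1.0) * dataset_boosts.get(record["dataset"], 1.0)


def search(query, type_boosts=None, dataset_boosts=None, dataset_filter=None):
    """Return (id, score) pairs for matching records, highest score first."""
    type_boosts = type_boosts or {}
    dataset_boosts = dataset_boosts or {}
    hits = [r for r in RECORDS if matches(r, query, dataset_filter)]
    scored = [(r["id"], score(r, type_boosts, dataset_boosts)) for r in hits]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


plain = search("cancer")
boosted = search("cancer", type_boosts={"Person": 5.0})
```

In this model, `plain` and `boosted` contain the same two records; only their order and scores differ. Removing a record from the results would require changing the query or the `dataset_filter`, not the boosts.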

Using the Data Structure To Improve Relevancy
In OSF Web Service, all the data is in RDF. This means that all the content is fully structured, and that a long series of attributes and values have been used to describe the records that have been indexed in the system, all with their own semantics.

All these characteristics of the data can be leveraged to influence the scoring of the results for a filtered full text search query.

In the following tutorial, we will consider that we have the following set of data accessible via the Search endpoint:


 * 1) A series of datasets with information about people and organizations
 * 2) A series of ontologies that define thousands of concepts specific to a domain (healthcare in this example)
 * 3) A series of datasets with document records (healthcare related documents). Each of these records has been related to domain concepts using OSF Tagger (scones). There is a hierarchy of document types, and all the documents are related to people and organizations from the other datasets.

As you can see, the OSF Web Service instance behind that Search endpoint is rich in fully structured data.

One thing to note is that the context of a search query greatly influences how the boosting techniques are used. For example, a search query used to display a list of relevant articles on a page about pregnant women will be boosted quite differently from the same query used on a page about breast cancer.
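To make the role of context concrete, here is a minimal sketch in which the same set of documents is ranked with a different boost profile for each page. Everything in it is hypothetical: the document identifiers, the concept tags, the `CONTEXT_BOOSTS` table, the weights and the additive scoring are illustrative assumptions, not the scoring used by the Search endpoint.

```python
# Hypothetical documents, each tagged with domain concepts (illustration only).
DOCUMENTS = [
    {"id": "a", "concepts": {"pregnancy", "nutrition"}},
    {"id": "b", "concepts": {"breast-cancer", "screening"}},
    {"id": "c", "concepts": {"nutrition"}},
]

# One boost profile per page context: concept -> weight. The values are arbitrary.
CONTEXT_BOOSTS = {
    "pregnancy-page": {"pregnancy": 3.0, "nutrition": 1.5},
    "breast-cancer-page": {"breast-cancer": 3.0, "screening": 2.0},
}


def rank(documents, context):
    """Return document ids ordered by score under the boost profile of `context`."""
    boosts = CONTEXT_BOOSTS[context]

    def score(doc):
        # Base score of 1.0 plus the boost of every tagged concept found in the profile.
        return 1.0 + sum(boosts.get(concept, 0.0) for concept in doc["concepts"])

    return [doc["id"] for doc in sorted(documents, key=score, reverse=True)]


pregnancy_ranking = rank(DOCUMENTS, "pregnancy-page")
cancer_ranking = rank(DOCUMENTS, "breast-cancer-page")
```

Under these assumptions, document `a` ranks first on the pregnancy page and document `b` ranks first on the breast cancer page, even though the underlying set of documents is identical.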

Now, let's look at a few examples of how this structure can be used to improve the relevancy of the returned results.

Some Real Boosting Examples
In this article we will use the OSF Web Service PHP API to generate the search queries that will be sent to the Search endpoint.

Results
Here are the top 3 results for the query defined above. There were 22685 results, and these are the 3 most relevant according to this query:

Sorting and Boosting
Boosting modifies the score of each result within a set of results. From there, if you want the results with the highest scores at the top, or at the bottom, of the returned list, you have to sort the resultset by score in descending or ascending order.
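As a minimal sketch of that last step, the snippet below sorts an already-scored resultset on the client side. The `uri` and `score` field names and the values are made up for illustration; the sort parameter of the Search endpoint itself is not shown here.

```python
from operator import itemgetter

# Hypothetical scored results, in the arbitrary order they were produced.
results = [
    {"uri": "record-a", "score": 2.4},
    {"uri": "record-b", "score": 7.1},
    {"uri": "record-c", "score": 0.9},
]

# Descending order: the most relevant (highest score) result comes first.
highest_first = sorted(results, key=itemgetter("score"), reverse=True)

# Ascending order: the least relevant (lowest score) result comes first.
lowest_first = sorted(results, key=itemgetter("score"))
```

Without an explicit sort on the score, boosted results are not guaranteed to appear in relevancy order, which is why the two steps go together.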