Proxying Virtuoso Via Apache2

Introduction
Sometimes it is not possible to access the Virtuoso back-end (such as its Conductor or SPARQL user interface) via its default port 8890. Some setups only have the HTTP and SSH ports open, and this cannot be changed. In such a setup, Apache2 runs on port 80 and Virtuoso runs on another port (say, 8890), so all Virtuoso queries have to be proxied via Apache2 using mod-proxy and mod-proxy-html.

For this tutorial, we assume that the Apache2 package of your Linux distribution includes mod-proxy by default. If that is not the case, you will have to install mod-proxy first.

This tutorial covers installing mod-proxy-html only, and configuring mod-proxy and mod-proxy-html to proxy Virtuoso queries via Apache2.

Installing Mod-Proxy-Html on Ubuntu
Installing mod-proxy-html on Ubuntu is as simple as running this apt-get installation command:
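A minimal sketch of that command, assuming the module is shipped under the package name libapache2-mod-proxy-html in your Ubuntu release (check your repositories if the name is not found):

```shell
# Assumption: the package is named libapache2-mod-proxy-html on this release
sudo apt-get install libapache2-mod-proxy-html
```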

Installing Mod-Proxy-Html on any Linux Distribution
In this case, we have a Linux distribution whose pre-packaged version of Apache2 comes with mod-proxy but not with mod-proxy-html. This is the case with CentOS 5, for example.

In that case, you have no choice but to download the source and compile it.

First, go to the temp folder of your distribution:
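Assuming the temp folder is /tmp, as it is on most distributions:

```shell
# Assumption: /tmp is the temporary folder on this system
cd /tmp
```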

Then download the mod-proxy-html source code:
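The download location is not given here, so the sketch below uses a hypothetical placeholder variable, MOD_PROXY_HTML_URL, which you would set to the actual location of the source archive:

```shell
# MOD_PROXY_HTML_URL is a placeholder (assumption), not a real address:
# set it to the location of the mod-proxy-html source archive first.
wget "$MOD_PROXY_HTML_URL"
```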

Then unzip the source:
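A sketch of this step, assuming the download is a bzip2-compressed tarball named mod_proxy_html.tar.bz2 that unpacks into a mod_proxy_html directory (adjust both names to what you actually downloaded):

```shell
# Assumption: archive name and top-level directory are illustrative
tar xjf mod_proxy_html.tar.bz2
cd mod_proxy_html
```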

Then make sure you have the libxml2 development toolkit installed on your system. The library itself is installed on most Linux distributions, but the developer toolkit usually is not. Let's assume that you have access to the libxml2-devel RPM package and YUM:
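With YUM available, installing the libxml2-devel package would look like this (run as root):

```shell
# Installs the libxml2 headers needed to compile the module; run as root
yum install libxml2-devel
```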

The next step is to make sure that the Apache2 developer toolkit is installed on your server as well. We need the apxs tool to compile the module. Let's install it:
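On CentOS and other RPM-based distributions, apxs is normally provided by the httpd-devel package, so the installation would look like this (run as root):

```shell
# Assumption: apxs is shipped in httpd-devel on this distribution; run as root
yum install httpd-devel
```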

Then the next step is to compile the mod-proxy-html and the mod-xml2enc modules:
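A sketch of the two apxs calls, assuming the libxml2 headers live in /usr/include/libxml2 and that mod_proxy_html.c and mod_xml2enc.c are both in the current directory (-c compiles, -i installs, -I adds an include path):

```shell
# Assumptions: header path and source file locations as described above; run as root
apxs -I /usr/include/libxml2 -I . -i -c mod_proxy_html.c
apxs -I /usr/include/libxml2 -I . -i -c mod_xml2enc.c
```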

The apxs application will compile the module and will install it into the modules folder of your Apache2 server.

Configuring Apache2 For Proxying Virtuoso Queries
Now is the time to configure Apache2 to proxy Virtuoso queries.

First, make sure that all these modules get loaded in that order:

 LoadModule proxy_module modules/mod_proxy.so
 LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
 LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
 LoadModule proxy_http_module modules/mod_proxy_http.so
 LoadModule proxy_connect_module modules/mod_proxy_connect.so
 LoadFile /usr/lib/libxml2.so
 LoadModule proxy_html_module modules/mod_proxy_html.so
 LoadModule xml2enc_module modules/mod_xml2enc.so
 LoadModule headers_module modules/mod_headers.so

Now, let's configure the proxy in your Apache2 configuration file:

 # Disable global proxy
 ProxyRequests      Off

 # Pass original host to Virtuoso
 ProxyPreserveHost  On

 # Timeout waiting for Virtuoso
 ProxyTimeout       300

 # Set permission
 <Proxy *>
   Order deny,allow
   Allow from all
 </Proxy>

 # ProxyHTMLLogVerbose On
 # LogLevel Info
 # LogLevel Debug

 # Configuring the links that will be re-written
 ProxyHTMLLinks a               href
 ProxyHTMLLinks area            href
 ProxyHTMLLinks link            href
 ProxyHTMLLinks img             src longdesc usemap
 ProxyHTMLLinks object          classid codebase data usemap
 ProxyHTMLLinks q               cite
 ProxyHTMLLinks blockquote      cite
 ProxyHTMLLinks ins             cite
 ProxyHTMLLinks del             cite
 ProxyHTMLLinks form            action
 ProxyHTMLLinks input           src usemap
 ProxyHTMLLinks head            profile
 ProxyHTMLLinks base            href
 ProxyHTMLLinks script          src for

 ProxyHTMLEvents onclick ondblclick onmousedown onmouseup \
                 onmouseover onmousemove onmouseout onkeypress \
                 onkeydown onkeyup onfocus onblur onload \
                 onunload onsubmit onreset onselect onchange

 # Defining the URLs to proxy
 <Location /virtuoso/>
   ProxyPass              http://localhost:8890/
   ProxyPassReverse       /
   ProxyHTMLEnable        On
   ProxyHTMLURLMap        / /virtuoso/
   ProxyHTMLURLMap        http://localhost:8890/ /virtuoso/
 </Location>

 <Location /solr/>
   ProxyPass              http://localhost:8983/solr/
   ProxyPassReverse       /
   ProxyHTMLEnable        On
   ProxyHTMLURLMap        / /solr/
   ProxyHTMLURLMap        http://localhost:8983/ /solr/
 </Location>

This configuration forwards any request whose URL has the form "http://mydomain.com/virtuoso/" to the Virtuoso server located at "http://localhost:8890/", and then returns the results to the user.
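The forwarding rule can be pictured with a small shell sketch. The function name to_backend_url is hypothetical and exists only for illustration; Apache performs this mapping itself, and the sketch only mirrors the /virtuoso/ to http://localhost:8890/ correspondence described above:

```shell
# Hypothetical helper (illustration only): map a public path under /virtuoso/
# to the backend URL that the request is forwarded to.
to_backend_url() {
  case "$1" in
    /virtuoso/*) printf 'http://localhost:8890/%s\n' "${1#/virtuoso/}" ;;
    *)           printf '%s\n' "not proxied" ;;
  esac
}

to_backend_url /virtuoso/sparql   # prints http://localhost:8890/sparql
to_backend_url /index.html        # prints not proxied
```

Paths outside /virtuoso/ are left to Apache itself, which is why the second call reports that nothing is proxied.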

The important part here is the mod-proxy-html module. It makes sure that when you click on a link within the proxied web page, that link also points to the same server. To do so, the module rewrites the links within the proxied HTML page so that they are consistent with the ProxyHTMLURLMap rules.
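To get a feel for the effect of the two ProxyHTMLURLMap rules for /virtuoso/, here is a rough simulation with sed. It is only an illustration under simplifying assumptions: the function name rewrite_links is hypothetical, only href attributes are handled, and the real module parses the HTML rather than doing text substitution:

```shell
# Illustration only: mimic the two URL map rules on href attributes.
# Rule 1: "/"                      -> "/virtuoso/"
# Rule 2: "http://localhost:8890/" -> "/virtuoso/"
rewrite_links() {
  sed -e 's|href="/|href="/virtuoso/|g' \
      -e 's|href="http://localhost:8890/|href="/virtuoso/|g'
}

printf '%s\n' '<a href="/sparql">SPARQL</a> <a href="http://localhost:8890/conductor">Conductor</a>' | rewrite_links
# prints: <a href="/virtuoso/sparql">SPARQL</a> <a href="/virtuoso/conductor">Conductor</a>
```

Both the server-relative link and the absolute back-end link end up under /virtuoso/, so the browser keeps talking to Apache on port 80 instead of trying to reach port 8890 directly.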