Proxying Virtuoso Via Apache2

From OSF Wiki

Introduction

Sometimes it may not be possible to access the Virtuoso back-end (such as its Conductor or SPARQL user interface) via its default port 8890. Some setups expose only the HTTP and SSH ports, and this cannot be changed. If Apache2 is running on port 80 and Virtuoso is running on another port (let's say 8890), you can proxy all Virtuoso queries via Apache2 using mod-proxy and mod-proxy-html.

For this tutorial, we assume that the Apache2 package of your Linux distribution includes mod-proxy by default. If that is not the case, you will have to install mod-proxy first.

This tutorial covers installing mod-proxy-html only, and then configuring mod-proxy and mod-proxy-html to proxy Virtuoso queries via Apache2.

Installing Mod-Proxy-Html on Ubuntu

Installing mod-proxy-html on Ubuntu is as simple as running this apt-get installation command:

apt-get install libapache2-mod-proxy-html
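On Ubuntu, Apache2 modules are usually switched on with the a2enmod helper rather than by writing LoadModule lines by hand. The following is a minimal sketch, assuming the modules are registered on your system under the names proxy, proxy_http and proxy_html. The function name enable_proxy_modules is ours and only illustrative:

```shell
# Sketch: enable the proxy modules on Ubuntu (run as root).
# Assumption: the modules are registered under these names on your system.
enable_proxy_modules() {
    for mod in proxy proxy_http proxy_html; do
        # Stop at the first module that cannot be enabled
        a2enmod "$mod" || return 1
    done
}
```

After calling enable_proxy_modules, restart Apache2 (for example with: service apache2 restart) so that the modules get loaded.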

Installing Mod-Proxy-Html on any Linux Distribution

In this case, we have a Linux distribution whose pre-packaged version of Apache2 comes with mod-proxy, but not with mod-proxy-html. This is the case with CentOS 5, for example.

You then have no choice other than to download the source and compile it.

First, go to the temp folder of your distribution:

cd /tmp

Then download the mod-proxy-html source code:

wget http://apache.webthing.com/mod_proxy_html/mod_proxy_html.zip

Then unzip the source and move into the extracted folder:

unzip mod_proxy_html.zip
cd mod_proxy_html

Then make sure that the libxml2 development toolkit is installed on your system. The libxml2 library itself is installed on most Linux distributions, but the development toolkit usually is not. Let's assume that you have access to the libxml2-devel RPM package and YUM:

yum install libxml2-devel

The next step is to make sure that the Apache2 development toolkit is installed on your server as well. We need the apxs tool to compile the modules. Let's install it:

yum install httpd-devel

The next step is to compile the mod-proxy-html and mod-xml2enc modules:

apxs -c -I/usr/include/libxml2 -I. -i mod_proxy_html.c
apxs -c -I/usr/include/libxml2 -I. -i mod_xml2enc.c

The apxs tool will compile each module and install it into the modules folder of your Apache2 server.
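To double-check the result, you can look for the two compiled modules in the folder that apxs reports as its module directory (apxs -q LIBEXECDIR). This is a minimal sketch; the function name check_modules_installed is hypothetical:

```shell
# Sketch: report whether both compiled modules exist in a given folder.
# Returns 0 when both files are present, 1 otherwise.
check_modules_installed() {
    dir=$1
    missing=0
    for so in mod_proxy_html.so mod_xml2enc.so; do
        if [ -f "$dir/$so" ]; then
            echo "found: $dir/$so"
        else
            echo "missing: $dir/$so"
            missing=1
        fi
    done
    return "$missing"
}
```

You would typically call it like this: check_modules_installed "$(apxs -q LIBEXECDIR)"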

Configuring Apache2 For Proxying Virtuoso Queries

Now is the time to configure Apache2 to proxy Virtuoso queries.

First, make sure that all these modules get loaded, in this order:

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_connect_module modules/mod_proxy_connect.so
LoadFile   /usr/lib/libxml2.so
LoadModule proxy_html_module modules/mod_proxy_html.so
LoadModule xml2enc_module modules/mod_xml2enc.so
LoadModule headers_module    modules/mod_headers.so
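After adding these lines, you can ask Apache2 to list the modules it loads and check that none of the ones this tutorial relies on is missing. The sketch below assumes that your Apache2 binary is called httpd and supports the -M option (which lists the loaded modules); missing_modules is a hypothetical helper name:

```shell
# Sketch: read a module listing on stdin (for example the output of
# "httpd -M") and print every module needed here that is not in it.
missing_modules() {
    loaded=$(cat)
    for mod in proxy_module proxy_http_module proxy_html_module \
               xml2enc_module headers_module; do
        # -w matches whole words only, -q suppresses grep's own output
        printf '%s\n' "$loaded" | grep -qw "$mod" || echo "$mod"
    done
}
```

Running httpd -M 2>&1 | missing_modules should print nothing when everything is loaded. The command apachectl configtest can also be used to check the syntax of the configuration before restarting.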

Now, let's configure the proxy in your Apache2 configuration file:

#  Disable global proxy
ProxyRequests       Off

#  Pass original host to Virtuoso
ProxyPreserveHost   On

#  Timeout waiting for Virtuoso
ProxyTimeout        300

#  Set permission
<Proxy *>
   Order deny,allow
   Allow from all
</Proxy>

#ProxyHTMLLogVerbose On
#LogLevel Info
#LogLevel Debug

# Configuring the links that will be re-written
ProxyHTMLLinks  a               href
ProxyHTMLLinks  area            href
ProxyHTMLLinks  link            href
ProxyHTMLLinks  img             src longdesc usemap
ProxyHTMLLinks  object          classid codebase data usemap
ProxyHTMLLinks  q               cite
ProxyHTMLLinks  blockquote      cite
ProxyHTMLLinks  ins             cite
ProxyHTMLLinks  del             cite
ProxyHTMLLinks  form            action
ProxyHTMLLinks  input           src usemap
ProxyHTMLLinks  head            profile
ProxyHTMLLinks  base            href
ProxyHTMLLinks  script          src for

ProxyHTMLEvents onclick ondblclick onmousedown onmouseup \
                onmouseover onmousemove onmouseout onkeypress \
                onkeydown onkeyup onfocus onblur onload \
                onunload onsubmit onreset onselect onchange


# Defining the URLs to proxy
<Location /virtuoso/>
   ProxyPass               http://localhost:8890/
   ProxyPassReverse        /
   ProxyHTMLEnable         On
   ProxyHTMLURLMap         / /virtuoso/
   ProxyHTMLURLMap         http://localhost:8890/ /virtuoso/
</Location>

<Location /solr/>
   ProxyPass               http://localhost:8983/solr/
   ProxyPassReverse        /
   ProxyHTMLEnable         On
   ProxyHTMLURLMap         / /solr/
   ProxyHTMLURLMap         http://localhost:8983/ /solr/
</Location>


This configuration forwards any query whose URL starts with "http://mydomain.com/virtuoso/" to the Virtuoso server located at "http://localhost:8890/", and then returns the results to the user.
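Once Apache2 has been restarted, a quick way to test the proxy is to request a proxied URL and look at the HTTP status code. This is a minimal sketch, assuming curl is installed; check_proxy is a hypothetical helper name, and the /sparql path in the usage line assumes that the Virtuoso SPARQL endpoint is enabled on your server:

```shell
# Sketch: fetch a URL through the proxy and report the HTTP status.
# Returns 0 only when the status code is 200.
check_proxy() {
    url=$1
    # -s: silent, -o /dev/null: discard the body, -w: print only the status
    status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    echo "$url -> HTTP $status"
    [ "$status" = "200" ]
}
```

For example: check_proxy http://mydomain.com/virtuoso/sparql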

The important piece here is the mod-proxy-html module. It makes sure that when you click a link within a proxied web page, that link still points to the same server. To do so, the module rewrites the links within the proxied HTML page so that they are consistent with the ProxyHTMLURLMap rules.
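To make the effect of the two ProxyHTMLURLMap rules of the /virtuoso/ location more concrete, here is a small shell emulation of what happens to a single link. It is only an illustration of the prefix mapping under those two rules, not the way the module is implemented; rewrite_link is a hypothetical name:

```shell
# Sketch: emulate the two ProxyHTMLURLMap rules of the /virtuoso/ location.
# A link is rewritten when it starts with one of the mapped prefixes;
# any other link is left untouched.
rewrite_link() {
    url=$1
    case "$url" in
        http://localhost:8890/*)
            # ProxyHTMLURLMap http://localhost:8890/ /virtuoso/
            printf '%s\n' "/virtuoso/${url#http://localhost:8890/}" ;;
        /*)
            # ProxyHTMLURLMap / /virtuoso/
            printf '%s\n' "/virtuoso/${url#/}" ;;
        *)
            printf '%s\n' "$url" ;;
    esac
}
```

With these rules, rewrite_link /sparql gives /virtuoso/sparql, and rewrite_link http://localhost:8890/conductor gives /virtuoso/conductor, so both links keep going through Apache2.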
