Setting-up GENODESY

infrastructure
Sharing experience regarding the creation of the GENODESY website and related web services
Author

Patrice Godard

Published

November 2, 2022

Website construction illustration adapted from https://commons.wikimedia.org/wiki/File:Build-website.jpg

After having bought the domain name a few years ago, I’ve finally decided to set up the GENODESY website and related resources. Here, I share my experience of this process: the tools and services I relied on and the design and technical choices I made.

My intent was to set up not only a website and a blog, but also services such as Shiny applications and Neo4j and ClickHouse databases. That’s why I chose to rent a dedicated server that I configured and administer myself. I spent time tuning the configuration of the different services to make them as accessible as possible to users.

1 Hosting the services

There are many companies providing capabilities for hosting web pages and web services. Here, I chose OVHCloud, a French cloud computing company. I went for the first option in the Kimsufi range of dedicated servers:

  • CPU: Intel Xeon E3-1245v2 - 4c/8t - 3.4 GHz/3.8 GHz
  • Memory: 32 GB DDR3
  • Storage: 3 x 2 TB HDD SATA Soft RAID
  • OS: Ubuntu Server 22.04 LTS “Jammy Jellyfish”

I also bought the genodesy domain names (.org, .com, .net, .info and .biz) from OVHCloud, and it was easy to map genodesy.org to the server IP address in the DNS (Domain Name System) using their configuration tool.
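Once the DNS record is in place, the mapping can be checked from the command line. The following is a minimal sketch, assuming that `dig` is installed (package dnsutils on Ubuntu). The function name `resolves_to` is made up for this example and 203.0.113.10 is a placeholder from the documentation address range, not the actual server address.

```shell
# Check that a host name has an A record pointing to the expected IPv4 address.
# Assumption: `dig` is available (package dnsutils on Ubuntu).
resolves_to() {
    # $1: host name, $2: expected IPv4 address
    # `dig +short` prints one record value per line; look for an exact match
    dig +short "$1" A | grep -qxF "$2"
}

# Usage (203.0.113.10 is a placeholder, not the actual server address):
# resolves_to genodesy.org 203.0.113.10 && echo "DNS OK" || echo "DNS mismatch"
```

DNS changes can take some time to propagate, so a mismatch right after editing the record does not necessarily indicate a configuration error.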

2 Web server and reverse proxy

2.1 NGINX configuration

I chose the NGINX web server to make the website available. NGINX can also be used as a reverse proxy, and I took advantage of this feature to provide other services.

I followed several online tutorials to get familiar with NGINX configuration.

I configured the genodesy.org server block to set up the following behaviors:

  • redirect http (80) queries to the https (443) port
  • provide services through the https (443) port using a valid SSL (Secure Sockets Layer) certificate
  • proxy pass queries on specific paths to services listening on other ports (e.g. shiny or neo4j)

Here is the server block configuration file I ended up writing:

/etc/nginx/sites-available/genodesy.org

# Queries to the http (80) port are redirected to the https (443) port
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name genodesy.org www.genodesy.org;
    
    location / {
        return 301 https://$server_name$request_uri;
    }
    
    root /var/www/genodesy.org/html;
    index index.html;
    
    # The following folder is used to verify the domain of issued SSL
    # certificate and therefore is still accessible through the http (80) port.
    location /.well-known/pki-validation/ {
        try_files $uri $uri/ =404;
    }
}

# Services are provided through the https (443) port
server {
    listen 443 ssl http2 default_server;
    listen [::]:443 ssl http2 default_server;

    # The files included below are used for SSL configuration
    include snippets/ssl-for-free.conf;
    include snippets/ssl-params.conf;

    root /var/www/genodesy.org/html;
    index index.html index.htm index.nginx-debian.html;

    error_page 404 /404.html;

    server_name genodesy.org www.genodesy.org;

    # The main web server
    location / {
        try_files $uri $uri/ =404;
    }

    # Reverse proxy to the BED Neo4j database hosted on port 5454
    location /BED {
        rewrite ^/BED/?(.*)$ /$1 break;
        proxy_pass http://genodesy.org:5454/browser;
        proxy_redirect / $scheme://$http_host/BED;
    }

    # Reverse proxy to Shiny hosted on port 3838
    rewrite ^/shiny$ $scheme://$http_host/shiny/ permanent;
    location /shiny/ {
        rewrite ^/shiny/(.*)$ /$1 break;
        proxy_pass http://genodesy.org:3838;
        proxy_redirect / $scheme://$http_host/shiny/;
    }
}
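The redirection set up in this configuration can be checked from any machine with `curl`. The sketch below is only an illustration: the helper names `probe` and `redirects_to_https` are made up for this example, and the check assumes that the redirect target is built from the first name listed in `server_name` (genodesy.org).

```shell
# Print the HTTP status code and the redirect target (if any) for a URL,
# without following the redirect.
probe() {
    curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$1"
}

# Succeed if plain http requests to the host given as $1 are answered
# with a 301 pointing to the https URL of the same host.
redirects_to_https() {
    result="$(probe "http://$1/")"
    [ "$result" = "301 https://$1/" ]
}

# Usage:
# redirects_to_https genodesy.org && echo "redirect OK"
# probe https://genodesy.org/shiny/   # status of the proxied Shiny path
```

On the server itself, running `sudo nginx -t` before reloading NGINX reports syntax errors in the configuration files.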

2.2 SSL certificate

I created a free valid SSL certificate on SSL For Free.

Certificate chain file in NGINX

In NGINX, the ca_bundle.crt certificate chain file must be appended to the certificate.crt file so that the certificate is recognized as valid and ZeroSSL can verify the domain. (tip found here)
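The concatenation can be sketched as follows. The function name `build_chain` and the output file name fullchain.crt are assumptions made for this example. The resulting file is the one that the NGINX `ssl_certificate` directive should point to (presumably from snippets/ssl-for-free.conf in the configuration shown above).

```shell
# Concatenate the domain certificate and the CA bundle into one chain file.
# `awk 1` prints every line followed by a newline, so the two PEM blocks
# stay on separate lines even if the first file lacks a trailing newline.
build_chain() {
    # $1: domain certificate (e.g. certificate.crt)
    # $2: CA bundle (e.g. ca_bundle.crt)
    # $3: output file (hypothetical name, e.g. fullchain.crt);
    #     it must differ from both input files
    awk 1 "$1" "$2" > "$3"
}

# Usage:
# build_chain certificate.crt ca_bundle.crt fullchain.crt
# openssl x509 -in fullchain.crt -noout -subject   # subject of the first certificate
```

The order matters: the domain certificate comes first, followed by the chain.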

2.3 Firewall

The Uncomplicated Firewall (UFW) was used to allow connections only to specific ports supporting protocols of interest: ssh (22), http (80) and https (443).

sudo apt-get install ufw
# Allow ssh first so that enabling the firewall cannot cut the remote session
sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
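To confirm that only the intended ports are open, the rules reported by `ufw status` can be summarized. This is a small sketch, assuming the usual tabular output of `ufw status`. The function name `allowed_ports` is made up for this example.

```shell
# List the distinct ports that have an ALLOW rule, one per line, sorted
# numerically. Reads the text printed by `ufw status` on standard input.
allowed_ports() {
    # The action is in column 2 for IPv4 rules ("22/tcp ALLOW Anywhere")
    # and in column 3 for IPv6 rules ("22/tcp (v6) ALLOW Anywhere (v6)")
    awk '$2 == "ALLOW" || $3 == "ALLOW" { split($1, p, "/"); print p[1] }' | sort -un
}

# Usage:
# sudo ufw status | allowed_ports
```

With the rules above, the list should contain only 22, 80 and 443.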

3 Creating and managing the website

3.1 Implementation

As an R user and a big fan of tools developed by the Posit PBC team, I rely on Quarto to create and manage the content of this website.
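A possible way to publish a Quarto website on such a server is to render it locally and copy the output to the NGINX document root. The sketch below is not necessarily the workflow used for this site: it assumes the default `_site` output folder of Quarto website projects, ssh access for a hypothetical `deploy` user, and the document root declared in the NGINX configuration (/var/www/genodesy.org/html). The function name `publish_site` is made up for this example.

```shell
# Render the Quarto project and copy the result to the NGINX document root.
# Assumptions: run from the project directory, default `_site` output folder,
# and ssh access for a hypothetical `deploy` user.
publish_site() {
    remote="${1:-deploy@genodesy.org}"
    quarto render || return 1
    # Trailing slashes make rsync copy the folder content, not the folder itself.
    # No --delete option: files that only exist on the server, such as the
    # .well-known folder used for certificate validation, are left untouched.
    rsync -av _site/ "$remote:/var/www/genodesy.org/html/"
}

# Usage:
# publish_site                      # default hypothetical remote
# publish_site me@genodesy.org      # another (hypothetical) ssh account
```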

There are many documentation resources about the Quarto publishing system, starting with the extensive official guide. Quarto has a lot of features and I don’t intend (to even try) to list all of them here. Instead, here are a few points to which I paid particular attention or that I found particularly handy:

3.2 Content organization

Currently, the GENODESY website is organized in the following four main parts. As this site is pretty new, it will hopefully grow and evolve.

  • Home: this part is empty for the moment. My intent is to point to the different contents and resources offered on this site.

  • Services: applications, databases or APIs made available on this site.

  • Author: who I am and what I’ve done.

  • Blog: standalone topics addressing more or less technical issues or experiences. I moved this blog to my personal website, which I think is a more relevant place for it.

4 Other services

One of the advantages of using a reverse proxy such as NGINX is that we can focus on installing services without having to worry too much about how to secure their access. I’ve shown above how I configured NGINX to provide access to a Neo4j API and to Shiny applications through SSL encryption. The paragraphs below explain how the services themselves were installed and can be used.

4.1 BED Neo4j database

BED (Biological Entity Dictionary) is an R package to get and explore mapping between identifiers of biological entities (BE). It relies on a Neo4j database in which the relationships between the identifiers from different sources are recorded. I made an instance of the BED Neo4j database available on this server to allow people to easily test the package and its capabilities.

The BED Neo4j database was installed as a Docker container, using the S05-BED-Container.sh script.

Access to port 5454 is denied by the firewall, but NGINX provides secure access to this resource via this path: https://genodesy.org/BED/. Therefore, external users can connect to the BED database in R as follows:

library(BED)
connectToBed(url="https://genodesy.org/BED/")
https://genodesy.org/BED/
BED
UCB-Human
2024.01.14
Cache OFF

4.2 Shiny server

Shiny is an R package to build interactive web apps straight from R. These apps can be easily deployed on services such as shinyapps.io. It is also possible to host them within a Shiny Server instance.

The Shiny Server was simply installed following the instructions provided in the official administrator’s guide.

Again, access to the Shiny Server port (3838) is denied by the firewall, but NGINX provides secure access to this resource via this path: https://genodesy.org/shiny/. Here is an example of a Shiny app hosted on genodesy.org: https://genodesy.org/shiny/ReDaMoR.
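The combined effect of the firewall and the reverse proxy can be checked from a machine other than the server: the backend ports (5454 for Neo4j and 3838 for Shiny in the NGINX configuration above) should not be reachable directly, whereas port 443 should be. The following is a rough sketch, assuming that `nc` (netcat) is installed. The function names are made up for this example.

```shell
# Succeed if a TCP connection to host $1 on port $2 can be opened
# within 5 seconds. Assumption: `nc` (netcat) is installed.
port_reachable() {
    nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# Report, for each port given after the host name, whether it is reachable.
report_ports() {
    host="$1"
    shift
    for port in "$@"; do
        if port_reachable "$host" "$port"; then
            echo "$port reachable"
        else
            echo "$port not reachable"
        fi
    done
}

# Usage, from a machine other than the server:
# report_ports genodesy.org 443 5454 3838
```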

5 License

I chose to release the content of this website under a Creative Commons Attribution-ShareAlike 4.0 International License.