Project 1: Dokuwiki

In this project you will deliver a Terraform configuration and an Ansible playbook that together create a DokuWiki installation on GCP. Your code will be deployed on an empty GCP project, and after it runs a client will be able to visit their new wiki using an IP address (we’ll do host names later in the semester).

There are many reasons why a person or a team would want an instance of Dokuwiki. One real-world example is my colleague Jeff Bergamini’s website. Dokuwiki is a very quick and effective way to publish your own content, with the ability to limit access to registered users and extend functionality with plugins. Services like this can be sold in the Google Cloud Marketplace, or you could build your own business reselling hosted Dokuwiki to customers.

Getting Started

You should do all of your work in the cis-91 git repository. The base configuration is the best way to start. Make a copy in your repository:

$ mkdir dokuwiki
$ cp -r base dokuwiki/

Use Git

You should be committing and pushing your code regularly. When you get something working, even a small thing, stop, commit and push it. That gives you a history of changes that is useful when you want to restore something that was lost. If you want to ask me about your code over email or text, it’s best if I can see it in your repo on GitHub.

About Dokuwiki

Dokuwiki is a simple wiki platform written in PHP that doesn’t require a database. Not needing a database is significant because managing a DBMS adds considerable complexity to a system. Instead, Dokuwiki stores its content in a simple file structure on the web server. The trade-off is that while Dokuwiki is simple to set up, it won’t scale beyond a single instance.

Installation of Dokuwiki is simple: get the latest version from this link and untar it into the /var/www/html directory of your web server (Hint: use Ansible’s unarchive module to do that). Keep in mind that the www-data user must have write permission to parts of the untarred structure. To run on an Ubuntu VM, these packages have to be installed:

  1. apache2

  2. php

  3. php-xml
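The unarchive hint above comes down to two things: the files have to land where Apache serves them, and www-data has to be able to write to them. Here is a minimal shell sketch of that step for experimenting by hand before you write the Ansible task. The function name and its arguments are my own, and it assumes the tarball contains a single top-level directory, which is stripped so that doku.php sits directly in the destination.

```shell
#!/bin/sh
# install_wiki TARBALL DEST [OWNER]
# Extract a DokuWiki tarball into DEST, dropping the archive's top-level
# directory so that doku.php ends up directly inside DEST.
# OWNER is optional; on a real server you would pass www-data (as root).
install_wiki() {
    tarball=$1
    dest=$2
    owner=${3:-}

    mkdir -p "$dest" || return 1
    # Assumption: the archive has exactly one top-level directory.
    tar -xzf "$tarball" -C "$dest" --strip-components=1 || return 1

    if [ -n "$owner" ]; then
        chown -R "$owner:$owner" "$dest" || return 1
    fi
}

# Example (on the VM, as root; /tmp/dokuwiki.tgz is a placeholder path):
#   apt-get install -y apache2 php php-xml
#   install_wiki /tmp/dokuwiki.tgz /var/www/html www-data
```

Giving www-data the whole tree is the blunt option; narrowing ownership to only the directories Dokuwiki writes to is a reasonable refinement.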

Application Structure

The picture shows a simplified summary of the resources in the project:

[Figure: Application overview]

Compute

The deployment is based on a single instance. The instance size should be configurable to accommodate wikis with different traffic requirements (or to scale as needed). You can run any Linux distribution you like on the instance, as long as you adjust your playbooks accordingly.

Storage

The VM should have two persistent disks attached.

Disk    Mount Point  Notes
system  /            The default system disk for your distribution.
data    /var/www     Dokuwiki data

The data disk can be formatted with any filesystem you like (a good default is ext4).
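To make the storage layout concrete, here is a hedged sketch of preparing the data disk by hand. It assumes the disk was attached with the device name data (my choice, not a requirement), which on GCP makes it visible as /dev/disk/by-id/google-data. The function names are illustrative as well.

```shell
#!/bin/sh
# disk_path DEVICE_NAME: stable path GCP gives an attached persistent disk.
disk_path() {
    printf '/dev/disk/by-id/google-%s\n' "$1"
}

# fstab_line DEVICE_NAME MOUNT_POINT [FSTYPE]: print an /etc/fstab entry.
# nofail lets the VM finish booting even if the disk is missing.
fstab_line() {
    printf '%s %s %s defaults,nofail 0 2\n' \
        "$(disk_path "$1")" "$2" "${3:-ext4}"
}

# Example (as root, on a NEW empty disk only; mkfs erases existing data):
#   mkfs.ext4 "$(disk_path data)"
#   mkdir -p /var/www
#   fstab_line data /var/www >> /etc/fstab
#   mount /var/www
```

Mounting a disk on /var/www hides whatever was already in that directory, so the disk should be mounted before Dokuwiki is unpacked.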

Backups

The deployment should create a Cloud Storage Bucket. Once an hour the instance should make a tar file from the contents of /var/www and copy it to the bucket. That’s a very frequent backup! In order to conserve space the bucket should be set to keep files for six months from their creation. Files older than six months should be deleted automatically.
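The retention rule can be expressed as a bucket lifecycle configuration. The sketch below writes one in the JSON format that gsutil lifecycle set accepts. It approximates six months as 180 days, which is my assumption, and the function and file names are placeholders. In the project itself the same rule would be declared in your Terraform configuration.

```shell
#!/bin/sh
# write_lifecycle FILE [DAYS]: write a lifecycle config that deletes objects
# once they are older than DAYS days (default 180, roughly six months).
write_lifecycle() {
    days=${2:-180}
    cat > "$1" <<EOF
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": $days}
    }
  ]
}
EOF
}

# Example (file and bucket names are placeholders):
#   write_lifecycle /tmp/lifecycle.json
#   gsutil lifecycle set /tmp/lifecycle.json gs://your-bucket-id
```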

Here’s a shell script that will perform the backup and upload it to a bucket:

#!/bin/sh
# Put this in /etc/cron.hourly/backup
TARGET="gs://your-bucket-id"

# The epoch timestamp makes every run produce a new file name
tar_file=/tmp/dokuwiki-backup-$(date +%s).tar.gz
tar -czf "$tar_file" /var/www/html 2>/dev/null
gsutil cp "$tar_file" "$TARGET"
rm -f "$tar_file"

Making that script executable in /etc/cron.hourly/backup will cause it to be run every hour. The script creates a timestamp for each file so that every new run will create a new file.

IAM

Create a service account and assign it to the instance. The gcloud and gsutil commands will not work on the instance without an assigned service account. The service account has to be allowed to write to cloud storage buckets. The role roles/storage.objectAdmin gives the instance suitable access to run any gsutil command. Consider limiting access to make the application more secure.
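As one way to limit access: the backup script only ever creates new objects, so a narrower role than roles/storage.objectAdmin may be enough. The sketch below uses the gcloud and gsutil CLIs and assumes that roles/storage.objectCreator on the single backup bucket suffices. The account, project and bucket names are placeholders, and in the project itself the same resources would be declared in Terraform.

```shell
#!/bin/sh
# sa_email NAME PROJECT: email address of a user-managed service account.
sa_email() {
    printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# grant_backup_access NAME PROJECT BUCKET
# Create the service account and let it create objects in one bucket only.
# Needs an authenticated gcloud/gsutil; nothing runs until you call it.
grant_backup_access() {
    gcloud iam service-accounts create "$1" --project "$2" \
        --display-name "DokuWiki instance" || return 1
    gsutil iam ch "serviceAccount:$(sa_email "$1" "$2"):objectCreator" "gs://$3"
}

# Example (all three names are placeholders):
#   grant_backup_access wiki-sa my-project your-bucket-id
```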

Monitoring

The application should be monitored for uptime. You should know about application downtime before your customers do (or at least at the same time). When you’re in business you’ll have to guarantee a certain percentage of uptime (e.g. 99.9%) and should know whether you’re on target to meet your guarantees. Monitor your application with an uptime check that loads the /doku.php page every minute.
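To put the 99.9% figure in perspective, the first helper below converts an uptime percentage into a downtime budget. The second does by hand what an uptime check does: it requests /doku.php and looks at the status code. Both function names are illustrative, and the 30-day window is an assumption.

```shell
#!/bin/sh
# downtime_budget PERCENT [DAYS]: minutes of downtime allowed over DAYS days
# (default 30) while still meeting an uptime target of PERCENT.
downtime_budget() {
    awk -v pct="$1" -v days="${2:-30}" \
        'BEGIN { printf "%.1f\n", days * 24 * 60 * (100 - pct) / 100 }'
}

# wiki_is_up HOST: succeed only if http://HOST/doku.php answers with HTTP 200.
wiki_is_up() {
    code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "http://$1/doku.php")
    [ "$code" = "200" ]
}

# Example:
#   downtime_budget 99.9      # prints 43.2
#   wiki_is_up 203.0.113.10   # placeholder address from a documentation range
```

At 99.9%, a 30-day month leaves about 43 minutes of downtime in total, which is why a one-minute check interval matters.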

Logging

The application runs on a single VM. You need visibility into the VM so you can see if the VM is properly sized (not too big, not too small) and that there are no errors on the VM. Install the Cloud Logging agent so that the VM reports its logging data to the Logging tool.

Firewall

The firewall should enable:

  1. SSH (port 22) so you can connect to the instance

  2. HTTP (port 80) so the user has access to the wiki

No other ports should be available.
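For comparison with whatever you write in Terraform, the two rules above look roughly like this with the gcloud CLI. This is a sketch only: the rule and network names are placeholders, and opening both ports to 0.0.0.0/0 is my assumption about what "the user has access" means.

```shell
#!/bin/sh
# allow_spec PORT...: build the value for --allow, e.g. "tcp:22,tcp:80".
allow_spec() {
    spec=""
    for port in "$@"; do
        spec="${spec:+$spec,}tcp:$port"
    done
    printf '%s\n' "$spec"
}

# create_wiki_firewall RULE_NAME NETWORK
# Open SSH and HTTP on NETWORK to any source address.
# Needs an authenticated gcloud; nothing runs until you call it.
create_wiki_firewall() {
    gcloud compute firewall-rules create "$1" --network "$2" \
        --allow "$(allow_spec 22 80)" --source-ranges 0.0.0.0/0
}

# Example (both names are placeholders):
#   create_wiki_firewall wiki-allow-ssh-http wiki-network
```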

Other Requirements

  1. The Terraform configuration should have variables for:

    1. The GCP project

    2. Region (default: us-central1)

    3. Zone (default: us-central1-c)

    4. Instance type (default: e2-micro)

  2. Output the IP address from Terraform

  3. Use the latest version of Dokuwiki
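Once the variables and the output exist, a deployment might be driven from the shell as sketched below. The variable names (project, region, zone, machine_type) and the output name external_ip are hypothetical; substitute whatever your configuration declares. The defaults mirror the list above.

```shell
#!/bin/sh
# tf_var_args PROJECT [REGION] [ZONE] [MACHINE_TYPE]
# Print -var arguments for terraform, falling back to the listed defaults.
# The variable names used here are hypothetical.
tf_var_args() {
    region=${2:-us-central1}
    zone=${3:-us-central1-c}
    machine=${4:-e2-micro}
    printf '%s\n' \
        "-var project=$1 -var region=$region -var zone=$zone -var machine_type=$machine"
}

# Example (my-project and external_ip are placeholders):
#   terraform apply $(tf_var_args my-project)
#   terraform output -raw external_ip
```

The command substitution is left unquoted on purpose so that the shell splits it into separate arguments.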