Building a continuous integration workflow with gitlab and openshift 3

In this post I’ll go over building and testing a Docker image with gitlab CI and then pushing that image to Openshift 3. It should be somewhat helpful for people using other Docker solutions like Kubernetes too or CI solutions like Jenkins. I’m using Django for the project with some front end assets built in node.

Our goal is to have one docker environment used in development, CI, staging, and production. We’ll avoid repeating ourselves with image building.

At a high level my workflow looks like

Screenshot from 2016-04-01 15-12-12

Local development

All local development happens with docker compose. There is plenty of info on the matter so I’ll skip most of this. I will point out that I want to use the same python based docker image for development and later in production.

Continuous Integration

I’m using gitlab and gitlab CI runner to do testing and build a docker image. Gitlab has some docs on how to build a docker image. The choices are shell and docker-in-docker. I found docker-in-docker to be slow, complex, and error prone. In theory it would be better since the environments are more isolated.

Gitlab CI building images with shell executor

Here is my full .gitlab-ci.yml file for reference.

The goal here is to build the image, run unit tests, deploy on success, and always clean up. I had to run gitlab ci runner as root otherwise I would get permission errors. 🙁

In a non trivial CI system, shell can get messy too. We need to be concerned about building too many images and filling all disk space, exhausting the number of docker network subnet pools, and ensuring concurrency works (if you need that).

Disk space – I suggest using a service that lets you attach a large volume that is formatted with a lot of inodes and using Docker’s overlayfs storage engine. I used AWS’s EC2 with a 120 GB mount for /var/docker. See this blog post for details. Pay attention to the part where you define inodes. I went with 16568256.

Docker clean up – Gitlab has a docker image that can help clean up docker images and containers for you here. I’d also consider restarting the server at night and running your own clean up scripts too. I also place a CI cleanup stage like

stages:
  - build
  - test
  - deploy
  - clean

variables:
  COMPOSE: docker-compose -p myproject$CI_BUILD_REF

# job name assumed
clean:
  stage: clean
  when: always
  script:
    - $COMPOSE stop
    - $COMPOSE down
    - $COMPOSE rm -f
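
The "deploy on success, always clean up" ordering can be sketched in plain Python. This only illustrates the control flow that `when: always` asks for; it is not how GitLab CI is implemented, and `run_pipeline` and the stage callables are hypothetical names.

```python
def run_pipeline(stages, cleanup):
    """Run (name, callable) stages in order and stop at the first failure.

    Each callable returns an exit code, where 0 means success.
    `cleanup` runs no matter what, like a job marked `when: always`.
    Returns (names of completed stages, name of failed stage or None).
    """
    completed = []
    try:
        for name, stage in stages:
            if stage() != 0:
                return completed, name
            completed.append(name)
        return completed, None
    finally:
        cleanup()


# Simulated run: tests fail, so deploy is skipped but cleanup still happens.
cleanup_log = []
result = run_pipeline(
    [("build", lambda: 0), ("test", lambda: 1), ("deploy", lambda: 0)],
    cleanup=lambda: cleanup_log.append("cleaned"),
)
print(result)       # (['build'], 'test')
print(cleanup_log)  # ['cleaned']
```

The `try`/`finally` is the whole point: a failing stage returns early, yet the cleanup callable still runs.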

I’ve been using docker since 1.0 and I’m still always amazed by how it finds new ways of breaking itself. You may need to add your own hacks to seek and destroy docker images and containers that will want to build up forever.

The $CI_BUILD_REF is to ensure each docker image is unique – this allows us to run multiple builds and have some certainty the image being tested is the one being pushed to docker hub.

The test stages are rather django/node specific. Just place whatever code needs to execute to run tests here. If it gets a success exit code gitlab CI will know it passed.

Pushing to docker hub – I’m tagging my tested image, pushing it to docker hub, and running a webhook to notify openshift to automatically pull the image and deploy it to staging.

# job name assumed
deploy_qa:
  stage: deploy
  only:
    - qa
  script:
    - echo "Tag and push ${PROJECT_NAME}_web"
    - docker tag ${PROJECT_NAME}_web ${IMAGE_NAME}:qa
    - docker push ${IMAGE_NAME}:qa
    - "./bin/ qa $DEPLOY_WEBHOOK_STAGING"

Notice how I’m only running this on the qa branch and that I’m tagging the image as “qa”. I’m using docker tags so that I can have one image that has different development stages – dev, staging, and production.
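
As a rough illustration of the one-image, many-tags idea, here is a small Python sketch that maps a git branch to the image reference that would be pushed. Only the qa branch and qa tag come from this post; the other branch names, the `BRANCH_TAGS` table, and `image_ref` are hypothetical.

```python
# Hypothetical mapping; only "qa" -> "qa" is taken from the post.
BRANCH_TAGS = {
    "develop": "dev",
    "qa": "qa",
    "master": "production",
}


def image_ref(image_name, branch):
    """Return "image:tag" for a deployable branch, or None to skip the push."""
    tag = BRANCH_TAGS.get(branch)
    if tag is None:
        return None
    return "{}:{}".format(image_name, tag)


print(image_ref("example/app", "qa"))         # example/app:qa
print(image_ref("example/app", "feature/x"))  # None
```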

Openshift with docker build strategy

Openshift lets you build using a source to image strategy or docker. Source to image would mean rebuilding a docker image – which we already did in CI. So let’s not use that. The docker strategy was a bit confusing to me however. I ended up having a very minimal build stage using Openshift’s docker build strategy. Here is a snippet from my build yaml.

source:
  type: Dockerfile
  dockerfile: "FROM thelab/tsi-cocoon:dev\nRUN ./ collectstatic --noinput"
strategy:
  type: Docker
  dockerStrategy:
    from:
      kind: DockerImage
      name: ''
    pullSecret:
      name: dockerhub
    forcePull: true

Notice the source type Dockerfile with a VERY minimal inline dockerfile that just gets the right image and collects static (Django specific – this could really just be FROM image-name)

The strategy is set to type: Docker and includes my docker image and the “secret” needed to pull the image from my private repo. Note that you must specify the full docker registry or else it will not work with a private registry. You need to add the secret using oc secrets new dockerhub .dockercfg=dockercfg where dockercfg is the file that might be under ~/.dockercfg.

forcePull is set to true so that openshift does a docker pull each time.

You’ll need to define deployments, services, etc. in openshift – but I won’t cover that in this post. I switched a source to image build to a docker based one without having to touch anything else.

That’s it – the same docker image you used with compose locally should be on openshift. I set up a workflow where git commits on specific branches automatically deploy on openshift staging environments. Then I manually trigger the production deploy using the same image as staging.

Extending Openshift’s source to image

Openshift 3 introduces a new concept “source to image” or s2i. It’s a way to create a Docker image out of some source code and a Docker image – for example a python s2i image. It makes Openshift 3’s docker based workflow feel more like Openshift 2 or Heroku.


One problem I ran into using s2i-python was a lack of binary packages installed that are needed to build things like Pillow. For this we need to extend the base image to add what we want. You can see the end result on github.

Let’s review the Dockerfile I built. We start by extending centos’s s2i image. Note I’m using Python 3.4. You can review the upstream project here to see what I’m extending.

FROM centos/python-34-centos7

Next I’m mimicking how centos installs packages from the upstream file. Upstream and my version

You can see I installed postgresql client tools and node too which I need. I also install a couple pip packages I know I’ll need on all my python projects – this is done just to speed up the build time. Do as much or as little as you want.

Next I’ll post it on Docker hub with automated builds. See here. I added centos/python-34-centos7 as a linked repository. This way my docker image builds any time the upstream image builds too – ensuring I get security updates.

Screenshot from 2016-03-02 15-40-55

Finally I just use the image as I would the original s2i python in openshift. I can add it as I would any image with oc import-image s2i-python --from="thelab/s2i-python:latest" --confirm

Next post I’ll describe how to reuse a customized s2i python image on local development with Docker compose.

Finding near locations with GeoDjango and Postgis Part II

This is a continuation of part I.

Converting phrases into coordinates

Let’s test out Nominatim – the Open Street Maps tool to search their map data. We basically want whatever the user types in to turn into map coordinates.

from geopy.geocoders import Nominatim
from django.contrib.gis.geos import Point

def words_to_point(q):
    geolocator = Nominatim()
    location = geolocator.geocode(q)
    point = Point(location.longitude, location.latitude)
    return point

We can take vague statements like New York and get the coordinates of New York City. So easy!
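
One thing to watch: geopy's geocode call returns None when it finds nothing, so the function above would fail on location.longitude for a nonsense query. Below is a minimal sketch of a safer wrapper. It takes the geocoder as a plain callable so it runs without geopy installed; `words_to_lonlat` and `FakeLocation` are hypothetical names. As far as I know, newer geopy releases also want a user_agent argument passed to Nominatim.

```python
from collections import namedtuple

# Stand-in for geopy's Location object, used only for this demo.
FakeLocation = namedtuple("FakeLocation", "longitude latitude")


def words_to_lonlat(q, geocode):
    """Return (longitude, latitude) for a search phrase, or None.

    `geocode` is any callable that takes a string and returns an object
    with `.longitude` and `.latitude`, or None when nothing matches.
    """
    q = (q or "").strip()
    if not q:
        return None
    location = geocode(q)
    if location is None:
        return None
    # x is longitude and y is latitude, the order Point() expects.
    return (location.longitude, location.latitude)


def fake_geocode(q):
    """Pretend geocoder that only knows one place."""
    return FakeLocation(-74.0, 40.7) if q == "New York" else None


print(words_to_lonlat("New York", fake_geocode))
print(words_to_lonlat("asdfghjkl", fake_geocode))
```

With real geopy you would pass geolocator.geocode as the callable and wrap the returned tuple in Point.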

Putting things together

I’ll use Django Rest Framework to make an api where we can query this data and return json. I’m leaving out some scaffolding code – so make sure you are familiar with Django Rest Framework first – or just make your own view without it.

class LocationViewSet(mixins.ListModelMixin, viewsets.GenericViewSet):
    """ Searchable list of locations

    GET params:

    - q - Location search parameter (New York or 10001)
    - distance - Distance away in miles. Defaults to 50
    """
    serializer_class = LocationSerializer
    queryset = Location.objects.all()

    def get_queryset(self, *args, **kwargs):
        queryset = self.queryset
        q = self.request.GET.get('q')
        distance = self.request.GET.get('distance', 50)
        if q:
            point = words_to_point(q)
            distance = D(mi=distance)
            queryset = queryset.filter(point__distance_lte=(point, distance))
        return queryset

My central park location shows when I search for New York, but not London. Cool.
Screenshot from 2015-10-02 14-55-45
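
The view passes the raw distance GET value straight into D(mi=...), so a non-numeric value would likely raise an error. A small sketch of validating it first follows; `parse_distance` and the 500 mile cap are hypothetical choices, not part of the original view.

```python
def parse_distance(raw, default=50.0, maximum=500.0):
    """Turn the raw `distance` GET value into a bounded number of miles."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return default  # missing or non-numeric input
    if value != value or value <= 0:
        return default  # NaN or non-positive input
    return min(value, maximum)


print(parse_distance("25"))     # 25.0
print(parse_distance("abc"))    # 50.0
print(parse_distance("99999"))  # 500.0
```

In get_queryset this would replace the plain self.request.GET.get('distance', 50) lookup.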

At this point we have a usable backend. Next steps would be to pick a client side solution to present this data. I might post more on that later.

Finding near locations with GeoDjango and Postgis Part I

With GeoDjango we can find places in proximity to other places – this is very useful for things like a store locator. Let’s use a store locator as an example. Our store locator needs to be able to read in messy user input (zip, address, city, some combination). Then, locate any stores we have nearby.

General concept and theory

Screenshot from 2015-10-01 17-16-31

We have two problems to solve. One is to turn messy address input into a point on the globe. Then we need a way to query this point against other known points and determine which locations are close.

Set up known locations

Before we can really begin we need to set up GeoDjango. You can read the docs or use docker-compose.

It’s still a good idea to read the tutorial even if you use docker.

Let’s add a location. Something like:

from django.contrib.gis.db import models

class Location(models.Model):
    name = models.CharField(max_length=70)
    point = models.PointField()

    objects = models.GeoManager()

A PointField stores a point on the map. Because Earth is not flat we can’t use simple X, Y coordinates. Luckily you can almost think of Latitude and Longitude as X, Y. GeoDjango defaults to this. It’s also easy to get Latitude and Longitude from places like Google Maps. So if we want – we can ignore the complexities of mapping coordinates on Earth. Or you can read up on SRID if you want to learn more.
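
The one trap with treating Latitude and Longitude as X, Y is the order: X is longitude and Y is latitude, which is the reverse of how coordinates are usually spoken. Here is a tiny sketch that builds the text form of a point (WKT) in the right order; `to_wkt_point` is a hypothetical helper, not a GeoDjango function.

```python
def to_wkt_point(latitude, longitude):
    """Build WKT text for a point. WKT lists x (longitude) before y (latitude)."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    return "POINT({} {})".format(longitude, latitude)


print(to_wkt_point(40, -73))  # POINT(-73 40)
```

The range checks catch some swapped values, but not all of them, since a pair like (40, -73) is valid either way round.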

At this point we can start creating locations with points – but for ease of use add GeoModelAdmin to Django Admin to use Open Street Maps to set points.

from django.contrib import admin
from django.contrib.gis.admin import GeoModelAdmin
from .models import Location

class LocationAdmin(GeoModelAdmin):
    pass

admin.site.register(Location, LocationAdmin)

Screenshot from 2015-10-01 17-31-54

Wow! We’re doing GIS!

Add a few locations. If you want to get their coordinates just type location.point.x (or y).

Querying for distance.

Django has some docs for this. Basically make a new point. Then query distance. Like this:

from django.contrib.gis.geos import fromstr
from django.contrib.gis.measure import D
from .models import Location

geom = fromstr('POINT(-73 40)')
Location.objects.filter(point__distance_lte=(geom, D(m=10000)))

m is meters – you can pass all sorts of things though. The result should be a queryset of Locations that are near our “geom” location.
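
To get a feel for what the database is being asked, here is a pure Python sketch of the same idea using the haversine formula on a sphere. It is only an approximation for illustration, and PostGIS does its own calculation; the function names and sample coordinates are made up for the demo.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby(locations, lon, lat, max_m):
    """Return the names of (name, lon, lat) tuples within max_m meters."""
    return [name for name, plon, plat in locations
            if haversine_m(lon, lat, plon, plat) <= max_m]


locations = [
    ("Central Park", -73.9654, 40.7829),
    ("London", -0.1276, 51.5072),
]
print(nearby(locations, -73.98, 40.75, 10000))
```

Unlike this loop, the real query lets the database do the filtering, so it can use a spatial index instead of checking every row.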

Already we can find locations near other locations or arbitrary points! In Part II I’ll explain how to use Open Street Maps to turn a fuzzy query like “New York” into a point. And from there we can make a store locator!

Review of Tutanota

Tutanota is an open source email provider. It features easy to use end to end encryption. It’s notable as a modern, libre, and cheap hosted email provider.

Why not gmail

Gmail is easily the best email provider. It’s light years ahead of any open source system. It’s gratis for individual, education, nonprofit, and small business use. That makes it hard to compete with. However it’s also a giant proprietary service directing a huge portion of your communication to the world. As a free software advocate it’s hard to feel good about using Google services – especially ones as critical as communication. Google also took concerning steps to remove interoperability in its chat service by removing XMPP from Google Hangouts.

In looking at alternatives I want something hosted (I do not want to deal with tech problems at home for something as important as email). I’d like something cheap – because even cheap is more expensive than Gmail. I also want something modern looking and easy to use.

The Good

Tutanota’s big feature is security. It’s not hosted in the United States. You can encrypt emails by clicking a button. It doesn’t sell your information to ad networks. It doesn’t leak your personal information when such ad networks or government agencies get hacked.

Tutanota has a modern web interface with native mobile apps (web wrapper, but acceptable). It’s minimalist but that isn’t necessarily bad.
Screenshot from 2015-09-27 15-29-41

Tutanota is easy to use. You can use their hosted version if you trust them and don’t want to host yourself. Setting up a custom domain was pretty easy. Aliases are no problem. It supports multiple users too (but I haven’t tested this out yet).

It’s cheap at 1€ per month per user for the premium account. It also has a free tier if you can go without a custom domain.

The Bad

Tutanota doesn’t support IMAP. That’s pretty annoying. I’d be highly annoyed if there was any decent open source email client – but as there is not I can forgive it. IMAP wouldn’t work with their security model. I wish they had it as an option with a big warning about security.

It lacks a lot of features you expect in email. As I said Gmail is way ahead here. No magic category filtering. No filtering rules. No contact book integration. Not even keyboard shortcuts.

Tutanota only recently open sourced its code. It needs to do more to promote a developer community. A public issue tracker would be nice. So would a contribution guide. Right now they only offer a user voice page that is more consumer focused. I left my feedback.


Tutanota is a great option if you want an open source email provider – but only if your feature requirements are minimal. It has great potential as they add more features. The world needs an easy way to send private messages and right now such privacy is a luxury for those very few who understand PGP encryption and can set up services themselves. I applaud anyone trying to make this easier.

Building an api for django activity stream with Generic Foreign Keys

I wanted to build a django-rest-framework api for interacting with django-activity-stream. Activity stream uses Generic Foreign Keys heavily which aren’t naturally supported. We can however reuse existing serializers and nest the data conditionally.

Here is a ModelSerializer for activity stream’s Action model.

from rest_framework import serializers
from actstream.models import Action
from myapp.models import ThingA, ThingB
from myapp.serializers import ThingASerializer, ThingBSerializer

class GenericRelatedField(serializers.Field):
    def to_representation(self, value):
        if isinstance(value, ThingA):
            return ThingASerializer(value).data
        if isinstance(value, ThingB):
            return ThingBSerializer(value).data
        # Not found - return string.
        return str(value)

class ActionSerializer(serializers.ModelSerializer):
    actor = GenericRelatedField(read_only=True)
    target = GenericRelatedField(read_only=True)
    action_object = GenericRelatedField(read_only=True)

    class Meta:
        model = Action

GenericRelatedField will check if the value is an instance of a known Model and assign it the appropriate serializer.
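
If the list of known models grows, the chain of isinstance checks can be swapped for a small registry. This is a framework-free sketch of that dispatch idea with stand-in classes; `GenericRepresenter` is a hypothetical name, and in a real project the registered functions would be something like lambda obj: ThingASerializer(obj).data.

```python
class GenericRepresenter:
    """Map classes to functions that turn an instance into plain data."""

    def __init__(self):
        self._registry = []

    def register(self, cls, to_data):
        self._registry.append((cls, to_data))

    def represent(self, value):
        # First registered class that matches wins.
        for cls, to_data in self._registry:
            if isinstance(value, cls):
                return to_data(value)
        return str(value)  # same fallback as the field above


class DemoThingA:
    """Stand-in for a model class."""
    name = "a thing"


representer = GenericRepresenter()
representer.register(DemoThingA, lambda obj: {"name": obj.name})

print(representer.represent(DemoThingA()))
print(representer.represent(42))
```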

Next we can use a viewset for displaying Actions. Since activity stream uses querysets it’s pretty simple to integrate with a ModelViewSet. In my case I’m checking for a get parameter to determine whether we want all actions, actions of people the logged in user follows, or actions of the user. I added some filters on actor and target content type too.

from rest_framework import viewsets
from actstream.models import user_stream, Action
from .serializers import ActionSerializer

class ActivityViewSet(viewsets.ReadOnlyModelViewSet):
    serializer_class = ActionSerializer

    def get_queryset(self):
        following = self.request.GET.get('following')
        if following and following != 'false' and following != '0':
            if following == 'myself':
                qs = user_stream(self.request.user, with_user_activity=True)
                return qs.filter(
            else:  # Everyone else but me
                return user_stream(self.request.user)
        return Action.objects.all()

    filter_fields = (
        'actor_content_type', 'actor_content_type__model',
        'target_content_type', 'target_content_type__model',
    )
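
The branching on the following parameter is easier to test when pulled out into a pure function. This is a sketch of that idea; `stream_mode` and its return labels are hypothetical, and it only mirrors the conditions in get_queryset above.

```python
def stream_mode(following):
    """Classify the raw `following` GET value the way get_queryset branches."""
    if following and following not in ("false", "0"):
        if following == "myself":
            return "mine"      # the user's own stream
        return "followed"      # actions of people the user follows
    return "all"               # no usable value: every action


for raw in (None, "false", "myself", "true"):
    print(repr(raw), "->", stream_mode(raw))
```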

Here’s the end result, lots of nested data.
Screenshot from 2015-07-08 17:44:59

Open source chat

Web based chatting services like Hipchat and Slack are catching on these days. Such services are proprietary which severely limits your freedom to extend them and gives control over communication to a for profit company and probably the NSA. What free alternatives are out there? We have good old IRC and XMPP servers. But these are typically a no go for non technical folks and time consuming to set up. Let’s review web based offerings starting with my favorite.

Let’s Chat

Screenshot from 2015-06-03 23:13:31
Let’s Chat is based on node and mongodb. It features a nice web interface and some very basic xmpp support. It is not federated sadly so you can’t talk with xmpp users outside your server. It has no hosted offering. Installing it was the easiest of the options I looked at, yet it was very buggy and annoying. This is why Slack is winning, folks. Anyway here’s how I ran it.

  • Install from git. Do not use docker due to this bug.
  • Enable xmpp in the settings. You must set the host setting due to this bug. Here are my settings. I also had to enable TLS to get xmpp to work.
  • I use the hubot connector for fun stuff like animated gifs and scripting – which makes it more comparable to slack which offers a lot of integrations.
  • For Android I use Conversations which is an amazing xmpp client.

While I really hated installing Let’s Chat due to all the bugs – I’m happy with it now that it’s running. While I speak a bit badly about all the bugs I really appreciate the work put into it.


Kaiwa

Kaiwa is a web front end for an XMPP server. The huge benefit of this is you get a full XMPP server like Prosody which supports many of the more modern XMPP conventions (XEPs). Federation is REALLY amazing and awesome except you will never use it because no one uses XMPP. In theory it would let you talk with people on other servers.

Kaiwa has some heavy requirements of a web server, ldap server, database, and xmpp server. ldap is a deal breaker for me. I would prefer storing user data by marking rusty nails that I must arrange myself in a garden of broken glass – because that would be far more pleasant than writing ldif files. The default kaiwa would lose all data when restarting docker even after mounting (looks to be an issue with the ldap container). I just gave up. Kaiwa also has no way to search history which is pretty important to me.


RocketChat

RocketChat is worth keeping an eye on, but is not finished or near feature complete as of this writing.

On just xmpp

XMPP is a standard for interoperable chat. That’s a great goal but also a big challenge. There is no good xmpp desktop client. None support images (Animated gifs I think are a big draw on why people like slack). The default chat client on Ubuntu, Empathy, hasn’t seen development in years.

Google used to champion XMPP but hangouts dropped most support except very basic one to one non federated chat. You suck Google.

Thanks to websockets it’s really easy to make chat apps. You can learn in a weekend. I made a simple app here with Django and Swampdragon. Learning XMPP on the other hand is hard. There are few resources and existing resources are out of date. XMPP tackles much more like federation so it’s unfair to compare it directly with websockets. Still, people are going to use what’s easy and that’s hosted proprietary services built on websockets. Hopefully open source can catch up. It would be great to see some crowd funded efforts in this space – I’d contribute.

The only time I will spam you

I’m applying to this $150,000 grant. We need 250 votes to be considered. Voting for us is an easy way to support projects like django-report-builder and django-simple-import.

We spin off these third party apps whenever possible. Burke Software’s main focus is an open source school information system. If anything on this blog has helped you please consider giving us a vote. Thank you!

Be warned the website requires a Facebook login. I apologize for that, I don’t even have a Facebook account myself and had to use my good old Testy McTest account to vote.

Docker in dev and in production – a complete and DIY guide

Docker is an amazing Linux containerization tool. At Burke Software, we moved our development environment to Fig months ago and are now using Docker in production as well. This guide should give you ideas. I’m going to cover a lot of technologies not related to Docker to give you an idea how we do things. In my examples I’m using our website, which is on GitHub for learning purposes. You should be able to follow along and run the website in Docker! Talk about self promotion – did I mention we are available for hire?

Docker in development

In development we use Fig, a tool that makes Docker a bit easier to use. It’s great whether you’re a Linux admin, software engineer, or graphic designer with minimal command line experience. The Fig documentation is pretty good so I won’t go into running it. With Fig everything gets standardized and mimics production. It would be a lot to ask for a designer to run solr, redis, postgres, and celery. Fig lets you do this. If your production environment runs a worker queue like celery, so should development. The more differences between development and production, the more opportunities for bugs.

Current state of Docker in production

Docker itself is stable and past the 1.0 release. However the tools around it are not. For nontrivial deployments you need a little more than basic Docker commands or ‘fig up’. Flynn and Deis look REALLY cool but are not stable yet. There is mesos and shipyard and lots more. Running Docker by hand can be a bit daunting. This guide will focus on the by hand approach – with some tools to support it.


Let’s start with a basic server. I’m using DigitalOcean. If you like this post and start a DigitalOcean account please consider using this affiliate link. The cool thing about Docker is you aren’t tied to any one service as long as that service runs Linux. AWS? Microsoft? Your decade-old desktop lying around? Anything you want.

I use Ansible to provision my server. The idea here is it’s somewhat self-documenting and I can throw out my DigitalOcean account and start it up on EC2 on a whim. Here is my Ansible YML file. This is my file and is not intended to be copied as is. It’s there so that you can see how to use Ansible and get some ideas. I will refer to it often. Basically, any task I would normally do by hand I do via Ansible so it’s reproducible. I’m using a private Git repo so I am actually adding secrets here.

Docker in Production

Docker itself is installed via Ansible. I’ll follow the order that an incoming request to the server would take.

  1. An incoming request hits nginx installed on the host. Nginx proxies the request to a port on localhost that a Docker container is listening on. The following Nginx configuration routes a request for the site to port 8002. Port 8002 was arbitrarily assigned by me.
        server {
            listen 80;
            access_log  /var/log/nginx/access.log;
            location / {
                # proxy_pass target assumed from the port described above
                proxy_pass http://127.0.0.1:8002;
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            }
        }
  2. Supervisor – We need docker to be running and answering on port 8002. We need an init system to run Docker and to respawn, restart, etc. Here is my supervisor conf file. WTF Fig in production???
    command = fig -f /opt/fig/ up
    stdout_logfile = /var/log/webapps/
    redirect_stderr = true
  3. Fig in production – This Docker blog post provides a basic overview of using Fig in production. While I prefer the Fig YML syntax over writing plain Docker commands, I still recommend taking some time to get familiar with Docker. You should know how to build, create, and remove Docker containers before going forward, because Fig won’t help you if things blow up. Once you have an understanding of Docker, though, you’ll find that Fig does make it very easy to connect and run various Docker containers. Here is my Fig production file:
    redis:
      image: dockerfile/redis
    web:
      build: /opt/
      command: gunicorn bsc_website.wsgi --log-file - -b -n
      volumes:
        - /opt/
      ports:
        - "8002:8000"
      environment:
        - USE_S3=Yup
      mem_limit: 1000m
      links:
        - redis

    I’m saving the environment variables in the Fig file, which is in a private Git repo. I like tracking them in Git over something like Heroku where you just enter them without version control. Notice that port number again. I’m redirecting port 8002 to the container’s port 8000 – which is just the port I always use in Fig. It could be anything but I want to change as little as possible from dev. The mem_limit will prevent the container from eating all system RAM. Look how easy it is to get redis up! I can just as easily run celery workers and other junk just like Fig in development.

  4. Persistent data – At this point our request has hit the Docker Gunicorn server which will respond. Cool. However, what happens when the container is restarted or even destroyed? Fig can deal with this itself and make databases persist; however, I don’t trust it in production. I’d like to destroy the container fully and create a new one without losing my important data. You could use Docker volumes to mount the persistent data. I’m just going to run Postgres on my host. You could also use Amazon’s RDS or an isolated database server. I feed in the Postgres credentials via environment variables as seen in the fig file. I’m storing my user uploaded files in S3. In my Ansible YML file you can see I’m backing up my entire Postgres database to S3 using the python package S3-backups. The important thing here is that I can stop and rm all my docker containers, rebuild them, and it’s not a big deal.

  5. Updating production – I’m using Git hooks to update the server. I have staging servers here too. It’s nice to give your developers easy access to push to staging and production with just Git. Notice I started a bare Git repo in the Ansible YML file. I’ll use a post-receive hook to checkout the master branch, Fig build, collectstatic (Django specific), and migrate my database (also Django specific). Finally it will restart Docker using supervisor. The set -x will ensure whoever does the Git push will see everything in their terminal window. It’s a lot like Heroku or, more accurately, Heroku is a lot like a Git hook because it is a Git hook. Unlike Heroku I can install packages and run anything I want. 🙂

    set -x
    git --work-tree=/opt/$NAME/ checkout -f master
    fig -f /opt/fig/$NAME/fig.yml build
    fig -f /opt/fig/$NAME/fig.yml run --rm web ./ collectstatic --noinput
    fig -f /opt/fig/$NAME/fig.yml run --rm web ./ migrate
    supervisorctl restart $NAME
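
One caveat with the hook as shown: set -x echoes each command, but without something like set -e a failed build would not stop the later steps from running. Here is a minimal Python sketch of the same "echo each step, stop at the first failure" behaviour. `run_steps` is a hypothetical helper, and the demo commands are harmless placeholders rather than the real deploy steps.

```python
import shlex
import subprocess
import sys


def run_steps(steps):
    """Run each command in order, echoing it first; stop at the first failure.

    Returns the exit code of the failing step, or 0 if every step succeeded.
    """
    for argv in steps:
        # Echo the command before running it, similar to `set -x`.
        print("+ " + " ".join(shlex.quote(arg) for arg in argv), flush=True)
        code = subprocess.call(argv)
        if code != 0:
            return code
    return 0


# Placeholder steps: the second one fails, so the third never runs.
ok_step = [sys.executable, "-c", "pass"]
bad_step = [sys.executable, "-c", "import sys; sys.exit(3)"]
print(run_steps([ok_step, bad_step, ok_step]))  # 3
```

Stopping early means a broken build never reaches the supervisor restart, so the old containers keep serving.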

Hopefully all that gives you some idea of how we run Docker in production. Feel free to comment with questions!