Server side tracking with piwik and Django

Business owners want to track usage to gain insight into how users actually use their sites and apps. However, tracking can raise privacy concerns, hurt site performance, and introduce security risks by inviting third-party JavaScript to run.

For Passit, an open source password manager, we wanted to track how people use our app and view our passit.io marketing site. However, we serve a privacy-sensitive market. Letting a company like Google snoop on your password manager feels very wrong. Our solution is to use the open source, self-hosted piwik analytics application with server side tracking.

Traditional client side tracking for our marketing site

passit.io uses the piwik JavaScript tracker. It runs on our own domain (piwik.passit.io) and doesn’t get flagged by Privacy Badger as a tracking tool. It won’t track your entire web history like Google Analytics or Facebook Like buttons do.

Nice green 0 from Privacy Badger!

To respect privacy, we can keep the default piwik settings that anonymize IP addresses and respect the Do Not Track header.
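As a rough illustration of what those two settings amount to, here is a minimal sketch. It is not piwik’s own code: the function names are hypothetical, and masking two bytes is an assumption (piwik lets you configure how many bytes are masked).

```python
import ipaddress


def anonymize_ipv4(ip, masked_bytes=2):
    """Zero the trailing bytes of an IPv4 address.

    masked_bytes=2 is an assumed value for illustration only.
    """
    packed = ipaddress.IPv4Address(ip).packed
    keep = 4 - masked_bytes
    return str(ipaddress.IPv4Address(packed[:keep] + b"\x00" * masked_bytes))


def tracking_allowed(headers):
    """Return False when the browser sent the Do Not Track header (DNT: 1)."""
    return headers.get("DNT") != "1"
```

With these helpers, an address such as 192.168.100.25 would be stored as 192.168.0.0, and a visitor sending `DNT: 1` would not be recorded at all.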

Server side tracking for app.passit.io

We’d like to have some idea of how people use our app as well: sign-ups, logins, group usage, etc. However, injecting client side code feels wrong here. Tracking your movements and reporting them to our piwik server would waste your computer’s resources, and it would provide an attack vector. What if someone hijacked our piwik server and tried to inject arbitrary JavaScript into the passit app?

We can track usage of the app.passit.io API on the server side instead. Simply tracking how many people use the different API endpoints gives a good indication of user activity.

Django and piwik

Presenting django-server-side-piwik, a drop-in Django app that uses middleware and Celery to record server side analytics. Let’s talk about how it’s built.

server_side_piwik uses the Python piwikapi package to track server side usage. The package’s quickstart section shows how. We can implement it as Django middleware. For every request, some data is serialized and sent to a Celery task for further processing. This means our main request thread isn’t blocked and we don’t slow down the app just to run analytics.

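from django.conf import settings
from ipware.ip import get_ip  # assumed source of get_ip (django-ipware)

from .tasks import record_analytic  # Celery task; module path is assumed


class PiwikMiddleware(object):
    """ Record every request to piwik """
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)

        SITE_ID = getattr(settings, 'PIWIK_SITE_ID', None)
        if SITE_ID:
            ip = get_ip(request)
            # Only plain header values are copied, so the data can be
            # serialized and handed to Celery
            keys_to_serialize = [
                'HTTP_USER_AGENT',
                'REMOTE_ADDR',
                'HTTP_REFERER',
                'HTTP_ACCEPT_LANGUAGE',
                'SERVER_NAME',
                'PATH_INFO',
                'QUERY_STRING',
            ]
            data = {
                'HTTPS': request.is_secure()
            }
            for key in keys_to_serialize:
                if key in request.META:
                    data[key] = request.META[key]
            # Queue the hit so the request thread is not blocked
            record_analytic.delay(data, ip)
        return response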
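
The middleware hands `data` and `ip` to a Celery task, and piwikapi takes care of talking to piwik from there. To give a feel for what ends up on the wire, here is a minimal sketch that turns the serialized headers into a request URL for piwik’s Tracking HTTP API. This is not the code of piwikapi or django-server-side-piwik: `build_tracking_url` and the example host are hypothetical, and only the query parameter names (`idsite`, `rec`, `apiv`, `url`, `urlref`, `ua`, `lang`, `cip`, `token_auth`) come from the tracking API.

```python
from urllib.parse import urlencode


def build_tracking_url(piwik_url, site_id, data, ip=None, token_auth=None):
    """Build a Tracking HTTP API URL from the serialized request headers."""
    # Reconstruct the URL the visitor actually requested
    scheme = "https" if data.get("HTTPS") else "http"
    page_url = "{}://{}{}".format(
        scheme, data.get("SERVER_NAME", ""), data.get("PATH_INFO", "")
    )
    if data.get("QUERY_STRING"):
        page_url += "?" + data["QUERY_STRING"]

    params = {
        "idsite": site_id,
        "rec": 1,   # required flag that tells piwik to record the hit
        "apiv": 1,  # tracking API version
        "url": page_url,
    }
    # Optional values are only sent when the header was present
    optional = {
        "urlref": data.get("HTTP_REFERER"),
        "ua": data.get("HTTP_USER_AGENT"),
        "lang": data.get("HTTP_ACCEPT_LANGUAGE"),
    }
    params.update({key: value for key, value in optional.items() if value})

    # Overriding the visitor IP (cip) is only accepted with an auth token
    if ip and token_auth:
        params["cip"] = ip
        params["token_auth"] = token_auth

    return piwik_url + "?" + urlencode(params)
```

Calling it with something like `build_tracking_url("https://piwik.example.com/piwik.php", 1, data, ip, token)` yields a URL that a background worker could fetch with any HTTP client. Without the token, piwik would attribute the hit to the server’s own address, which is why the IP is passed along separately.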

 

Now you can track usage from the backend, which better respects user privacy. No JavaScript and no Google Analytics involved!

Feel free to check out the project on GitLab and let me know about any comments or issues. Passit’s source is also on GitLab.
