Running Personal Projects on Kubernetes

Recently, I migrated the services of all my personal projects to Kubernetes on Google Cloud Platform.

Background

Over the past few years, I’ve been running an increasing number of services for my personal projects in the cloud. They are written in different languages and have moved from shared hosting to a VPS on Linode and then to EC2 on AWS, and it has become more and more difficult to maintain them on the same server.

I containerized all the services in January 2017 and used docker-compose to manage them on EC2 instances. Now they all run on 2 GKE clusters.

GKE

Among the available container orchestration systems and cloud providers, I chose Kubernetes and GKE mainly because of:

Better pricing (instances, load balancers, etc., and a free master)
Monitoring and logging out-of-the-box (although Stackdriver UI is really bad)
Less vendor lock-in (compared to AWS ECS)
Active community and Google support on Kubernetes project
Platform support

Cloud SQL

I use Cloud SQL for services that need MySQL. It can be reached directly or through the Cloud SQL Proxy, which can run as a sidecar in the same Pod as the application.
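
The sidecar layout can be sketched as a Pod spec with two containers. This is a minimal illustration under assumptions, not a complete manifest: the helper name `pod_spec_with_sql_proxy`, the `DB_HOST`/`DB_PORT` variables, the proxy image tag and the instance connection name are all placeholders, and credential handling is left out. The spec is built as a Python dict and printed as JSON, which kubectl accepts as well as YAML.

```python
import json


def pod_spec_with_sql_proxy(app_image, instance_connection_name, port=3306):
    """Return a Pod spec dict: the app container plus a Cloud SQL Proxy sidecar.

    All names and the proxy image tag are illustrative placeholders.
    """
    return {
        "containers": [
            {
                "name": "app",
                "image": app_image,
                # Containers in a Pod share a network namespace, so the app
                # reaches the proxy on localhost.
                "env": [
                    {"name": "DB_HOST", "value": "127.0.0.1"},
                    {"name": "DB_PORT", "value": str(port)},
                ],
            },
            {
                "name": "cloudsql-proxy",
                "image": "gcr.io/cloudsql-docker/gce-proxy:1.11",
                "command": [
                    "/cloud_sql_proxy",
                    f"-instances={instance_connection_name}=tcp:{port}",
                ],
            },
        ]
    }


if __name__ == "__main__":
    spec = pod_spec_with_sql_proxy(
        "gcr.io/my-project/my-app:latest",
        "my-project:us-central1:my-instance",
    )
    print(json.dumps(spec, indent=2))
```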

Pricing

Pricing is crucial in my case, since my services have a few thousand users and usually don’t receive a lot of traffic. However, I still want an easy way to maintain and deploy resilient services.

I found that the smallest practically usable instance type is g1-small, which costs around $13-$15 per month when it runs continuously for the whole month (with Sustained Use Discounts applied). You can run any number of nodes within your GCE quota.

f1-micro is technically possible (at least 3 f1-micro nodes are required to create a cluster), but 0.6GB of memory is simply not enough in many cases. I don’t care much about CPU and can have many services sharing it, but they can’t run if there’s not enough RAM.
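
The memory constraint is easy to check on paper. The sketch below is a simplification: the 0.6GB figure comes from the text, the 1.7GB figure for g1-small is GCE's published spec, and `reserved_gb` is a hypothetical allowance for the OS and Kubernetes system components, whose real overhead varies by cluster.

```python
# Memory per instance type in GB.
NODE_MEMORY_GB = {"f1-micro": 0.6, "g1-small": 1.7}


def fits_in_memory(machine_type, service_memory_gb, reserved_gb=0.0):
    """Return True if the services' combined memory needs fit on one node.

    service_memory_gb is a list of per-service memory needs in GB.
    reserved_gb is a hypothetical allowance for system overhead.
    """
    available = NODE_MEMORY_GB[machine_type] - reserved_gb
    return sum(service_memory_gb) <= available
```

For example, three services needing 0.25GB each would not fit on one f1-micro but would fit on one g1-small, as long as the system overhead stays modest.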

Load balancers (GCLB) cost ~$18 per month for the first 5 forwarding rules.

GCLB uses a single Anycast IP and terminates connections in different regions.
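
Putting the figures above together gives a rough monthly estimate. This is back-of-the-envelope arithmetic only: the defaults use the upper g1-small figure and the GCLB figure quoted above, each load balancer is assumed to stay within its first 5 forwarding rules, and traffic, disks and Cloud SQL are not included. The function name is made up for the example.

```python
def monthly_cost_usd(nodes, node_price=15.0, load_balancers=1, lb_price=18.0):
    """Rough monthly estimate in USD: nodes plus load balancers only."""
    return nodes * node_price + load_balancers * lb_price


if __name__ == "__main__":
    # Three g1-small nodes behind a single GCLB.
    print(monthly_cost_usd(3))
```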

Gateway

To save forwarding rules, I created a separate service called gateway to:

Route all incoming requests
Terminate TLS connections from GCLB (GCLB doesn’t validate backend certificates)

It’s based on nginx. An Ingress resource is used to create a GCLB.

                   -------------------------------------
                   | K8s cluster                       |
Clients -> GCLB -> | Gateway Service -> Target Service |
                   -------------------------------------
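
One way such a gateway could be configured is one nginx server block per hostname, each proxying to an in-cluster Service. The sketch below generates that configuration. It assumes host-based routing, which the description above does not confirm, and the function name, certificate paths and Service addresses are placeholders.

```python
def render_gateway_conf(routes, cert="/etc/tls/tls.crt", key="/etc/tls/tls.key"):
    """Render one nginx server block per hostname.

    routes maps a public hostname to the in-cluster address of the target
    Service. The certificate paths are placeholders.
    """
    blocks = []
    for host, upstream in sorted(routes.items()):
        blocks.append(
            "server {\n"
            "    listen 443 ssl;\n"
            f"    server_name {host};\n"
            f"    ssl_certificate {cert};\n"
            f"    ssl_certificate_key {key};\n"
            "    location / {\n"
            f"        proxy_pass http://{upstream};\n"
            "        proxy_set_header Host $host;\n"
            "    }\n"
            "}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(render_gateway_conf({"a.example.com": "service-a:8080"}))
```

Since GCLB doesn’t validate backend certificates, the certificate served here could be self-signed.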

Deployment

Currently I’m not maintaining or using any CI system for K8s deployments, since they are not very frequent. However, I use CircleCI to build and test applications and to publish public images to Docker Hub, and I use Google Cloud Container Builder to build and publish private images to Google Container Registry (GCR).

For services that have multiple deployment environments, I use a template K8s manifest file and a shell script that renders it with different image tags, resource names, global-static-ip-name, etc.

$ ./kube-deploy.sh dev
Deploying to dev environment...
Tag: commit-hash
Deploy? y
Applying changes...
deployment "example-dev" configured
service "example-dev" configured
ingress "example-dev" configured
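
The rendering step can be approximated with simple placeholder substitution. The real script is a shell script whose contents are not shown here, so this Python sketch is only an analogue: the template is abbreviated and hypothetical, and the `$env`, `$image` and `$tag` placeholders are invented for the example.

```python
from string import Template

# Hypothetical, abbreviated template. A real one would hold the complete
# Deployment, Service and Ingress definitions.
MANIFEST_TEMPLATE = Template("""\
kind: Deployment
metadata:
  name: example-$env
spec:
  template:
    spec:
      containers:
        - name: example
          image: $image:$tag
""")


def render_manifest(env, image, tag):
    """Fill in the per-environment values.

    substitute() raises KeyError if the template references a placeholder
    that is not supplied, so a typo in the template fails loudly.
    """
    return MANIFEST_TEMPLATE.substitute(env=env, image=image, tag=tag)


if __name__ == "__main__":
    print(render_manifest("dev", "gcr.io/my-project/example", "commit-hash"))
```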

The script automatically takes the HEAD commit hash, validates that the tag exists in GCR, and checks the current kubectl context before running the kubectl apply command.
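
The order of those checks can be sketched as follows. This is a Python analogue of the flow, not the actual shell script: the function names are invented, the interactive confirmation is omitted, and the gcloud invocation used to look up the tag is an assumption about how the validation could be done. The command runner is injectable so the flow can be exercised without a cluster.

```python
import subprocess


def run(cmd):
    """Run a command and return its trimmed stdout; raises on non-zero exit."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def deploy(manifest_path, image, expected_context, run=run):
    """Run the pre-flight checks, then kubectl apply."""
    tag = run(["git", "rev-parse", "HEAD"])

    # An empty listing means no image with this tag has been pushed.
    found = run(["gcloud", "container", "images", "list-tags", image,
                 f"--filter=tags:{tag}", "--format=get(digest)"])
    if not found:
        raise RuntimeError(f"tag {tag} not found for {image}")

    # Refuse to apply against the wrong cluster.
    context = run(["kubectl", "config", "current-context"])
    if context != expected_context:
        raise RuntimeError(
            f"kubectl context is {context}, expected {expected_context}")

    return run(["kubectl", "apply", "-f", manifest_path])
```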

Persistent Data

Some services require persistent storage, which is provided by gcePersistentDisk volumes. The GCE disk is automatically attached to the node that the Pod is running on.

containers:
  - name: container-name
    image: image-name

    volumeMounts:
      - name: volume-name
        mountPath: /path/to/data

volumes:
  - name: volume-name
    gcePersistentDisk:
      pdName: gce-disk-name
      fsType: ext4

A gcePersistentDisk can be mounted by multiple Pods at the same time in read-only mode, but by only one in read-write mode, so .spec.strategy.type=Recreate is used for services that require read-write persistent storage. Rolling updates are therefore not possible in this case, but that is not yet a problem for me. I will investigate more options in the future.
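
The rule above can be written down as a small decision function. This is an illustrative sketch with an invented function name. It inspects volume definitions shaped like the manifest fragment above and relies on the `readOnly` field of gcePersistentDisk, which defaults to false.

```python
def deployment_strategy(volumes):
    """Pick a Deployment strategy from the Pod's volumes.

    A gcePersistentDisk that is not marked readOnly can be mounted by only
    one Pod at a time, so the old Pod must stop before the new one starts.
    """
    for volume in volumes:
        disk = volume.get("gcePersistentDisk")
        if disk is not None and not disk.get("readOnly", False):
            return {"type": "Recreate"}
    return {"type": "RollingUpdate"}
```

With the volume from the fragment above, which does not set readOnly, this returns the Recreate strategy.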

CDN

I used CloudFront for years and recently started seeing performance issues for users in China. I switched to Google Cloud CDN, which works with GCLB; caching is controlled by the Cache-Control header in the upstream response.
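
In practice this means the application, or the gateway in front of it, decides what the CDN may cache by setting that header. A minimal sketch, with an invented helper name and example values:

```python
def cache_control(max_age_seconds, cacheable=True):
    """Build a Cache-Control value for the upstream response.

    "public" with a max-age marks the response as cacheable by shared
    caches such as a CDN; "private, no-store" keeps it out of them.
    """
    if not cacheable:
        return "private, no-store"
    return f"public, max-age={max_age_seconds}"


if __name__ == "__main__":
    print(cache_control(3600))                # static assets
    print(cache_control(0, cacheable=False))  # per-user pages
```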

Monitoring

GKE provides some monitoring and logging for your clusters: GCLB request logs, container logs and GCE instance metrics are all available out of the box. Log metrics are also supported.

Stackdriver Basic Tier currently satisfies my monitoring requirements. It has shorter log retention and limited alerting channels (only email and the GCP mobile app), but emails can be forwarded to Slack for notifications. (This is going to change from July 2018.)