Heili CI/CD - part 1

How do we build things in Heili?

We are moving towards a fully automated process of build, test, deploy and verify instead of manual builds and deploys. It allows us to focus on development and not on the CI/CD process. I’ve been working for the last year helping developers in different companies deliver their products safely to production, and I’ve seen dozens of variations of how people approach it, from manual FTP uploads to fully automated deployments with different automation tools. Lots of times there is a Jenkins (or Hudson) server involved, with a bunch of bash/perl/python/ruby scripts. Sometimes it’s a cron job that deploys things locally, and sometimes it’s just manually copy/pasted rsync commands. Some manual labor is still involved, but most of the developers I’m working with have moved to more automated setups: mostly Jenkins pipelines that involve Ansible/Spinnaker or some fancy kubectl/helm deployments.

But running your own CI/CD server (Jenkins or similar) to manage the process can be a waste of resources, especially if you have a small team and do not run a 24x7 development lifecycle (yes, you can stop and start it, but do you have the attention for it?). There are also as-a-service solutions that can solve this problem, like GitLab CI, Codefresh and many others. Most of them use Docker as a backend and have native support for Kubernetes deployment, which was exactly what we needed in Heili, but I didn’t like one thing: I need to trust one more third party. The issue is not only security; these vendors invest in it and usually do their best to prevent code leaks. I also have to trust that their service will be up when I need it and will be fast enough, and at some point the free plan is not enough, so I will have another vendor to budget for and pay (modern-day problems).

Luckily for us, by this point we had migrated Heili to Google Kubernetes Engine. GCP made a major change to its build service, Container Builder, and renamed it Cloud Build. Cloud Build does exactly what we needed: it builds our code and Docker images and deploys them to a Kubernetes cluster (and not only there). By using a GCP service account, Heili’s Kubernetes access stays inside GCP, which means fewer security issues, and we are already paying Google.

How does it work

The flow is simple: you have a Git repository with source code that you want to compile (or not, if you use a scripting language like Python), put inside a Docker image and deploy. To do this you just add a special file, “cloudbuild.yaml”, to your repository with the exact steps. Here is a simple file we were using in the early stages:

steps:
# Build and push the docker
- name: 'gcr.io/cloud-builders/docker'
  args:
  - 'build'
  - '-t'
  - 'gcr.io/$PROJECT_ID/grandmaster:$TAG_NAME'
  - '-t'
  - 'gcr.io/$PROJECT_ID/grandmaster:latest'
  - '.'
- name: 'gcr.io/cloud-builders/docker'
  args:
  - 'push'
  - 'gcr.io/$PROJECT_ID/grandmaster:$TAG_NAME'
# Deploy to Heili cluster
- name: 'gcr.io/cloud-builders/kubectl'
  args:
  - set
  - image
  - deployment
  - grandmaster
  - grandmaster=gcr.io/$PROJECT_ID/grandmaster:$TAG_NAME
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=us-central1'
  - 'CLOUDSDK_CONTAINER_CLUSTER=heili-us-central1'
images:
- 'gcr.io/$PROJECT_ID/grandmaster:latest'
- 'gcr.io/$PROJECT_ID/grandmaster:$TAG_NAME'

I’m not going to go over each step and what it does in detail; it is all well covered in the Cloud Build documentation. Here is an overview: each step runs a Docker container. In the example above we have 3 steps and use 2 builder images. The first step builds a Docker image using the Dockerfile that we have in the repository. The second step pushes it to the Docker registry, and the third one initiates the deploy to the GKE cluster. At the end of the job, the images listed under “images” are pushed to the Docker registry as build artifacts (note that the explicit push in the second step is only for the image tagged “$TAG_NAME”; we need it in the registry before the deploy step so that the deployment succeeds immediately).

This flow can be extended in a variety of ways, like checking whether the deployment succeeded or sending notifications.
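As a rough illustration of the first kind of extension, a fourth step could wait for the rollout to complete and fail the build if it does not. This is only a sketch and not a step from our actual file: it reuses the deployment name, zone and cluster from the example above, and the 120 second timeout is an arbitrary value chosen for illustration.

```yaml
# Hypothetical extra step, appended after the "set image" step.
# "kubectl rollout status" blocks until the rollout finishes and exits
# non-zero on failure or timeout, which marks the build as failed.
- name: 'gcr.io/cloud-builders/kubectl'
  args:
  - 'rollout'
  - 'status'
  - 'deployment/grandmaster'
  - '--timeout=120s'   # assumed value, tune to how long your pods take to start
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=us-central1'
  - 'CLOUDSDK_CONTAINER_CLUSTER=heili-us-central1'
```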

Google Cloud Build supports builds that are triggered manually or on a Git (GitHub and GitLab are supported) commit or tag. Triggered builds have special variables (substitutions) that can be used in the flow (like “$TAG_NAME” or “$COMMIT_SHA”). As you may have noticed, we were using Git tags to manage our deploys.
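To make the use of these variables a bit more concrete, here is a small sketch of a build step that records the commit in the image itself. It is not taken from our real configuration: the label name “git-commit” is made up for the example, and it assumes the build was started by a trigger, so that “$TAG_NAME” and “$COMMIT_SHA” are filled in.

```yaml
steps:
# Build the image for the tag and attach the full commit hash as a label.
- name: 'gcr.io/cloud-builders/docker'
  args:
  - 'build'
  - '-t'
  - 'gcr.io/$PROJECT_ID/grandmaster:$TAG_NAME'
  - '--label'
  - 'git-commit=$COMMIT_SHA'   # hypothetical label name
  - '.'
```

The label can later be read back with “docker inspect”, which helps when you need to know which commit a running image came from.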

Not everything is gold

With this flow in place, each new tag we add is automatically deployed. But (and there is always a but) we had two big issues.

  • Every time we added a new project, we had to configure a new trigger in GCP and connect it to the GitHub project.

  • Sometimes Cloud Build ignored GitHub hooks for new notifications and did not clone the repository to get the new tags.

While the first is something we can live with, the second one was pretty annoying.

The rescue came from the Google Cloud Build GitHub application, which can automatically add all repositories (or only specific ones, chosen manually in the GitHub console). So far it has never missed a trigger. The downside is that it is triggered on every commit to every branch and that cannot be changed, so no more tags.
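Without tags, one possible way to keep images identifiable is to tag them with the short commit hash instead. The fragment below is only a sketch of that idea under the assumption that “$SHORT_SHA” is populated for builds started by the GitHub application; it is not necessarily the configuration we ended up with.

```yaml
steps:
# Assumption: $SHORT_SHA is available for commit-triggered builds.
- name: 'gcr.io/cloud-builders/docker'
  args:
  - 'build'
  - '-t'
  - 'gcr.io/$PROJECT_ID/grandmaster:$SHORT_SHA'
  - '.'
# Push the commit-tagged image to the registry when the build finishes.
images:
- 'gcr.io/$PROJECT_ID/grandmaster:$SHORT_SHA'
```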

In the next part I’ll show our real-life cloud builds, with more advanced usage of Google Cloud Build functionality, and how we are dealing with multiple environments.