This post spawned from an assignment I had during my Master's degree. We were to describe how a system could be developed using continuous integration (CI). So let us start by defining continuous integration. Wikipedia uses the following definition:
the practice of merging all developer working copies to a shared mainline several times a day
This means that what you develop ends up in the main branch during the day. You do not sit with code on your own PC for two weeks and then start merging, as this often causes a mess. Larger features or very big changes might still be developed in a separate branch, but it has to be a major change before this is viable. When doing CI there should be little reason not to check in every day, since the code is verified by the CI server. We will get back to this.
Since you are here, I believe I do not have to convince you why CI may or may not be a good idea. You probably wish to implement it somewhere, and just need some advice on how to get started. Let us go over some of the needed tools and mindsets for CI.
Yes, in order to do CI you need to use source control. A popular one is Git, but any source control system can be used. The source control system makes it easy to share and merge code. It also contains a history of changes, which makes it easy to track changes and go back (revert). Most important to CI is that it contains the "newest" (main) version of the code. This is the codebase that we wish to integrate with. Every time we push a new change we wish to merge it with the current version. This gives us a small "increment" that we can then deploy (once we have set up our CI and continuous delivery).
Quality tests and stability
In order to reap the benefits of CI there are several things you need to have in place. First of all, your team must be able to write quality tests. This makes sure that you have a stable codebase to build upon. When you integrate often (do CI) you wish to know when something breaks. CI in itself does not include any manual testing (although it does not exclude manual tests later in the process). So in order to make sure you do not break anything when you push your code, you need automated tests. Therefore, before you can begin with CI you need to have a stable codebase. If you are not there yet, you could write tests as you go: every time you are about to make a change, you first cover the old code in tests so that you know how it works. Then you begin refactoring (changing) the code. A great book on refactoring is Martin Fowler's Refactoring.
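As a minimal sketch of "covering the old code in tests" before refactoring, the Python example below pins down the current behaviour of a small function. The function `legacy_price`, its discount rule, and the helper `run_tests` are all hypothetical and only serve as illustration; your own code and test framework will look different.

```python
import unittest


def legacy_price(quantity, unit_price):
    """Hypothetical legacy function that we want to refactor later."""
    total = quantity * unit_price
    if quantity >= 10:
        total = total - total * 0.1  # bulk discount buried in the old code
    return round(total, 2)


class LegacyPriceTest(unittest.TestCase):
    """Pins down the current behaviour before any refactoring starts."""

    def test_small_order_has_no_discount(self):
        self.assertEqual(legacy_price(2, 5.0), 10.0)

    def test_bulk_order_gets_ten_percent_off(self):
        self.assertEqual(legacy_price(10, 5.0), 45.0)


def run_tests():
    """Run the tests above and report whether all of them passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LegacyPriceTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Once tests like these pass, you can change the inside of `legacy_price` and rerun them. If they still pass, the refactoring most likely did not change the behaviour that the tests describe.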
You most likely thought that you would begin CI by setting up software and a server. But there are many disciplines that need to be followed first.
You may argue that this section should come after the implementation of a CI server. However, knowing what you are going into is important. It is also important to try to adopt the mindset as soon as possible. Having some developers use the CI server while others do not is a disaster. The same is true for the mindset that is needed when doing CI.
There are a couple of things that the developers need to adhere to when doing CI. First of all, you wish to have small increments (check-ins). You do not wish to save up several weeks of code and then push it. We have all pushed our changes late on a Friday and because of that had to stay an extra hour (or weekend?). Small increments are easier to merge and produce smaller merge conflicts, which are easier to solve. And when you do CI, the CI server will let you know if you broke something (as long as your code is covered by tests).
The build will fail from time to time. This happens. The most important thing is to fix it as soon as possible. When you are doing CI you wish to be able to deploy fast. If the build is broken and you find out that you have a bug in production, then you cannot deploy a fix until the build is green again. Therefore it is the number one priority of the team to fix the build - keep it green!
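The "keep it green" rule can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `BuildResult`, `can_deploy` and `breaking_revision` are hypothetical names, and a real CI server keeps this history for you.

```python
from dataclasses import dataclass


@dataclass
class BuildResult:
    """Outcome of one CI run (hypothetical structure for illustration)."""
    revision: str
    passed: bool


def can_deploy(history):
    """Allow a deployment, hotfixes included, only when the latest build is green."""
    return bool(history) and history[-1].passed


def breaking_revision(history):
    """Return the revision that started the current red streak, or None if green."""
    culprit = None
    for build in reversed(history):
        if build.passed:
            break
        culprit = build.revision
    return culprit
```

With a history of one green build followed by two red ones, `can_deploy` returns `False` and `breaking_revision` points at the first of the two red builds, which is where the team should start looking.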
At some point you will feel very confident deploying your code and making changes (refactoring), because the build server will let you know if your code breaks something. This lets you deploy more often.
The CI server
Here we go. Now that we know what we are going into, we can start looking for a CI server. It is known by many other names, such as continuous integration system/server, build server, deployment server and so on. Here I will use the term "CI server" - even though I mostly say build server in my daily work. I will go over its capabilities in this section. The CI server has four main capabilities:
- Fetch the latest increment: The CI server needs to be able to get the latest increment - which is the newest version of the codebase.
- Compile and build: When the increment has been fetched (from your source control) it is built. This step might be larger or smaller depending on your programming language. Some languages are not compiled, which makes this step unnecessary or very small.
- Run automated tests: When the code has been fetched and is ready to be run, we run our automated tests against it.
- Deployment: If the code passes the tests it is deployed.
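The four capabilities above can be sketched as a tiny pipeline runner in Python. This is only an illustration of the idea and not how any particular CI server works. The commands in `EXAMPLE_STEPS` are assumptions, and `deploy.py` is a hypothetical script that you would have to write yourself.

```python
import subprocess
import sys


def run_step(name, command):
    """Run one pipeline step; return True when the command exits with code 0."""
    print(f"[{name}] {' '.join(command)}")
    try:
        completed = subprocess.run(command)
    except OSError:
        # The executable was not found or could not be started.
        return False
    return completed.returncode == 0


def run_pipeline(steps):
    """Run the steps in order and stop at the first failure.

    Returns the name of the failed step, or None when every step passed.
    """
    for name, command in steps:
        if not run_step(name, command):
            return name
    return None


# Example commands only; replace them with whatever your project uses.
EXAMPLE_STEPS = [
    ("fetch", ["git", "pull", "--ff-only"]),
    ("build", [sys.executable, "-m", "compileall", "-q", "."]),
    ("test", [sys.executable, "-m", "unittest", "discover"]),
    ("deploy", [sys.executable, "deploy.py"]),  # hypothetical deploy script
]
```

Calling `run_pipeline(EXAMPLE_STEPS)` inside a repository would run the steps one after another. Because the runner stops at the first failing step, code that does not pass the tests never reaches the deploy step.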
Finding the CI server for you
There are several different CI servers to be found. Here I have created a list of CI software. The list is based on what I believe to be the more popular automation tools:
- Jenkins: Probably the most popular on the list and the only one I have never used. Jenkins is open source, written in Java, and used for all kinds of development stacks.
- VSTS/TFS: The obvious choice for anyone using the Microsoft stack. Team Foundation Server (TFS) and Visual Studio Team Services (VSTS) are a lot alike. However, VSTS is cloud based while TFS is self-hosted with its own SQL Server, meaning it takes more to set up and maintain. If you are into .NET and MSSQL then TFS or VSTS will be your first choice, as this is where the biggest Microsoft community is.
- Bamboo: Who has not used one of the following: Jira, Confluence, HipChat, Bitbucket, Sourcetree or Trello? Atlassian has made some great products for development teams for as long as I can remember. If you are already on the Atlassian stack you should consider using Bamboo, as it integrates easily with the rest of that stack.
- GoCD: About a year ago I learned about Go continuous delivery (GoCD). It is a product created by ThoughtWorks (who also created CruiseControl). What I really like about GoCD is the simple design and the pipelines that are easy to set up. However you may ask, why GoCD?
Something that cannot be stressed enough is that getting all this done takes time! It can be a huge time investment to get started on CI. Do not underestimate this. But the rewards are there if you get everything set up and people start using it. Once you have found and set up your CI server, you need to configure it. Often you can find scripts online, or the server comes with deployment scripts. Besides pointing to the right servers, you may need to find scripts that can run the unit testing framework you use or fetch the code from the source control system you have. In short, there is a ton of configuration to be done. You may feel that you are spending too much time writing code (scripts) for your CI server instead of actual production code. But this is of course a necessary part of the work.
Here is a quick summary of the steps mentioned above:
- Source control: Have the code you wish to integrate under source control.
- Establish a stable baseline: Have a stable and tested codebase.
- Create a CI mindset: Make everyone ready for CI (knowledge sharing and creating the needed mindset).
- Find CI software: Find the CI software you wish to use.
- Set up and configure the CI server: Set up the server and configure it. You often end up creating your own scripts, or your own version of existing scripts, for use in your CI.
- Start using the CI server: I would not advise migrating everything you have to the new CI environment as soon as it is set up. Start by getting used to it and build on it slowly. When confident, move everything reasonable.