What are the advantages and disadvantages of continuous integration?

This post grew out of an assignment I had during my Master's degree, where we were to describe how a system could be developed using continuous integration (CI). I had been gathering research for a post on CI for a while, so after the assignment I decided to collect my notes and put them into blog post format. Let us start by defining continuous integration. Wikipedia uses the following definition:

the practice of merging all developer working copies to a shared mainline several times a day

In other words, you commit your code and integrate it with the main branch often. In this post I will present some advantages and disadvantages of CI. These merits and flaws are drawn from my own experience. I use CI in my professional career and have almost always used it (hint: my post might be biased).

In this post I use the term "CI server". It goes by many names, such as build server or deployment server. Here I am referring to software such as Jenkins, Bamboo or TFS. Let us take a look at some advantages:

Advantages

The first thing on the list of advantages is "automation of deployment". Manually moving code between environments is tedious and error-prone. When the newest build is deployed automatically, there is little chance of errors in the deployment. You may have to push a button to promote it between environments, but the rest is taken care of. The CI server also "handles different environments and configurations". If you were to move your code manually, you would have to make sure you did not carry configuration across by mistake, for example by pointing production at a test database or vice versa. The CI server just needs the different configurations, or needs to know where to find them. Again, this reduces the amount of manual work.
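To make the configuration point a bit more concrete, here is a minimal sketch in Python of how a deployment step could pick its settings from the target environment instead of relying on someone editing them by hand. The names (EnvironmentConfig, CONFIGS, deploy) and the connection strings are made up for illustration; a real CI server has its own way of storing this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentConfig:
    name: str
    database_url: str


# Hypothetical environments and connection strings, for illustration only.
CONFIGS = {
    "test": EnvironmentConfig("test", "postgres://test-db.example/app"),
    "production": EnvironmentConfig("production", "postgres://prod-db.example/app"),
}


def config_for(environment: str) -> EnvironmentConfig:
    """Return the configuration for the target environment, or fail loudly."""
    try:
        return CONFIGS[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None


def deploy(build_id: str, environment: str) -> str:
    """Pretend to deploy a build; a real CI server would copy artifacts here."""
    config = config_for(environment)
    return f"deployed {build_id} to {config.name} using {config.database_url}"


if __name__ == "__main__":
    print(deploy("build-42", "test"))
    print(deploy("build-42", "production"))
```

The build itself never contains a database address. The only input is the name of the environment, so the same build cannot end up in production with the test settings.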

By practising CI we are also more likely to "integrate smaller increments", and therefore to integrate more often. You do not sit on code for several weeks and then push it all to the main branch at the end. You push your changes often, which keeps merge conflicts small and manageable. There is no big bang here. This also forces everything to be "under source control", which has several advantages in itself: you get history, the possibility of reverting changes and an easy way to merge your changes with those of others.

You also get the advantage of "releasing often". Small changes can go into production fast, as long as they pass your checkpoints. In some CI setups you might not even notice that your new change goes to production. Unless you break the build, of course, in which case the increment will never reach production. This is another great feature of the CI server: you will know when you break something. It could be the compilation of the code that fails, but you are also warned about failing tests.

Automated tests are important in CI. There are many different kinds of tests; the most widely known are unit and integration tests. Your CI server should run these automatically and let you know the results. If your code passes the tests, it is ready to be deployed. Because we have small increments and a large test suite, we have great confidence that our changes work and do not break anything. If they do, we will be notified. Writing tests will most likely also result in the code being more SOLID. We can also have great confidence in our deployment, since it is done the same way every time. No humans deploy changes manually (and we often do it slightly differently each time), which makes it less prone to errors.
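The idea that a broken build or a failing test stops the increment from reaching production can be sketched in a few lines of Python. This is not how any particular CI server is implemented, just a toy version of the gating. The function run_pipeline and the stage names are my own invention.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first one that fails."""
    log: List[str] = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            # Later stages (such as deploy) are never started.
            return False, log
    return True, log


if __name__ == "__main__":
    stages = [
        ("compile", lambda: True),
        ("unit tests", lambda: False),  # simulate a failing test
        ("deploy", lambda: True),
    ]
    success, log = run_pipeline(stages)
    print(success)  # False
    print(log)      # ['compile: ok', 'unit tests: FAILED']
```

Because the deploy stage comes last, it only runs when everything before it has passed. The log is what the CI server would use to notify you about what broke.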

Another type of quality measurement (besides tests) is static analysis. Many CI servers can analyse the codebase and give you some metrics. This is often the same information you would get from your IDE, or Integrated Development Environment (Visual Studio, IntelliJ IDEA, Eclipse etc.): it finds anomalies in the code, such as unused variables and unreachable code.
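To give a feel for what such a check does, here is a small sketch that uses Python's standard ast module to flag a statement that directly follows a return, raise, break or continue in the same block. It is a deliberately naive check that I wrote for illustration (the function name unreachable_lines is made up). Real analysers do far more than this.

```python
import ast
from typing import List

# Statements after which nothing else in the same block can run.
TERMINATORS = (ast.Return, ast.Raise, ast.Break, ast.Continue)


def unreachable_lines(source: str) -> List[int]:
    """Return line numbers of statements that directly follow a terminator."""
    tree = ast.parse(source)
    found: List[int] = []
    for node in ast.walk(tree):
        # Statement blocks live in these fields on functions, loops, ifs, etc.
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue
            for previous, current in zip(block, block[1:]):
                if isinstance(previous, TERMINATORS):
                    found.append(current.lineno)
    return sorted(found)


if __name__ == "__main__":
    sample = "def f(x):\n    return x\n    print('never runs')\n"
    print(unreachable_lines(sample))  # [3]
```

A CI server would run a check like this on every commit and either report the findings as metrics or fail the build, depending on how strict you want to be.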

We are now at the last point on my list of advantages, which is "one truth". When you build, deploy and control everything through the CI server, it becomes the one truth. You get the full history there and there is no "works on my machine". If it fails on the build server, something is wrong, period. It also lets you know what was deployed, when and where, at any given time. This overview makes it much easier to track down errors.
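The "what, when and where" part boils down to keeping a record of every deployment. As a rough sketch, assuming nothing about how a real CI server stores this, it could look like the following. Deployment and DeploymentLog are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    build_id: str          # what was deployed
    environment: str       # where it was deployed
    deployed_at: datetime  # when it was deployed


class DeploymentLog:
    """Append-only record of deployments, in the order they happened."""

    def __init__(self) -> None:
        self._entries: List[Deployment] = []

    def record(self, build_id: str, environment: str,
               deployed_at: Optional[datetime] = None) -> Deployment:
        entry = Deployment(build_id, environment,
                           deployed_at or datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def current(self, environment: str) -> Optional[Deployment]:
        """Return the most recently recorded deployment for an environment."""
        for entry in reversed(self._entries):
            if entry.environment == environment:
                return entry
        return None


if __name__ == "__main__":
    log = DeploymentLog()
    log.record("build-41", "production")
    log.record("build-42", "test")
    print(log.current("production").build_id)  # build-41
```

When an error shows up in production, a lookup like current("production") tells you which build to start looking at.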

Disadvantages

An advantage that can also be seen as a disadvantage is the CI mindset. If no one on your team has worked with CI before, it might be tough to adapt. Suddenly they have to deploy through the CI server, and some feel a loss of control when doing this, because what actually happens after the check-in is hidden behind automation. The build might also break on the server but not locally, which creates great frustration for some (works on my machine, eh). These are things you need to overcome, and the whole team needs to embrace doing things in a new way. Everyone has to be on board. Another example would be someone on the team deploying a change manually (unscripted). The next time the CI server deploys automatically, it will overwrite this change.

A definite disadvantage is that the CI server needs maintenance. It has to be updated to get new features, disks get full, or the user running the server needs new permissions. More servers and software always mean more maintenance. Another negative thing about the CI server is that it becomes a "single point of failure". If your CI server stops working, everything stops. You can still work on your local PC, but you cannot deploy. If you need to fix something urgent in production, you have a problem. It might not even be the CI server itself that fails, but one of the many things it depends on. For example, the file share where it places the build may be unreachable, or it may be unable to pull the latest increment (for some reason).

The biggest nuisance about CI (in my experience) is the amount of time invested in creating scripts and maintaining them. These scripts contain code that needs to be maintained and developed, just like your regular codebase. Often you can find off-the-shelf scripts that you can use directly. But just as often you will want to make small adjustments to them, and suddenly they are not off-the-shelf anymore.

Summary

Advantages

  • Automation of deployment
  • Handling of environments and configurations
  • Integrates small increments
  • Requires everything to be under source control
  • Makes it easy to release often
  • You will know when you break something
  • Automated tests (testable codebase)
  • Greater confidence when making changes
  • Deployed the same way every time
  • Static analysis
  • See what was deployed, when and where

Disadvantages

  • Requires everyone to be on board (CI mindset)
  • Needs maintenance
  • Single point of failure
  • Time investment in development of build/deployment/test scripts

That's it

When I was writing the above, I was about to put time consumption under disadvantages. But I see it more as an investment that gives returns. Nevertheless, you should not underestimate the time it takes to get started on this!

As you can see, I find more advantages to CI than disadvantages. I also see more and more teams adopting CI. I myself have deployed very few applications by hand in my professional career. Almost every project I have been on has had automated build, deployment and testing. Those that did not were often a huge pain to adjust to (for me at least). CI also makes the most sense when you work in a team rather than as a lone programmer, even though you can still do it alone, with some gains.

I hope you liked the post. If I missed anything, let me know in the comments!