Immutable infrastructure is an approach to DevOps that leverages immutability. Immutability is a concept that has become common in the software development world. Originally a major component of functional programming, it has been adopted by many languages and software frameworks as a means of controlling complexity and reducing bugs, especially in concurrent systems. Many of the same techniques can be applied to the DevOps world.

Before we can properly dive into how immutable infrastructure works, let's talk about standard operations workflows based on mutable infrastructure.

Mutable infrastructure

Let's describe a common, non-cloud deployment scenario for an application. Typically, a sysadmin will either requisition a physical server or rent a Virtual Private Server to host the software. The sysadmin will install the base OS, log in remotely, configure the relevant services, copy over the application, and run it. A well-disciplined sysadmin will additionally take detailed notes on this process.

In the future, when a new version of the application becomes available, the sysadmin will once again log in, download the updated application, and run the new version. Again, a disciplined sysadmin will take notes on how this upgrade goes. A similar process applies to updating packages on the base OS, such as for security updates.

Later, when the load on the server increases, the sysadmin will set up a second machine, following the same procedure as before. Ideally, those detailed setup notes will now come in handy, and an almost-identical machine will be ready to pick up some of the incoming requests.

Problems with mutability

Unfortunately, this approach leaves a lot to be desired.

Some of these problems can be mitigated with tooling like configuration management systems. These systems try to provide a declarative interface for specifying the state you want the machine to be in, and they perform whatever installations or upgrades are necessary to get to that point. This helps, but it doesn't address every problem. These systems also have a tendency to introduce their own instabilities into your system, leading to costly downtime.
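To make the declarative idea concrete, here is a minimal sketch in Python. It is not the interface of any real configuration management tool: the `plan` and `apply_actions` functions, and the idea of modeling a machine as a mapping from package name to version, are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model: a machine's state is just {package name: version}.
State = Dict[str, str]
Action = Tuple[str, str, str]  # (verb, package, version)


def plan(current: State, desired: State) -> List[Action]:
    """Compare current and desired state and list the actions needed."""
    actions: List[Action] = []
    for name, version in sorted(desired.items()):
        if name not in current:
            actions.append(("install", name, version))
        elif current[name] != version:
            actions.append(("upgrade", name, version))
    return actions


def apply_actions(current: State, actions: List[Action]) -> State:
    """Return the new state after the actions run (no real installs happen)."""
    new_state = dict(current)
    for _verb, name, version in actions:
        new_state[name] = version
    return new_state
```

Running `plan` against a machine that already matches the desired state yields an empty list, which is the property these tools aim for. On a long-lived, hand-modified machine, the real `current` state is often not what anyone expects, and that is where the instabilities tend to come from.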

The cloud revolution

Moving to cloud computing changes the way we look at servers. In the scenario above, a server is an investment. We spin it up, we set it up, we care for it, we repair it, and so on. Not so in the cloud world. In the cloud, a server should be a cheap, replaceable commodity. The cloud focuses on programmatic automation of resources. Starting or stopping a machine isn't a notable event; it's a normal part of day-to-day administration.

In the cloud world, it's common to use automation tools that create machines for us. Those tools may adjust the sizes of our clusters. Cloud services like Auto Scaling Groups will automatically increase a cluster size based on load, and in turn decrease the cluster size when that load disappears. And health checks will automatically shut down malfunctioning machines, to be replaced with fresh, healthy machines.
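The scaling and health check behavior described above can be sketched roughly as follows. This is a toy model in Python, not the API of any cloud provider: `desired_size`, `reconcile`, the per-machine capacity figure, and the `machine-N` naming scheme are all hypothetical.

```python
import math
from itertools import count
from typing import Dict, Iterator


def desired_size(total_load: float, capacity_per_machine: float,
                 min_size: int, max_size: int) -> int:
    """Pick a cluster size for the current load, clamped to the allowed range."""
    needed = math.ceil(total_load / capacity_per_machine)
    return max(min_size, min(max_size, needed))


def reconcile(machines: Dict[str, bool], target: int,
              ids: Iterator[int]) -> Dict[str, bool]:
    """Move the cluster to `target` healthy machines.

    `machines` maps a machine name to whether it passed its health check.
    Unhealthy machines are discarded rather than repaired.
    """
    healthy = [name for name, ok in machines.items() if ok]
    healthy = healthy[:target]             # scale in if there are too many
    while len(healthy) < target:           # scale out, or replace failures
        healthy.append(f"machine-{next(ids)}")
    return {name: True for name in healthy}
```

For example, `reconcile({"a": True, "b": False}, 3, count(1))` drops the unhealthy machine `b` and starts two fresh ones. Note that nothing in this loop logs in to a machine to fix it, which is why a new machine has to come up without human help.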

In order to make this all work, setting up a new machine must be fully automated. We can't require sysadmin involvement each time a new scaling or health check event occurs. Much of this can be performed with provisioning scripts, which can use the same configuration management systems mentioned above. However, provisioning at deployment time can fail intermittently, and it depends on external resources that may change over time.

Machine images

Instead, in our immutable infrastructure world, we like to rely upon machine images. Provisioning scripts take a vanilla, bare-bones OS installation and customize it at deployment time. Immutable infrastructure approaches move this setup to build time. We take some base OS image, run the setup scripts, and then capture the result into a complete machine image. This helps in many ways:

Typically, the creation of these machine images will occur in a Continuous Integration environment. The full battery of integration tests for the application can be run against this image. Once the image is vetted by these tests, and potentially by a manual Quality Assurance signoff, it can be uploaded into cloud storage, and the cluster can be moved over to the new machine image.
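As a rough illustration of this build-then-vet flow, the Python sketch below models a machine image as a frozen record whose ID is derived from its inputs. The `MachineImage` class, the `build_and_vet` function, and the hashing scheme are assumptions for the example; real image builders and cloud providers assign and store image IDs in their own ways.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass(frozen=True)
class MachineImage:
    """An image captured at build time; frozen, so it cannot be changed later."""
    base_os: str
    setup_steps: Tuple[str, ...]
    app_version: str

    @property
    def image_id(self) -> str:
        # Hypothetical ID scheme: hash the inputs, so the same inputs
        # always produce the same ID.
        payload = "\n".join((self.base_os, *self.setup_steps, self.app_version))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def build_and_vet(base_os: str, setup_steps: Iterable[str], app_version: str,
                  tests: Iterable[Callable[[MachineImage], bool]]
                  ) -> Optional[MachineImage]:
    """Build an image, run every test against it, and release it only if all pass."""
    image = MachineImage(base_os, tuple(setup_steps), app_version)
    if all(test(image) for test in tests):
        return image   # vetted: safe to upload and roll out
    return None        # rejected: the running cluster is never touched
```

The point of the sketch is the ordering: all setup work and all testing happen before any production machine sees the image, and a failed build leaves the existing cluster exactly as it was.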

This process can take a bit more work to set up initially than a provisioning step. And compared to the pre-cloud scenario described above, it's a significant mental shift. That said, once set up, an immutable infrastructure approach reaps large rewards in maintainability, responsiveness of the cluster, and human effort.

Docker images

With the popularity of containerization and orchestration tools like Kubernetes, a more lightweight version of the machine image approach is possible. Instead of creating a brand new machine image and requesting new cloud machines to deploy each update, you can use Docker images, which provide an immutable image format that runs on existing machines. Docker images can typically be created faster than machine images. It's possible to run multiple Docker containers on a single machine, allowing for an easier path to zero-downtime blue/green deployments. And with Kubernetes, you can efficiently pack multiple services onto a single node to reduce server costs. Tools like Minikube make it possible to test complex deployment scenarios on a local machine, speeding up your Quality team.
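The blue/green idea mentioned above can be reduced to a very small decision, sketched here in Python. The `Deployment` record and the `blue_green_deploy` function are hypothetical and do not call Docker or Kubernetes; in practice the health check and the traffic switch would be handled by your orchestrator or load balancer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Deployment:
    live: str                # image tag currently receiving traffic
    retired: Optional[str]   # image tag that was taken out of service, if any


def blue_green_deploy(live_image: str, new_image: str,
                      is_healthy: Callable[[str], bool]) -> Deployment:
    """Start the new image next to the live one and switch only if it is healthy."""
    if is_healthy(new_image):
        # Traffic moves to the new image; the old one can now be stopped.
        return Deployment(live=new_image, retired=live_image)
    # The new image failed its check; the old one keeps serving traffic.
    return Deployment(live=live_image, retired=new_image)
```

Because both images are immutable, rolling back is the same operation in reverse: point traffic at the previous tag, which is guaranteed to be exactly what was running before.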

The individual machines running the Kubernetes nodes can still rely upon immutable machine images to provide the host OS that will run the Docker images. However, these images can change less frequently, allowing you to keep your cloud machines over a longer period of time and reduce the churn of creating and testing those images.

Configuration management tools

It may seem that, in this world of immutable infrastructure, there is no room for configuration management tools. Their primary function is to handle the mutation of existing machines towards a specific state. However, when creating a machine or Docker image, we still need some way of configuring the base OS and installing additional software. It's possible to do this with simple shell scripts. However, configuration management tools can provide some advantages:

Fortunately, configuration management can work hand-in-hand with immutable infrastructure. You can use your existing scripts when creating your images. You won't be using the full power of configuration management, since you'll always be starting from a pristine image state. But you can look at this as a positive: you're less likely to end up in an indeterminate and buggy state.


Immutable infrastructure underlies much of what we do at FP Complete in our DevOps practice. If you want to learn more, check out our DevOps syllabus, training offerings, and our consulting services. To summarize the recommendations:
