Open Credo

May 31, 2018 | DevOps

Self-testing infrastructure-as-code

As traditional operations has embraced the concept of code, it has benefited from ideas already prevalent in developer circles, such as version control. Version control brings the benefit that not only can you see what the infrastructure was at any point, but you can also have changes reviewed by your peers before they are made live, a practice known to most developers as Pull Request (PR) reviews.


Will May


However, because PR reviews rely on fallible humans to pick up on small details, developers typically combine the review with CI, where the code is built and automated tests are run before the changes are allowed to merge, also known as ‘self-testing code’. In this blog post, I’ll demonstrate how similar testing ideas can be applied to infrastructure code, creating ‘self-testing builds’.

OpenCredo have long been proponents of the need to test your programmable infrastructure. This blog post goes a step further by providing a concrete example of what this looks like in practice: an accompanying example repository contains the code necessary to create a Vault cluster within AWS, along with various tests. The repository contains the Packer code to build an AMI and the Terraform code that uses the AMI to build a Vault cluster. The contents of this example repository will be referred to throughout this blog post.

Benefits of self-testing builds

While most developers will be familiar with the benefits of self-testing builds, it’s worth repeating them for those who have not applied them to operational concerns:

  • Provide feedback on the code through code smells; if something is difficult to test, it probably means that the code needs to be rewritten.
  • Provide faster feedback on whether a change works properly or not.
  • Provide confidence that the system functions as expected.
  • Prevent functionality from being accidentally removed or broken.
  • Make patches, such as security updates, easier to apply, thanks to the confidence that can be gained before having to make potentially disruptive changes to any live system.

The lack of self-testing builds in DevOps can also lead to problems which are unique to the handling of infrastructure, such as the loss of quorum if a Consul cluster is broken while changes are being applied, or the destruction of a database.

Self-testing builds

In an ideal developer project, the construction and release of any build artefact is controlled so that an artefact can only be released once it has gone through testing. In a typical Java or JavaScript project, for example, tools such as Maven or NPM are often used to enforce this process, so that the developers’ workflow is eased through automation and they are able to enjoy the full benefits listed above.

To gain these same benefits in the DevOps world, the first step is to have a tool which automates the process; the venerable Make tool has been used in the example repository for this purpose. The second step is to write tests to verify that the infrastructure-as-code works as expected, for example that applying Terraform changes to an environment doesn’t break a service. To fully realise the benefits listed above, these tests should cover all code, including branches and error handling, and will typically fall into two categories: ‘unit’ and ‘integration’.


Unit testing

A unit test is usually defined as testing isolable parts of the code individually, without interacting with other parts. The purpose of a unit test is to be able to quickly validate the functionality of the unit. This is done by ensuring that the test has control, through mocking, over any other units that interact with the unit being tested, giving confidence over areas of the code where things start getting complex.

Identifying which units are suitable for testing, or which provide genuine value when tested, takes experience; whilst a unit test can give you great confidence that the unit is doing the correct thing, its interaction with other units within the system may not be correct.
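To make the idea of controlling a unit’s collaborators concrete, here is a minimal, hypothetical Ruby sketch (none of these classes come from the example repository): a NodeReplacer decides whether a node should be replaced, and the test isolates it by substituting a stub for the real health-check system.

```ruby
# Hypothetical unit under test: decides whether a node should be
# replaced, based on answers from an injected health-check collaborator.
class NodeReplacer
  def initialize(health_checker)
    @health_checker = health_checker
  end

  # Replace a node only when the health checker reports it unhealthy.
  def replace?(node)
    !@health_checker.healthy?(node)
  end
end

# A hand-rolled stub stands in for the real health checker, giving the
# test full control over the answers the unit under test receives.
class StubHealthChecker
  def initialize(healthy)
    @healthy = healthy
  end

  def healthy?(_node)
    @healthy
  end
end

# The unit can now be exercised in isolation, without a real cluster.
puts NodeReplacer.new(StubHealthChecker.new(false)).replace?('i-123456')  # => true
puts NodeReplacer.new(StubHealthChecker.new(true)).replace?('i-123456')   # => false
```

Because the stub is injected, the test never touches real infrastructure, which is precisely what makes it fast; the trade-off, as noted above, is that it says nothing about whether the real health checker behaves as the stub does.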

Within DevOps

From a DevOps perspective, unit testing could be applied to testing a single machine image, custom scripts, or whether a Docker image has been constructed correctly. Unit testing in DevOps has the problem that some areas are either difficult to isolate or wouldn’t provide sufficient value if unit tested; for example, unit testing Terraform code would provide little value because, in the absence of complex logic, the tests would only be able to assert that the Terraform code contains what it is supposed to contain.


In the example repository, the only part unit tested is the AMI that is used to spin up the Vault cluster. The tests for this unit will ensure that Vault is installed correctly and verify any other requirements.

To run the tests, first build the AMI using Packer, and then use Terraform to spin up temporary infrastructure to allow serverspec tests to be run against the newly started AMI.

$ cd packer
packer$ packer build vault.json
packer$ cd tests
packer/tests$ terraform apply -auto-approve -var "artifact_under_test=ami-123456" -var "unique_identifier=$(whoami)" -var "ip_address=$(curl -f -s <ip-lookup-url>)"
packer/tests$ terraform destroy -var "artifact_under_test=DELETING" -var "unique_identifier=DELETING" -var "ip_address="

(Substitute <ip-lookup-url> with the URL of a service that returns your public IP address.)
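The serverspec tests themselves assert properties of the running instance, such as which binaries and versions are installed. As a flavour of the kind of check involved, here is a hypothetical helper (not taken from the repository) that validates the output of `vault version`:

```ruby
# Hypothetical helper: extract the installed Vault version from the
# output of `vault version`, raising if the output is unrecognisable.
def vault_version(output)
  match = output.match(/^Vault v(\d+\.\d+\.\d+)/)
  raise "unrecognised `vault version` output: #{output.inspect}" unless match
  match[1]
end

# On the instance under test this string would come from running
# `vault version`; a sample stands in for the real command output here.
sample = "Vault v0.10.1 ('abcdef123456')"
puts vault_version(sample)  # => 0.10.1
```

A check like this catches an AMI that was built with the wrong Vault version before any Terraform code ever depends on it.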


Integration testing

An integration test is where multiple units of the code are pieced together and then tested as one. An ideal integration test typically involves piecing the code together so that it’s as production-like as possible, without including any external dependencies.

The purpose of integration testing is to gain confidence in how the individual units interact with each other, as they would do in a production-like environment. It also provides testing coverage of units that may have been seen as too simple to be worth unit testing.

Within DevOps

With DevOps, integration testing gives the ability to gain greater confidence that a change being made to an environment won’t break anything before the change is applied to production, or to a system effectively operating as production.

A basic DevOps integration test would involve spinning up the infrastructure from the new code and then verifying that the service the environment provides, such as a database, VPC or Vault cluster, functions correctly.

A more advanced DevOps integration test would aim to include temporal tests: tests that ensure the service functions correctly both while and after the changes are applied, so that there is no loss of data or availability, temporary or permanent, while the upgrade is occurring. This would involve first spinning up a new production-like environment and then applying the changes to that environment, all while verifying that the service still functions correctly.


In the example repository, there are two integration tests under tests/ which will test the Vault Terraform code to give confidence that the changes can be safely applied to production without experiencing a loss of data or availability.

The first test, ‘should successfully spin up a brand new infrastructure’, is a typical integration test which ensures that the infrastructure can be spun up from a clean slate. This might be exercised in disaster recovery of production, when creating a new test environment or, most frequently, by the second integration test.

The second test, ‘should successfully upgrade existing infrastructure’, is designed to ensure that the Terraform changes that will be applied to production can be applied safely, without destroying any data or causing a loss of service as the changes are applied; i.e. it is a temporal integration test. This is done by first creating a production-like environment, using the latest tested machine image, and then applying the changes to the Terraform code contained within the repository, while continuously monitoring the service to ensure it stays functional. Vault is monitored by spinning up a separate thread which continuously attempts to read something that was previously stored in Vault, as shown below.

class VaultHelper < AsgHelper
  def start_monitor
    @failures = 0
    @thread = Thread.start { monitor }
  end

  def verify
    puts 'Verifying Vault'
    vault = secure_vault
    # read back the secret written earlier (Vault Ruby gem API)
    actual_secret = vault.logical.read('testing/test').data[:value]
    raise "Stored secret incorrect! Actual: #{actual_secret}, expected: #{@random_string}" \
      unless actual_secret == @random_string
  end

  def monitor
    loop do
      begin
        verify
        sleep 5
      rescue StandardError => e
        puts "Failed when monitoring Vault: #{e}"
        @failures += 1
      end
    end
  end
end

To run the tests, first fill out the tf-vars.json file and then run rake in the tests/ directory with the relevant environment variables: TF_PROD_VARS_DIR is the directory containing tf-vars.json for the production-like environment, and TF_PROD_ENVS_DIR is the envs/ directory from the production version of the code.

$ cd tests
tests$ TF_PROD_VARS_DIR=/prod/version/of/repo/tests/spec TF_PROD_ENVS_DIR=/prod/version/of/repo/envs bundle exec rake
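The Rakefile wiring itself is not reproduced in this post, but a sketch of how the two specs might be ordered looks like the following (task names are illustrative, not necessarily those used in the repository): the clean-slate spec runs before the upgrade spec, since the upgrade spec depends on being able to build a fresh production-like environment.

```ruby
require 'rake'
include Rake::DSL  # make the `task` DSL available outside a Rakefile

STEPS = []  # records execution order; real tasks would shell out to rspec

task :new_infrastructure_spec do
  STEPS << :new_infrastructure  # 'should successfully spin up a brand new infrastructure'
end

# The upgrade spec declares the clean-slate spec as a prerequisite,
# so rake always runs them in the right order.
task :upgrade_spec => :new_infrastructure_spec do
  STEPS << :upgrade             # 'should successfully upgrade existing infrastructure'
end

Rake::Task[:upgrade_spec].invoke
puts STEPS.inspect  # => [:new_infrastructure, :upgrade]
```

Expressing the dependency in the task graph, rather than in documentation, means nobody can accidentally run the upgrade spec against a stale environment.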

Putting it all together

Now that we have a number of tests we can run to ensure any changes aren’t going to break anything, we need a way to run through these steps automatically; in the example repository, this is handled by a simple Makefile. The Makefile allows anyone making changes to the repository to easily check that nothing is broken, and it can also be used on a CI server to give feedback on any changes being made. Note that when running in CI, a second copy of the repository will need to be checked out to represent the ‘production’ version of the code, to facilitate the second integration test.


In this blog post, we showed how creating self-testing builds for infrastructure-as-code is simple to achieve using Ruby; other languages could also be used, such as Go with the recently released Terratest library. By adding these tests, we gain confidence when making further changes to the environment.


This blog is written exclusively by the OpenCredo team. We do not accept external contributions.


