Open Credo

December 18, 2012 | Software Consultancy

Withstanding the test of time

The first thing most people think of when they start a project with the good intentions of test-driven development is: write a test first. That’s great, and something I would fully encourage. However, diving into writing tests without forethought, especially on large projects with a lot of developers, can lead to new problems that TDD is not going to solve. With some upfront thinking (but not big upfront design!) a large team can avoid problems further down the line by considering some important and desirable traits of a large and rapidly changing test suite.


Gawain Hammond


All too often an eager team can code themselves into a cul-de-sac, creating huge, unmanageable test suites that are complex, perform terribly, and are a huge time sink to improve. Many projects follow all the best practices, such as dependency injection, and use the latest frameworks, yet still have issues growing their test suite beyond a certain size.

This post (and the following two posts) will discuss some common pitfalls I’ve witnessed teams facing when doing TDD on large and complex projects.

The full article has been split into three parts. Part One will lay the foundations for the next two parts and discuss common mistakes. Part Two will be about getting good performance from test suites. Part Three will bring all the ideas together and discuss how we can ensure value in our tests and use them to design better software.

Types of Tests
First, it’s worth covering exactly what types of tests we can write that will make a difference to software teams. Brian Marick is often credited with describing four types of tests, known as the agile testing quadrants, which were later popularised by Lisa Crispin and Janet Gregory in their book Agile Testing:

Q1.) Technology facing tests that support development
A suite of tests written, executed and maintained exclusively by developers. This will typically be the fastest-running suite of tests, and is used to give a very high, but not complete, level of confidence in deployed production code. Each test type is discussed in more detail below.

Q2.) Business facing tests that support programming
Tests in this quadrant test the functionality of the system and are considered ‘acceptance tests’. Non-functional tests are covered by the fourth quadrant. These types of tests ensure the business-defined acceptance criteria have been met by the software.

Q3.) Business facing tests that critique the project
Typically a manually executed suite of tests that will verify the application delivers the expected value to end users. This is not just about verifying that the deployed application meets its specifications; it is also about making sure the defined specifications are correct and make sense from the user’s perspective. Showcases are an example of this kind of test.

Q4.) Technology facing tests that critique the project
There are two categories of acceptance testing: functional and non-functional. Non-functional refers to the operational qualities of a system other than its defined functionality. These are qualities such as extensibility, security, fault tolerance and maintainability: the ‘-ility’ features. Personally, I prefer the term ‘quality attributes’ to ‘non-functional’, as it is a fallacy to think of these attributes of the system as not being something the business desires, and therefore not needing to be tested.

As this series of posts is aimed at developers, it will only discuss the tests that software developers will be most interested in: the tests that support programming, both technology facing and business facing. Below, some of the most relevant test types are discussed in more detail.

Unit Tests
We all know unit tests intimately (or should do). They test small isolated units of code, often just a single class. Unit tests will cover as many failure scenarios as possible, ideally using the Right-BICEP mnemonic [From Pragmatic Unit Testing]: are the results Right, Boundary conditions, Inverse relationships, Cross-checks, Error conditions and Performance. Unit tests inform us our code does the right thing, and that it’s convenient to work with.

A unit test is a developer’s chance to design code and experiment with the interfaces of objects, ensuring they are easy to use and provide clarity. Unit tests will not make remote calls to databases or HTTP web services.
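As a rough illustration of the points above, here is a minimal sketch of unit-style checks that cover a normal result, the boundaries and the error conditions of a small piece of pure logic. The `DiscountCalculatorSketch` class, its `applyDiscount` method and the validation rules are all hypothetical, invented for this example, and plain Java checks stand in for a test framework such as JUnit so the sketch has no dependencies.

```java
class DiscountCalculatorSketch {

    // Hypothetical unit under test: pure logic, no database or HTTP calls.
    static long applyDiscount(long pricePence, int percent) {
        if (pricePence < 0) {
            throw new IllegalArgumentException("price must not be negative");
        }
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        return pricePence - (pricePence * percent) / 100;
    }

    // Returns true if the call is rejected with an IllegalArgumentException.
    static boolean rejects(long pricePence, int percent) {
        try {
            applyDiscount(pricePence, percent);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    // Stand-in for a test framework assertion.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("failed: " + description);
        }
    }

    public static void main(String[] args) {
        // Right: the expected result for an ordinary input.
        check(applyDiscount(1000, 25) == 750, "25% off 1000 is 750");
        // Boundary conditions: both ends of the allowed percentage range.
        check(applyDiscount(1000, 0) == 1000, "0% leaves the price unchanged");
        check(applyDiscount(1000, 100) == 0, "100% reduces the price to zero");
        // Error conditions: invalid inputs are rejected.
        check(rejects(1000, 101), "percent above 100 is rejected");
        check(rejects(-1, 10), "negative price is rejected");
        System.out.println("all checks passed");
    }
}
```

Because nothing here touches a database or the network, checks like these run in milliseconds, which is what makes it practical to cover many failure scenarios at this level.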

Unit tests will cover a good 80% of the code paths in your system, but the speed of feedback comes at the cost of missing bugs that occur as a result of interaction between various sub-components within the application. Which leads us to…

Component Tests
Component tests is the name used here for what are often termed integration tests. As the term ‘integration’ is heavily overloaded, the term Component Tests has been coined [From Continuous Delivery 2011].

Component tests will verify that multiple classes interact as expected when wired together and, as current trends dictate, will often be Spring-based tests using the SpringJUnit4ClassRunner. These tests are often slower due to various factors, such as loading application contexts, communicating with remote services, or testing longer-running behaviour within the system.

Functional/Story/Acceptance Tests
For the sake of this discussion, these test types are similar enough from a development perspective to be treated as one. A functional test ensures that your deployed code (typically in a runtime-like configuration) provides some value to the customer.

A Story is a business-defined requirement giving the developer a starting state, a set of steps to replicate, and an expected outcome. These types of tests are typically very slow and can easily cross the boundaries of your system to remote systems. These tests are typically the most brittle.

Deployment Tests (or Smoke Tests)
The name comes from how a plumber uses smoke in a newly plumbed installation to find any leaks in the system. These tests will be run against a production (or clone) deployment of your project, and will report back on how well the integration points of your system work.

Deployment tests give you confidence that all the protocols and hosts have been correctly configured for a production-like environment.

All the tests together
A software project with a comprehensive test suite will have a mixture of these types of tests, and ideally will have mostly unit tests, a fair number of component tests, and fewer higher-level tests. All together, the ratios should look something like the test pyramid below. Of course, judgement plays a large part in deciding the right ratio, and there are no hard and fast rules, just general guidance.
Any of the tests below can also be Specification Tests. There is nothing stopping a Unit Test from using a Specification Test Framework such as JBehave or Concordion.

* UI/Smoke Tests 15%

* Component Tests 30%

* Unit Tests 55%

As tests become more high level they get slower and more brittle, so these types of tests should avoid testing every possible failure scenario, and instead test the happy path. The Happy Path is the simplest path through the system that shows something is working, i.e. the ideal scenario. As your tests get higher level you gain more confidence in the production readiness of your code and that it meets the needs of your customer.

Robust Tests
Some projects have tests that fail intermittently or after minor changes are made to the source code. There are potentially many reasons for this, though the most common are that the tests are too brittle and test behaviour too strictly, that they test the wrong thing, or that there is too much untested test code dealing with unrelated concerns. Each of these points is discussed below.

Brittle Tests
A brittle test is a test that fails after making minor changes to your code or another test. Some unit tests, especially with mocking/spying frameworks, test every single method call on every single dependency. This practice is brittle: it will rapidly pour quick-drying cement over your whole code base and leave you with no flexibility for future changes in requirements. I wouldn’t suggest it’s never appropriate to verify/spy, but use judgement. Using spying to test method calls can be testing implementation, not necessarily the desired behaviour. Ideally you want to treat your class under test as if it were a black box (the interface): given certain inputs, does it return the expected outputs? Most of the time you don’t care how the class gets its job done, unless it would be breaking some requirement to do so.
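To make the difference concrete, here is a small sketch contrasting a black-box check with an interaction-style check. Everything in it is hypothetical: the `TaxRates` interface, the hand-rolled `RecordingTaxRates` spy (standing in for a mocking framework) and the two pricing methods, which represent the same behaviour before and after a harmless refactoring.

```java
import java.util.ArrayList;
import java.util.List;

class BlackBoxVersusInteractionSketch {

    // Hypothetical dependency of the code under test.
    interface TaxRates {
        int percentFor(String category);
    }

    // Hand-rolled spy that records every call, standing in for a mocking framework.
    static final class RecordingTaxRates implements TaxRates {
        final List<String> calls = new ArrayList<>();

        @Override
        public int percentFor(String category) {
            calls.add(category);
            return "book".equals(category) ? 0 : 20;
        }
    }

    // Original implementation: looks the rate up once per item.
    static long totalPerItemLookup(TaxRates rates, String category, long[] netPrices) {
        long total = 0;
        for (long net : netPrices) {
            total += net + net * rates.percentFor(category) / 100;
        }
        return total;
    }

    // Refactored implementation: same result, but a single lookup.
    static long totalSingleLookup(TaxRates rates, String category, long[] netPrices) {
        int percent = rates.percentFor(category);
        long total = 0;
        for (long net : netPrices) {
            total += net + net * percent / 100;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] prices = {1000, 500};
        RecordingTaxRates before = new RecordingTaxRates();
        RecordingTaxRates after = new RecordingTaxRates();

        long totalBefore = totalPerItemLookup(before, "toy", prices);
        long totalAfter = totalSingleLookup(after, "toy", prices);

        // Black-box check: given the same inputs, both versions return the same output.
        System.out.println("black-box check holds after refactoring: " + (totalBefore == totalAfter));

        // Interaction-style check: "percentFor is called once per item".
        // It describes how the first version works, not what it must produce.
        System.out.println("interaction check holds before refactoring: " + (before.calls.size() == prices.length));
        System.out.println("interaction check holds after refactoring: " + (after.calls.size() == prices.length));
    }
}
```

The refactoring changes nothing the business cares about, yet a test that pins the number of calls would need rewriting, while the input/output check survives untouched.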

Testing the Wrong Thing
This is an extension of the issues discussed in Brittle Tests above. When you start to test implementation details, or features that are not specified by the business, you are putting effort into testing behaviour that has no business value. An example of this would be testing that a method does null checks for all arguments. Although it may seem perfectly rational to do this, if there is no need to make this check, why bother writing the code and the test? An exception will most likely result regardless of whether you add this check, and you have just made your test more brittle if a new requirement becomes incompatible with the redundant assertion. Tests should ensure value as well as quality, and it’s not always an easy balance to strike.

Bypassing business logic
It is very common to see a test suite in a large project grow rapidly, doing all kinds of exotic things to verify expected behaviour. For example: verifying return messages on a broker, or verifying that some data was inserted into the database correctly. The more developers there are on a project, the more likely these approaches are to diverge between teams and be duplicated. When testing data integrity within your application, ideally the application’s APIs should be used.

This way all the high quality code that has been written and tested can be re-used from within your tests. It also means you’re not bypassing business logic. A test that writes to the database directly skips that logic, so when the logic later changes, the test’s data insertion will not change with it and the test data will no longer match what the application would have produced.
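The following sketch shows how this can go wrong in miniature. The `AccountService`, its in-memory `table` (standing in for a real database table) and the email normalisation rule are all hypothetical, chosen only to illustrate a piece of business logic that a direct write would skip.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class SeedThroughApiSketch {

    // Hypothetical service; the map stands in for a database table (email -> name).
    static final class AccountService {
        final Map<String, String> table = new HashMap<>();

        // Business logic: emails are normalised before they are stored.
        void register(String email, String name) {
            table.put(normalise(email), name);
        }

        Optional<String> findName(String email) {
            return Optional.ofNullable(table.get(normalise(email)));
        }

        private static String normalise(String email) {
            return email.trim().toLowerCase(Locale.ROOT);
        }
    }

    public static void main(String[] args) {
        // Test data created through the application's API goes through the same rules.
        AccountService seededViaApi = new AccountService();
        seededViaApi.register("Alice@Example.com", "Alice");
        System.out.println("found when seeded via API: "
                + seededViaApi.findName("alice@example.com").isPresent());

        // Test data written straight to the "table" bypasses normalisation,
        // so the application can no longer find the row.
        AccountService seededDirectly = new AccountService();
        seededDirectly.table.put("Alice@Example.com", "Alice");
        System.out.println("found when seeded directly: "
                + seededDirectly.findName("alice@example.com").isPresent());
    }
}
```

In a real project the direct write would usually be an SQL insert in a test fixture, and the mismatch would only surface when the normalisation rule was introduced or changed.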

Untested Test Code
If for any reason you need to access data from some remote service directly within a test, then it’s best to ensure the code responsible for doing so is kept in a common test-support module (such as a jar) that can be re-used and provide common utility to all tests in your suite. The other upside of doing this is that you will naturally test the classes in the reusable test-support module, something that you are not compelled to do when you place test support code in the test source tree.

This may seem like an elementary suggestion, but it’s amazing how many experienced teams I have seen failing to consider this until the test suite code has grown to a size that makes it painful to refactor and a critical project is relying on reams of untested code. If you do find yourself in this situation, try to move the overgrown test code into the support module as quickly as possible, and in an incremental fashion.
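As an illustration of what tested test-support code might look like, here is a minimal sketch of a polling helper of the kind tests often use when waiting on a remote service, together with checks for the helper itself. The `PollingSupportSketch` class and its `attemptsUntil` method are hypothetical, not taken from any particular library.

```java
import java.util.function.BooleanSupplier;

class PollingSupportSketch {

    /**
     * Hypothetical test-support helper: retries the condition up to maxAttempts times.
     * Returns the attempt number on which the condition first held, or -1 if it never did.
     */
    static int attemptsUntil(BooleanSupplier condition, int maxAttempts, long pauseMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up waiting.
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("failed: " + description);
        }
    }

    // Checks for the helper itself: the part that is easy to skip
    // when support code lives in the test source tree.
    public static void main(String[] args) {
        int[] calls = {0};
        check(attemptsUntil(() -> ++calls[0] == 3, 5, 0) == 3, "succeeds on the third attempt");

        int[] failingCalls = {0};
        check(attemptsUntil(() -> { failingCalls[0]++; return false; }, 4, 0) == -1,
                "reports failure when the condition never holds");
        check(failingCalls[0] == 4, "stops after exactly maxAttempts tries");
        System.out.println("support helper checks passed");
    }
}
```

Once a helper like this lives in its own module with its own tests, every suite that depends on it inherits that confidence instead of carrying a private, untested copy.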

That brings us to the end of Part 1. In Part 2 I will discuss strategies for getting the best performance out of a test suite.


This blog is written exclusively by the OpenCredo team. We do not accept external contributions.


