Working Locally with Microservices

The pre-pipeline (local) development process

If we look at a typical build pipeline for a monolithic application we can see that the flow of a single monolithic component is relatively easy to orchestrate. The pipeline obviously gets more complex when we are dealing with microservices, and we’ll cover this in future blog posts, but for now we will look at the pre-pipeline local development phase that will most likely involve working simultaneously with multiple dependent services.

When working with a monolithic codebase we should be able to assume that a local development machine is at least as easy to create and configure as a QA environment (and if it isn’t, then you should be asking why!). Local developer machine configuration tooling such as GitHub’s Boxen (Puppet), Pivotal’s Sprout (Chef), or mac-dev-playbook (Ansible) allows us to specify the installation of local tooling and standardise configuration. Alternatively, virtualised local environments built with Vagrant or Docker Compose (née Fig) are relatively easy to set up for a single stack.

Within the world of microservices the tooling for these development tasks often exists at a component/service level, but things get a lot more challenging when the number (and variety) of services increases, the number of dependencies between services increases, or when a complex business workflow takes place at a higher (system) level that involves several coordinating services.

Five patterns for working locally with microservices

1) The naive (painful) approach

In my experience, when most developers start working with microservices they simply attempt to replicate their local development practices for each new service. I believe this is a logical approach (and I’ve done it!), but as with many things within computing – manual replication only gets you so far.

The biggest problem I have encountered with this style of working is the integration costs of testing. Even if each service has integration/component-level testing, it can be very difficult to coordinate test configuration and initialisation once you develop more than a few services. You’ll often find yourself spinning up an external service locally (by cloning the code repository from VCS, building and running), fiddling around with the state, running tests on the service code in which you are developing, and finally verifying the state of the external service.

In the past I have seen many developers attempt to overcome this problem by creating simple scripting files (bash, python, ruby etc) that wire everything together and initialise data for tests. In my experience this quickly becomes a nightmare to maintain, and isn’t a recommended approach – I only include it here as a baseline, and so read on to learn about a scalable approach…

2) Local ‘profiles’ combined with mocking and stubbing

If you are familiar with developing code using the JVM-based Spring framework (or the Maven build tool) you will instantly recognise the concept of ‘profiles’, but this pattern is present across many language stacks (e.g. Rails’ ‘RAILS_ENV’ or Go’s envconfig). Essentially, profiles allow multiple configurations to be developed and switched at build or run time. This allows you to develop mock or stub implementations of external service interfaces for local development, and to switch between these and the actual production implementation as required.

I have used this technique with great success when developing a Java-based ecommerce ‘shop-front’ service that was dependent on a ‘product-search’ service. The interface for the product-search service was well-defined, and we developed several profiles for use when running automated tests via Maven/Rake:

‘no-search’ – this profile simply mocked the product-search service as a no-op (using Mockito or RSpec Mocks), and returned empty results. This was useful when the locally developed code was interacting with the product-search service, but we didn’t care about the results coming back from a call.
‘parameterised-search’ – this profile contained a stub implementation of the ‘product-search’ that we could parametrise in our tests to return various search results (e.g. one product, two products, a specific product with property X, an invalid product). We created the stub implementation simply as a Java class, and loaded the pre-canned search results from an external JSON data file. This is a very useful pattern, but if the stub begins to become complex (with lots of conditionals) then it may be time to look at the ‘service virtualisation’ pattern below.
‘production’ – this was the production implementation of the product-search interface that talked to an actual instance of the service, and undertook appropriate object marshalling and error-handling etc.
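The three profiles above can be sketched in a few lines. This is a minimal illustration in Python rather than the Java/Spring code we actually used; the class names, the `build_search` function and the `PRODUCT_SEARCH_URL` variable are all hypothetical, and the production implementation is deliberately left as a placeholder.

```python
from __future__ import annotations

import json
import os
from typing import Protocol


class ProductSearch(Protocol):
    """Interface the shop-front code depends on (hypothetical)."""

    def search(self, query: str) -> list[dict]: ...


class NoOpSearch:
    """'no-search' profile: every call succeeds and returns no results."""

    def search(self, query: str) -> list[dict]:
        return []


class ParameterisedSearch:
    """'parameterised-search' profile: returns pre-canned results per query."""

    def __init__(self, canned: dict[str, list[dict]]):
        self._canned = canned

    @classmethod
    def from_json(cls, text: str) -> ParameterisedSearch:
        # In practice the JSON would be loaded from an external data file.
        return cls(json.loads(text))

    def search(self, query: str) -> list[dict]:
        return self._canned.get(query, [])


class HttpSearch:
    """'production' profile: placeholder for the real client."""

    def __init__(self, base_url: str):
        self.base_url = base_url

    def search(self, query: str) -> list[dict]:
        raise NotImplementedError("real HTTP call, marshalling and error-handling go here")


def build_search(profile: str, canned_json: str = "{}") -> ProductSearch:
    """Select an implementation by profile name."""
    if profile == "no-search":
        return NoOpSearch()
    if profile == "parameterised-search":
        return ParameterisedSearch.from_json(canned_json)
    if profile == "production":
        # PRODUCT_SEARCH_URL is an assumed variable name, not a convention.
        return HttpSearch(os.environ.get("PRODUCT_SEARCH_URL", "http://localhost:8080"))
    raise ValueError(f"unknown profile: {profile}")
```

The calling code only ever sees the `ProductSearch` interface, so a test run can choose the profile (for example from an environment variable) without any change to the code under test.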

Although not exactly stubbing or mocking, I’m going to also include the use of embedded or in-process data stores and middleware within this pattern. Running an embedded process will typically allow you to interact with this component as if you were running a full out-of-process instance, but with much less initialisation overhead and without the need to externally configure the process. I have had much success with using H2 as a test replacement for MySQL, Stubbed Cassandra for Cassandra, and running an embedded Elasticsearch node.
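As a rough analogue of the embedded data store idea, the sketch below uses Python’s standard-library SQLite in in-memory mode in place of an external database. The table and function names are hypothetical, and this is not the H2/MySQL setup described above, only the same technique in a different stack.

```python
import sqlite3


def open_test_db() -> sqlite3.Connection:
    """Create a throwaway in-process database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def add_product(conn: sqlite3.Connection, name: str) -> int:
    """Insert a product and return its generated id."""
    cur = conn.execute("INSERT INTO products (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


def find_products(conn: sqlite3.Connection, term: str) -> list:
    """Return product names containing the search term, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM products WHERE name LIKE ? ORDER BY name",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

One caveat worth keeping in mind: an embedded store rarely matches the SQL dialect and behaviour of the production store exactly, so tests like these complement, rather than replace, tests against the real thing.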

3) Service virtualisation

When mocking or stubbing external services becomes complex this can be the signal that it would be more appropriate to virtualise the service (and as an aside, if your stubs start to contain lots of conditional logic, are becoming a point of contention with people changing pre-canned data and breaking lots of tests, or are becoming a maintenance issue, this can be a smell of too much complexity).

Service virtualisation is a technique that allows us to create an application that will emulate the behaviour of an external service without actually running or connecting to the service. This technique allows for a more manageable implementation of complex service behaviour than mocking or stubbing alone. I have used this technique successfully in a number of scenarios, for example, when a dependent service returns complex (or large amounts of) data, when I don’t have access to the external service (for example it may be owned by a third-party or is run as a SaaS), or when many additional services will interact with this dependency and it will be easier to share the service virtualiser than mock/stub code.
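To make the idea concrete, here is a deliberately tiny virtual service built only on the Python standard library. It is a sketch of the technique, not a substitute for a dedicated tool; the paths and payloads in `CANNED` are made up, and `get_json` is a helper included purely to exercise the server.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical canned responses, keyed by request path (including query string).
CANNED = {
    "/search?q=shoes": (200, [{"id": 1, "name": "Trainer"}]),
    "/search?q=broken": (500, {"error": "simulated failure"}),
}


class VirtualServiceHandler(BaseHTTPRequestHandler):
    """Answers GET requests from the canned table instead of a real service."""

    def do_GET(self):
        status, body = CANNED.get(self.path, (404, {"error": "no canned response"}))
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # suppress per-request logging during test runs


def start_virtual_service(port: int = 0) -> ThreadingHTTPServer:
    """Start the virtual service on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), VirtualServiceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def get_json(port: int, path: str):
    """Minimal client: returns (status code, decoded JSON body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()
```

Because the double is reached over the wire, the service under test needs no special code path: it is simply pointed at the virtual service’s address instead of the real one, and several services can share the same instance.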

Tooling in this area includes:

Mountebank – this tool is a JavaScript/Node.js application that provides ‘cross-platform, multi-protocol test doubles over the wire’, and I have used it to virtualise services that speak HTTP/HTTPS and TCP (it also supports SMTP). The API is easy to use, and although some of the code you write may look verbose, it is easy to craft complicated virtualised responses.
Wiremock – this tool is similar to Mountebank in that it works by creating an actual server (HTTP in this case) that can be configured to respond with a range of virtualised responses. Wiremock is written in Java, and is well-supported by fellow Londoner and open source advocate Tom Akehurst. Wiremock really comes into its own when it is combined with another of Tom’s tools, Saboteur, which allows you to programmatically and deterministically inject failure into the network stack (à la Netflix’s Simian Army). As demonstrated by yet another London open source testing and Cassandra wizard, Chris Batey, this allows you to test your service’s failure and error-handling by simulating problems with services that you may not even own.
Stubby4j – this is a nice Java-focused tool that shares a lot of similarity with Mountebank and Wiremock. I personally haven’t used this tool much, but several of my OpenCredo colleagues have used Stubby4j to emulate complicated SOAP and WSDL messages when interacting with external legacy services.
VCR / Betamax – these are both very useful implementations of applications that allow you to record and replay network traffic. I have found these particularly useful when I don’t have access to the code of the external dependent services (and therefore I can only observe a response from my request), when the service returns a large amount of data (which I can capture in an external cassette), or when making a call to the service is restricted or expensive.
Hoverfly (a new tool built from lessons learnt with Mirage) – this is a new service virtualisation tool that provides additional configuration options over Wiremock and VCR, and we have used this within several successful projects to emulate responses for complicated legacy applications (we’ve all got these, right?) as well as complex microservice architectures with many interdependent services. We have also used Hoverfly when performing load testing, where an external SaaS-based application test sandbox that was on the critical path wouldn’t allow us to ramp up the number of test requests without becoming the bottleneck itself. The fact that Hoverfly is written in Go means that it is very lightweight and highly performant – we easily get thousands of requests/responses per second when running on a small AWS EC2 node. In the interest of transparency, this is a tool that is being actively developed by SpectoLabs, a company that has been spun out of OpenCredo.
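The record-and-replay idea behind tools such as VCR and Betamax can be sketched as follows. This is an illustration of the concept only, not how either tool is implemented; the `Cassette` class, its JSON file format and the injected `fetch` function are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Callable


class Cassette:
    """Records each response on first use and replays it on later calls."""

    def __init__(self, path: Path, fetch: Callable[[str], str]):
        self._path = path
        self._fetch = fetch  # the real (slow, restricted or expensive) call
        # Load a previously recorded tape if one exists on disk.
        self._tape = json.loads(path.read_text()) if path.exists() else {}

    def get(self, url: str) -> str:
        if url not in self._tape:
            # First time we see this URL: call the real service and record it.
            self._tape[url] = self._fetch(url)
            self._path.write_text(json.dumps(self._tape, indent=2))
        return self._tape[url]
```

Once a cassette file has been recorded and checked in alongside the tests, later runs never touch the external service, which is what makes the approach attractive when calls are restricted or expensive.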

4) ‘Production-in-a-box(es)’

This pattern enables a developer to download pre-canned images of services to a local machine that can be easily executed for development against or for running tests. We started doing this initially with HashiCorp’s Vagrant, where we could create a preconfigured vbox image that contained an application’s code/binaries alongside an OS, configuration and associated data stores, which was shared around the development team. The arrival of Packer made the image creation process even easier, and also gave us the ability to specify application packaging once, and re-use this across environments (e.g. AWS in production, OpenStack for QA, and VirtualBox for local development).

Arguably the arrival of Docker massively promoted this style of application packaging and sharing, and the Fig composition tool was the icing on the cake. Fig has since evolved into Docker Compose, and now allows the declarative specification of applications/services and associated dependencies and data stores. This pattern does allow for the very flexible execution of a collection of dependent services on a local development machine, and the main limiting factor in our experience is machine resources (particularly when running hypervised platforms).
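As an illustration of what such a declarative specification looks like, here is a hypothetical Docker Compose file for the shop-front example used earlier. The `example/...` image names, the ports and the `PRODUCT_SEARCH_URL` and `DB_HOST` variables are invented for the sketch; only the `mysql` image and its two environment variables refer to something real.

```yaml
# Hypothetical docker-compose.yml: image names, ports and most variables are illustrative only.
services:
  shop-front:
    image: example/shop-front:latest       # hypothetical image
    ports:
      - "8080:8080"
    environment:
      PRODUCT_SEARCH_URL: http://product-search:8081   # assumed variable name
    depends_on:
      - product-search
  product-search:
    image: example/product-search:latest   # hypothetical image
    environment:
      DB_HOST: search-db                   # assumed variable name
    depends_on:
      - search-db
  search-db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: local-only-password
      MYSQL_DATABASE: search
```

Note that `depends_on` in this short form only controls start order; it does not wait for a dependency to be ready to accept connections, so services still need sensible retry behaviour on startup.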

The ‘production-in-a-box’ pattern also allows us to keep a much cleaner local developer environment, and removes potential configuration clashes by encapsulating a service and its dependencies and configuration (e.g. different requirements of Java version). We can also parametrise the images (through initialisation params or environment variables), much like we did with the ‘profiles’ pattern above, which allows services to behave as we require. We have successfully used Docker plugins for both Maven and Ruby/Rake, which enable the integration of container lifecycles with test runs.

A potential extension to this pattern is developing within the actual images themselves, for example by mounting local source code into the running instance of an image. If done correctly this can remove the need for the installation of practically all tooling on the local development machine (except perhaps your favourite editor or IDE), and greatly simplifies the build toolchain (e.g. you don’t have to worry about your GOPATH, or what version of Python you are running). If you are running a compiled language then it is possible to create a dev and a production image via a build pipeline, which contain the source code or only the linked binaries respectively.

5) Environment leasing

In a nutshell the environment leasing pattern is implemented by allowing each developer to create and automatically provision their own remote environment that can contain an arbitrary configuration of services and data. The services and data (and associated infrastructure components and glue) must be specified programmatically via Terraform or one of the CAPS tools (Chef, Ansible, Puppet, Salt; our current favourite is Ansible) and the knowledge shared across the team for this approach to be viable, and therefore you must be embracing the DevOps mindset. A local development machine can then be configured to communicate with dependencies installed into the remote environment as if all the services were running locally. We have used this pattern when deploying applications to cloud-based platforms, which allows us to spin up and shut down environments on demand.

The ‘environment leasing’ pattern is an advanced pattern, and does rely on the ability to provision platform environments on-demand (e.g. private/public cloud with elastic scaling), and also requires that a developer’s machine has a stable network connection to this environment. We have also found that running a local proxy, such as Nginx or HAProxy in combination with HashiCorp’s Consul and consul-template, or a framework such as Spring Cloud in combination with Netflix’s Eureka, is useful in order to automate the storage and updating of each developer’s environment location.
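The bookkeeping involved in storing and looking up each developer’s environment location can be sketched with an in-memory stand-in for the registry. Everything here is hypothetical (the `LeaseRegistry` class, the developer name, the addresses); in practice this role is played by a tool such as Consul or Eureka rather than by code like this.

```python
class LeaseRegistry:
    """In-memory stand-in for a service registry such as Consul or Eureka."""

    def __init__(self):
        # developer -> {service name -> "host:port" in the leased environment}
        self._leases = {}

    def register(self, developer: str, services: dict) -> None:
        """Record (or update) where a developer's leased services live."""
        self._leases[developer] = dict(services)

    def release(self, developer: str) -> None:
        """Forget a developer's environment once it has been shut down."""
        self._leases.pop(developer, None)

    def base_url(self, developer: str, service: str) -> str:
        """Return the base URL a local proxy or client should route to."""
        try:
            return "http://" + self._leases[developer][service]
        except KeyError:
            raise LookupError(
                f"no leased address for {service!r} for developer {developer!r}"
            ) from None
```

A local proxy configuration can then be regenerated whenever a lease is registered or released, so the code running on the developer’s machine keeps using stable local addresses.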

Summary

This article has attempted to summarise our learnings over the last five or so years of locally developing and working with microservices, and is part of our “Microservice Platforms: Some Assembly [Still] Required” series. Working locally with one or two services is easy enough with existing tooling and approaches, but in our experience the complexity of orchestration and configuration increases exponentially with the number of services developed unless we utilise the patterns documented above.

In addition to the descriptions of the local development patterns above, I have also included a PDF ‘cheat sheet’ of the details, which can be downloaded below: