October 31, 2015 | Microservices
Over the past few weeks I’ve been writing an OpenCredo blog series on the topic of “Building a Microservice Development Ecosystem”, but my JavaOne talk of the same title crept up on me before I managed to finish the remaining posts. I’m still planning to finish the full blog series, but in the meantime I thought it would be beneficial to share the video and slides associated with the talk, alongside some of my related thinking. I’ve been fortunate to work on several interesting microservice projects at OpenCredo, and we’re always keen to share our knowledge or offer advice, and so please do get in touch if we can help you or your organisation.
When I talk about a “development ecosystem”, I am typically referring to what it takes to get locally developed code into production, where it ultimately adds value for users of the system. This is more than just a build pipeline (as popularised by Jez Humble and Dave Farley’s great Continuous Delivery book); for me, a development ecosystem comprises local and integration build, test, deploy, operate and observe. Let’s break this down further…
Building is a vital part of any software development project, and I believe this starts with getting a good local development environment configured and ends with establishing a rugged integration build pipeline. Creating a standardised, reliable and repeatable local development environment is especially important when working with microservices, as there are typically more moving parts (services) than when working on a monolith, and there may be multiple languages, platforms and datastores in play. You seriously don’t want to be hand-rolling development machine configuration.
Having stressed the importance of creating a solid local development environment, I won’t cover the topic in much depth within this article, as I’ve previously written at length about this in another OpenCredo blog post, “Working Locally with Microservices”. The topic of creating a build pipeline is also well covered in the aforementioned ‘Continuous Delivery’ book, and so I recommend reading this if you haven’t already. However, the challenge of creating a multi-service, multi-pipeline build is vital to the testing of microservice code on its way to production. Let’s look at the importance of testing.
I started this section of my JavaOne talk by reminding the audience of the ever-popular (and very valuable) testing pyramid. The principles behind the pyramid obviously still apply when testing a microservice-based system, but often the systems/applications/services-under-test look a little different (and accordingly, so do the tests). Toby Clemson’s excellent article on microservice testing strategies that was posted on Martin Fowler’s blog is my go-to introductory reference to this topic.
With regard to the multi-service/multi-pipeline issue I mentioned above, I strongly caution against the use of service versioning to indicate which services will work with others. For anyone familiar with Java development, the use of artefact versioning (for example, within the ubiquitous Maven build and dependency management tool) is second nature. I believe this is best practice when building a modular monolith, as it ensures that anyone can pick up the code and the resulting dependency descriptor (e.g. the pom.xml) and be good to go. However, for loosely-coupled microservice systems, this notion of ‘rubber stamping’ versions of compatible services doesn’t scale, and also increases explicit coupling – it’s very easy to create a ‘distributed monolith’ (trust me, I’ve done this).
More effective techniques for testing compatibility between services include critical path testing (otherwise known as synthetic transactions or semantic monitoring) within QA, staging, and ultimately production. Consumer-driven contracts are a great approach for asserting both the interface and behaviour of a service, and we have found Pact-JVM (combined with Pact Broker) a very useful set of tools. If you are using Spring Boot, then Matt Stine has put together a very nice demonstration of the use of Pact-JVM that is available within his GitHub account.
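To make the idea of a consumer-driven contract a little more concrete, here is a minimal hand-rolled sketch in Python. It deliberately does not use the Pact API; the `Contract` class, the `verify` function, the `user_service` provider and the `/users/42` endpoint are all hypothetical names invented for illustration. The point is only that the consumer states what it needs, and the provider's build replays that expectation.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    """What a consumer expects from a provider for a single interaction."""
    consumer: str
    method: str
    path: str
    expected_status: int
    # Field name -> required Python type in the response body.
    expected_fields: dict = field(default_factory=dict)


def verify(contract, provider_handler):
    """Replay the contract against the provider and return a list of violations."""
    status, body = provider_handler(contract.method, contract.path)
    violations = []
    if status != contract.expected_status:
        violations.append(f"status {status} != {contract.expected_status}")
    for name, expected_type in contract.expected_fields.items():
        if name not in body:
            violations.append(f"missing field '{name}'")
        elif not isinstance(body[name], expected_type):
            violations.append(f"field '{name}' is not {expected_type.__name__}")
    return violations


def user_service(method, path):
    """Hypothetical provider: a stand-in for a real HTTP handler."""
    if method == "GET" and path == "/users/42":
        return 200, {"id": 42, "name": "Ada", "email": "ada@example.com"}
    return 404, {}


# The consumer declares only the fields it actually reads, so the provider
# remains free to add fields (such as "email") without breaking the contract.
billing_contract = Contract(
    consumer="billing-service",
    method="GET",
    path="/users/42",
    expected_status=200,
    expected_fields={"id": int, "name": str},
)

violations = verify(billing_contract, user_service)
```

Here `violations` is empty while the provider honours the contract. If the provider renamed `name` or changed the type of `id`, the provider's own pipeline would fail before the change reached any consumer, which is the property that replaces ‘rubber stamped’ version compatibility.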
My personal opinions for testing with a microservice-based application are as follows:
My final suggestions for testing include the ‘ilities’, such as security, reliability, performance and scalability. The ZAP security tooling from the awesome OWASP team comes highly recommended (combined with the ‘bdd-security’ framework), and I also suggest the use of Apache JMeter and the Jenkins Performance Plugin for load testing everything from individual services (typically the happy paths, in order to watch for performance regressions) to the system as a whole.
On the topic of deployment I recommend the use of continuous deployment to production, with new or not-yet-ready functionality being hidden by feature flags. This is not particularly new advice (and Etsy have been pushing the benefits of this for years), but I would suggest that feature-flagging be implemented at the ingress level for microservices, i.e. the system boundary for user interaction or the internal module responsible for initiating action (e.g. a cron-like system). It is all too tempting to spread related feature flags throughout multiple services, and sometimes this is essential, but in the general case, switching a feature on or off in one location during the usage journey is much easier.
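As a rough illustration of flagging at the ingress level, the sketch below consults a flag in exactly one place: the edge handler that starts the user journey. Everything here is an assumption made for the example, including the `FeatureFlags` class, the `new-checkout` flag name, and the two checkout functions; a real system would more likely read flags from a configuration service.

```python
class FeatureFlags:
    """In-memory flag store; a hypothetical stand-in for a config service."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name):
        # Unknown flags default to off so unfinished work stays hidden.
        return self._flags.get(name, False)

    def set(self, name, enabled):
        self._flags[name] = enabled


def legacy_checkout(order):
    return {"handler": "legacy", "total": order["total"]}


def new_checkout(order):
    # Hypothetical new behaviour: apply a 10% discount.
    return {"handler": "new", "total": round(order["total"] * 0.9, 2)}


def make_ingress(flags):
    """Build the edge handler; this is the only place the flag is consulted."""
    def handle_checkout(order):
        if flags.is_enabled("new-checkout"):
            return new_checkout(order)
        return legacy_checkout(order)
    return handle_checkout


flags = FeatureFlags()
ingress = make_ingress(flags)
before = ingress({"total": 100.0})   # routed to the legacy path
flags.set("new-checkout", True)
after = ingress({"total": 100.0})    # routed to the new path
```

Because the downstream functions know nothing about the flag, turning the feature off again is a single change at the boundary rather than a coordinated change across services.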
I also advocate for incremental rollout of new service versions (with close observation of related metrics), and the use of canarying and blue/green deployment techniques can be valuable. The final piece of advice is to avoid using datastore migration tooling (such as Liquibase and Flyway) to make non-backward compatible schema changes if at all possible, as this can lead to breakages during deployment.
The general advice with microservice data stores is to have one store per service, but in reality (for performance or reporting reasons) it is often the case that only one class/type of service writes to a store, while many others undertake read-only operations against it. Non-backwards-compatible changes will not only break the existing older instances of a service during deployment of the new service version (and hence prevent incremental upgrades), but may also break the services performing read-only actions.
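One common way to keep a schema change backward compatible is to make it additive first (often described as “expand, then contract”). The sketch below shows this with Python's built-in `sqlite3` module and an in-memory database. The `customer` table, the `display_name` column and the reader/writer functions are hypothetical, and the same idea could equally be expressed as a Liquibase or Flyway migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customer (name) VALUES ('Ada Lovelace')")


def old_reader(db):
    """Query used by the old service version and by read-only services."""
    return db.execute("SELECT id, name FROM customer ORDER BY id").fetchall()


def old_writer(db, name):
    """Insert issued by old instances still running during the rollout."""
    db.execute("INSERT INTO customer (name) VALUES (?)", (name,))


# Expand step: add a nullable column rather than renaming or dropping "name".
conn.execute("ALTER TABLE customer ADD COLUMN display_name TEXT")
# Backfill existing rows so the new version can use the new column.
conn.execute("UPDATE customer SET display_name = name WHERE display_name IS NULL")


def new_reader(db):
    """Query used by the new version; falls back to "name" for rows
    written by old instances that do not know about display_name."""
    return db.execute(
        "SELECT id, COALESCE(display_name, name) FROM customer ORDER BY id"
    ).fetchall()


# An old instance writes after the migration has run; nothing breaks.
old_writer(conn, "Grace Hopper")

old_view = old_reader(conn)
new_view = new_reader(conn)
```

Both the old and new queries keep working throughout the rollout. The destructive “contract” step (dropping `name`) would only be considered once no running service, including the read-only ones, still depends on it.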
My recommendations for operating microservices from a development perspective include standardising on an OS (e.g. Ubuntu, RHEL, CoreOS, RancherOS etc) across all environments in order to minimise deviation between development and production environments; utilising programmable infrastructure tooling that allows cross-vendor initialisation and destruction of entire environment stacks, such as HashiCorp’s Terraform; and also the use of configuration management tooling to manage lower-level instance and service configuration, for example, Chef, Ansible, Puppet, or SaltStack (“CAPS” tooling).
In my JavaOne talk I also discussed the choice between external versus client-side service discovery and load-balancing, and how this relates to centralised configuration (and tooling such as Consul, etcd and ZooKeeper). There is some interesting work going on in this space, including srv-router and Baker Street.
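For readers less familiar with the client-side option, the following sketch shows the basic shape: the caller looks up the available instances in a registry and chooses one itself, rather than sending traffic through an external load balancer. The `ServiceRegistry` and `RoundRobinClient` classes and the addresses are hypothetical, and the registry is a plain in-memory dictionary standing in for something like Consul, etcd or ZooKeeper.

```python
import itertools


class ServiceRegistry:
    """In-memory stand-in for a real service registry."""

    def __init__(self):
        self._instances = {}

    def register(self, service, address):
        self._instances.setdefault(service, []).append(address)

    def lookup(self, service):
        # Return a copy so callers cannot mutate the registry's state.
        return list(self._instances.get(service, []))


class RoundRobinClient:
    """Client-side load balancing: the caller picks the instance itself."""

    def __init__(self, registry, service):
        addresses = registry.lookup(service)
        if not addresses:
            raise LookupError(f"no instances registered for {service}")
        # The instance list is captured once here; a real client would
        # refresh it periodically and skip unhealthy instances.
        self._cycle = itertools.cycle(addresses)

    def next_address(self):
        return next(self._cycle)


registry = ServiceRegistry()
registry.register("catalogue", "10.0.0.1:8080")
registry.register("catalogue", "10.0.0.2:8080")

client = RoundRobinClient(registry, "catalogue")
chosen = [client.next_address() for _ in range(3)]
```

The trade-off against external discovery is that every client now carries this logic (in every language in use), in exchange for removing a network hop and a shared piece of infrastructure.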
I echo the sentiments of many industry luminaries, in that something should not be considered successfully deployed to production unless it is fully monitored. Once basic monitoring is in place, for example using the health check and metric endpoint functionality provided by Coda Hale’s (Dropwizard) Metrics library or the Spring Boot Actuator, I then advocate exposing business-specific metrics for each service’s functionality that will indicate service and system health. Examples of such metrics include assertions on minimum incoming message queue lengths, the latency of a dependent third-party downstream service, or the average number of ecommerce shop checkouts per minute.
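A business-specific health indicator of this kind can be sketched in a few lines. The example below tracks checkouts in a sliding window and reports a degraded status when the rate falls below a threshold. The `CheckoutRateMonitor` class, its default thresholds and the `UP`/`DEGRADED` labels are assumptions made for illustration, and timestamps are passed in explicitly (as seconds) to keep the example deterministic.

```python
from collections import deque


class CheckoutRateMonitor:
    """Tracks checkout timestamps and reports whether the recent rate looks healthy."""

    def __init__(self, window_seconds=300, min_per_minute=1.0):
        self.window_seconds = window_seconds
        self.min_per_minute = min_per_minute
        self._events = deque()

    def record_checkout(self, timestamp):
        self._events.append(timestamp)

    def checkouts_per_minute(self, now):
        # Drop events that have fallen out of the sliding window.
        while self._events and self._events[0] <= now - self.window_seconds:
            self._events.popleft()
        return len(self._events) / (self.window_seconds / 60)

    def health(self, now):
        rate = self.checkouts_per_minute(now)
        return {
            "status": "UP" if rate >= self.min_per_minute else "DEGRADED",
            "checkouts_per_minute": rate,
        }


monitor = CheckoutRateMonitor(window_seconds=300, min_per_minute=1.0)
for t in range(0, 300, 30):          # ten checkouts, one every 30 seconds
    monitor.record_checkout(t)

busy = monitor.health(now=299)       # all ten events are inside the window
quiet = monitor.health(now=1000)     # every event has aged out
```

A service can be technically healthy (process up, endpoints responding) while a metric like this shows that users have stopped completing checkouts, which is usually the more important signal.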
I also repeated my ‘log like an operator’ mantra, and pointed to some good resources to help developers write log statements that are useful for everyone, from fellow developers to QA specialists and operators. I recommended tooling in this space as well, including the ubiquitous Elasticsearch-Logstash-Kibana (ELK) stack, Zipkin for distributed (correlated request) tracing, and InfluxDB, Telegraf and Grafana for metric capture and display.
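As one possible reading of ‘log like an operator’, the sketch below emits one JSON object per log line, carrying a correlation identifier so that a request can be followed across services. It uses only Python's standard `logging` and `json` modules; the field names, the `payment-service` name and the `log_payment_declined` helper are hypothetical choices for the example rather than a prescribed format.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object, which is easy for log shippers to parse."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "service": getattr(record, "service", "unknown"),
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(payload, sort_keys=True)


def build_logger(stream, name):
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False      # keep output on our handler only
    logger.handlers.clear()       # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def log_payment_declined(logger, correlation_id, order_id, retry_in_seconds):
    # State what happened, to which entity, and what happens next.
    logger.warning(
        "Payment declined for order %s; retrying in %ss",
        order_id,
        retry_in_seconds,
        extra={"correlation_id": correlation_id, "service": "payment-service"},
    )


stream = io.StringIO()
logger = build_logger(stream, "sketch.payment")
log_payment_declined(logger, "req-123", "A-17", 30)
entry = json.loads(stream.getvalue())
```

The message says what happened and what the system will do next, and the correlation identifier is a separate field rather than text buried in the message, so it can be searched on directly in a tool such as Kibana.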
The final recommendations for operating microservices included implementing a good approach to alerting (à la Rob Ewaschuk’s “Philosophy on Alerting”), and developing strategies and tactics for fixing problems, such as those provided by Brendan Gregg’s USE method and Kyle Rankin’s very useful “DevOps Troubleshooting” book.
Below is a link to the YouTube recording of the talk. The video includes all of the talks that took place in the room that day, and so be careful when scrubbing the timeline (unless you would like to watch the other excellent talks)!
I have uploaded the version of the talk presented at JavaOne to my SlideShare account:
As usual, we’re always keen to receive feedback and comments at OpenCredo, and also to discuss any issues you may be having at your organisation. Please feel free to drop me a line on Twitter @danielbryantuk or email email@example.com