
August 2, 2013 | Software Consultancy

Configuration Management with Flexible Contexts

Configuration management was born in the pre-cloud era. Remember the days when acquiring a super-powerful multi-core server felt like winning the jackpot? Infrastructure was a slightly different place back then. Yet for all the recent developments in DevOps, that era's legacy is still with us.

WRITTEN BY

Maartens Lourens



In this blog post I want to highlight this legacy's influence on one aspect of configuration management. In particular, I want to address the assumption made by most configuration management tools that the host is the central context of configuration. To this end I use Salt to illustrate the concept, demonstrate where it creates issues, and offer some ways towards a solution.

The issue first became apparent to me during a recent project on which I had the good fortune to use Salt while automating cloud infrastructure. Work on the project started off quickly and reinforced my impression that Salt is indeed very powerful and easy to use. YAML is dead simple to write, and once I got to grips with the requirements, I could design new infrastructure about as fast as I could type.

Soon we had core services up and running, secure in the knowledge that they could be rebuilt with little effort. With my belief in Salt's capabilities growing, and enthusiasm to match, I decided to turn my attention to the niggling problem of the deployment pipeline.

The client's deployment requirements were not untypical. According to the cloud design, each VM group (or vApp in VMware-speak) would have its own message broker, application server, and a clutch of standalone Java apps. So apart from the database slice running on a single shared server, each VM group would be a self-contained ecosystem. In this ecosystem the apps had a functional relationship with other apps, as well as with the message broker. The Java properties file, in particular, expressed these dependencies.

So far so good. But how could I get Salt to resolve those relationships at deploy time? That proved harder than I expected. Salt knows about the node it is deploying to, but not about the VM group the node belongs to. Such an awareness is not inherent in the Salt framework (although the addition of salt-cloud has brought that capability closer).

To clarify, suppose there are two machines in a VM group. Let’s call the group test-11, and the two machines test-app-11 and test-broker-11.

We plan to deploy a Java properties file on test-app-11. We want it to have the following property and value:

  messagebroker: test-broker-11

As our VM environments are elastic, we can have any number of VM groups in existence. So we definitely don’t want to hardcode the hostnames in the template. For instance when we spin up test-12 with test-app-12 and test-broker-12 the property will have to be:

  messagebroker: test-broker-12

If Salt had the concept of a VM group and its internal relationships the solution would be pretty straightforward. We would be able to describe the state as follows:

  /opt/appserver/conf/app.properties:
    file.managed:
      - source: salt://appserver/app.properties
      - template: jinja
      - user: appuser
      - group: appuser
      - mode: 644

And define the template as:

  messagebroker: {{ grains['broker'] }}

But unfortunately that is not currently possible. Like Puppet's Facter and Chef's Ohai, the Salt grains system is entirely host based. We can use {{ grains['host'] }} to refer to the host itself, {{ salt['grains.get']('ip_interfaces:eth0')[0] }} to get the host's IP, and do any manner of grains lookups for host-related information; but any concept of other VMs in the host's VM group is out of scope.
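Host-scoped facts like these render without any trouble in a template. A quick sketch using just the lookups mentioned above (the property names are purely illustrative):

  # host-scoped lookups work fine in a Jinja template
  hostname: {{ grains['host'] }}
  ipaddress: {{ salt['grains.get']('ip_interfaces:eth0')[0] }}

There is simply no equivalent lookup for the other members of the host's VM group.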

What to do?

Well, there is Pillar. Pillar stores global variables, but it is more than a simple data store: it is essentially an interface for global values. That means it allows for various external pillar interfaces, both out of the box and via plugins, including Hiera, MongoDB, and LDAP. I will come back to this, but the immediate question is: can Pillar help us define relationships in a group of VMs so that we can reference them in templates?

Let's see. Pillar is expressed in YAML, so a simple approach might be to express the relationship as follows:

  test-app-11:  
   - test-broker-11  
  test-app-12:  
   - test-broker-12

However, this approach creates the problem of having to know exactly which key to reference in the properties file template. In the present case the host being deployed to could be any of the apps, so you would have to provide all the options explicitly:

  {% if grains['host'] == "test-app-11" %}
  messagebroker: {{ pillar['test-app-11'] }}
  {% elif grains['host'] == "test-app-12" %}
  messagebroker: {{ pillar['test-app-12'] }}
  {% elif …

It would work, but only just. It is an old school approach that clutters the template and gives the interpreter unnecessary work to do. Besides, it assumes you know in advance what all your VM groups are going to be. That might be the case in this instance, but there could be cases where the relationships are not one-to-one. To make matters worse, imagine if you are a hosting provider and had a thousand VM groups. You would need two thousand and one lines just to decide the name of the broker! At that point, the pillar data is redundant too. You might as well go back to hard coding the broker name:

  {% if grains['host'] == "test-app-11" %}
  messagebroker: test-broker-11
  {% elif grains['host'] == "test-app-12" %}
  messagebroker: test-broker-12
  {% elif …

So no, this approach has too many drawbacks. It just won’t do in an elastic cloud environment.

A similar solution, even more unworkable, is to create a unique template for each possible test-app-xx. You would end up with a thousand files when you wanted only one.

A better solution is to choose a predictable server naming convention and leverage it with simple variable interpolation. In that case we could do without any pillar data and simply add the following to our template:

  # grab the last two digits of any hostname test-app-XX
  {% set vmgroup = grains['host'][-2:] %}
  .
  .
  messagebroker: test-broker-{{ vmgroup }}
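On test-app-11, for example, this renders the property exactly as we set out to get it:

  messagebroker: test-broker-11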

There is a lot to be said for this solution. It is succinct and, conveniently for the maintainer, keeps all the references in one place, namely the template. No skipping back and forth between template and pillar data. If you have end-to-end control over your VM environments and can decide the naming conventions, this might just work for you.

Unfortunately, there are many situations where this type of naming predictability falls apart. A typical case is an incrementing id appended to a basename for every new server added to the infrastructure, irrespective of its grouping. So a VM group might consist of test-app-01, test-app-03, test-broker-09, and test-web-14 (or worse: vm0001, vm0003, vm0009, vm0014). They may be in the same VM group, but there is nothing in the names to suggest what their functions are or that they belong together.

When we look at the problem in this light, it becomes clear that what we are really looking for is (1) exposed metadata describing the grouping of VMs and (2) generated metadata describing the functional identities of those VMs. Since we decide the functional identities when we apply configuration management, it makes sense that we provide that metadata ourselves rather than look to the cloud provider for it. The grouping metadata becomes available at the time the VMs are created within their group (e.g. the vApp), either via the cloud API or more directly if an administrator performs the setup manually. In principle we drive both types of metadata, so there is no reason why they cannot be kept in the same database, as long as the tools on either side have access to it.

Let's consider two different cases. The first is where we want configuration management, but a fully automated cloud management framework is absent. A sysadmin will click through the steps in the cloud provider's UI to create the VM groups and the vanilla VMs (e.g. from a base image, or as a network-booted installation). Once the VMs are created they will be made visible to the configuration management system, which takes it from there.

In the second case the cloud management framework will be given a request, perhaps via a web service, and the end-to-end creation of VMs through to configuration will be taken care of in one go.

The first case brings us back to our original question regarding the metadata, with its use limited to Salt's domain. The question remains, and I pose it again: can we provide Salt with information regarding the relationships between VMs in a VM group?

The answer is: yes.

With Pillar's key-value structure you can create an abstract tree that describes various characteristics of the whole VM group, with the VM group name as its root node. Since Salt's grains subsystem knows the host being deployed to, this can be used to bootstrap the Pillar VM group data structure. All that remains is to create a separate Pillar key linking the individual VM to its VM group.

To illustrate, suppose we have a VM group called test-76, with VMs test-app-01, test-app-03, test-broker-09, and test-web-14. We can describe some of this VM group’s characteristics in a Pillar yaml as follows:

  test-76:
    messagebroker: test-broker-09
    webapps:
      - test-app-01
      - test-app-03
    webserver: test-web-14
  test-broker-09:
    vmgroup: test-76
  test-app-01:
    vmgroup: test-76
  test-app-03:
    vmgroup: test-76
  test-web-14:
    vmgroup: test-76

Referencing any VM relationship becomes a doddle. Our use case is solved as follows:

  {% set vmgroup = salt['pillar.get'](grains['host'] ~ ':vmgroup') %}
  .
  .
  messagebroker: {{ salt['pillar.get'](vmgroup ~ ':messagebroker') }}
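For test-app-01, which the Pillar data above places in test-76, this renders as:

  messagebroker: test-broker-09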

Finally!

Each VM group still needs its own data structure though, and this is a maintenance burden that we need to address.
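To see the burden concretely, imagine the Pillar data organised as one file per VM group, each written and updated by hand as groups come and go. A sketch of the pillar top file under that (assumed) layout:

  # /srv/pillar/top.sls  (file layout is an assumption for illustration)
  base:
    '*':
      - vmgroups.test-76
      - vmgroups.test-77
      # ... one entry, and one hand-maintained file, per VM group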

Fortunately, there is light at the end of the tunnel. By introducing a cloud management framework, which manages the end-to-end process from VM creation through to configuration management and (in our example at least) application deployment, we open up the possibility of managing this metadata as part of an automated process.

An important design decision is choosing a data provider that both Salt and the cloud automation framework can talk to. As Pillar already supports reading from external sources such as MongoDB, Hiera, and LDAP, they might be a good place to start looking. The details of implementing such an endeavour will, however, have to wait for a future blog post.
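To give a flavour of the Salt side of that integration, hooking an external source into Pillar is a single stanza in the master config. A minimal sketch for Hiera, assuming the hiera ext_pillar module that ships with Salt and an existing /etc/hiera.yaml:

  # /etc/salt/master
  ext_pillar:
    - hiera: /etc/hiera.yaml

The more interesting part, populating that external store from the cloud management framework, is the piece I will leave for the future post.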
