The partnership incorporated both Discovery and Delivery phases.
We began our work with Kaidee with an investigation into their current data set-up and its scalability. Kaidee were already using Cassandra on a smaller scale, and some employees were familiar with the basics.
As things stood, Kaidee were using rolling window data storage to maintain a relatively stable data volume, discarding individual legacy records. As such, scaling out of the cluster had not been a requirement. Moving forward, Kaidee wanted to use Cassandra to gather more data and automate repetitive tasks, freeing up employees to work on more complex tasks, whilst also reducing errors in the underlying technical solution. There was a further requirement for high performance in terms of response times.
Cassandra is designed to record large amounts of data, is extremely fast and reliable, and has a dependable implementation of distributed counters. Kaidee’s CTO, Mark Hollow, had prior experience using Cassandra at scale, and identified two mandatory requirements:
- Use Case identification: Clearly identify the use cases where Cassandra would add value, and also the ones it was unsuitable for.
- Embedding Cassandra knowledge: Ensure Kaidee’s development and operations teams acquired deep knowledge and hands-on skills, to use Cassandra optimally and avoid common pitfalls.
Use Case identification
Kaidee used Cassandra for a subset of their data storage requirements. We looked at use cases where Cassandra was the best fit, and others where a different storage solution (e.g. a graph database) would be more appropriate.
From this and our discovery work, we established a primary use case: recording impressions of classified and display adverts, both as individual events and as counts. This data would be used for live scheduling of sellers’ paid promotions, and tailored delivery of promoted classifieds to buyers.
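As a rough illustration of what recording impressions "as events and counts" could look like, the sketch below pairs an append-only event table with a Cassandra counter table. It is not Kaidee's actual schema: the table names, column names, daily partition bucket and the `InMemoryImpressions` helper are all assumptions made for this example. The CQL is held as plain strings, and a small in-memory stand-in mirrors the two writes so the idea can be tried without a cluster.

```python
from collections import Counter
from datetime import date, datetime, timezone

# Hypothetical CQL schema: all table and column names are illustrative assumptions.
# One row per impression, partitioned by advert and day to keep partitions bounded.
CREATE_EVENTS = """
CREATE TABLE IF NOT EXISTS impression_events (
    ad_id uuid,
    day date,
    seen_at timestamp,
    viewer_id uuid,
    PRIMARY KEY ((ad_id, day), seen_at, viewer_id)
)
"""

# One distributed counter per advert per day.
CREATE_COUNTS = """
CREATE TABLE IF NOT EXISTS impression_counts (
    ad_id uuid,
    day date,
    impressions counter,
    PRIMARY KEY ((ad_id, day))
)
"""

# Counter columns are changed with UPDATE, never INSERT.
INCREMENT_COUNT = (
    "UPDATE impression_counts SET impressions = impressions + 1 "
    "WHERE ad_id = ? AND day = ?"
)


def day_bucket(seen_at: datetime) -> date:
    """Return the UTC calendar day of a timezone-aware timestamp."""
    return seen_at.astimezone(timezone.utc).date()


class InMemoryImpressions:
    """Stand-in for the two tables above, for experimenting without a cluster."""

    def __init__(self) -> None:
        self.events = []          # mirrors impression_events
        self.counts = Counter()   # mirrors impression_counts

    def record(self, ad_id, viewer_id, seen_at: datetime) -> date:
        """Store the raw event and bump the matching daily counter."""
        day = day_bucket(seen_at)
        self.events.append((ad_id, day, seen_at, viewer_id))
        self.counts[(ad_id, day)] += 1
        return day
```

In this sketch, the event table would serve detailed queries while the counter table answers "how many impressions today?" with a single-partition read. Whether a daily bucket is the right granularity depends on traffic per advert, which the source does not state.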
Embedding Cassandra knowledge
We ran a tailored five-day skills workshop with Kaidee’s Development and Operations teams. Whilst the broad format and content were agreed upfront, we allowed flexibility in order to respond to specific learning and business needs as they arose.
The workshops were hands-on for the majority of the time, and covered a breadth of Cassandra-related skills:
Understanding System Internals:
Beginning with the basics, we explored Cassandra’s strengths and weaknesses, as well as its distributed nature, internal architecture and storage mechanisms. Cassandra is a Java application, and for Kaidee’s engineers (with little to no Java experience) it was important to cover some basics of the Java Virtual Machine.
From this foundation, we were able to help Kaidee deepen their understanding of the Cassandra features of most value to them. We did this by devising a series of exercises designed to validate designs and diagnose operational problems.
Working with Cassandra:
By delving into Cassandra’s memory model, we showed Kaidee how to diagnose some of the most common Cassandra performance problems. We complemented this with hands-on teaching of the most common operational tasks, automation options and less common operational pitfalls.
Responding to Kaidee’s needs, we gave special attention to:
- Manually setting up a multi-node cluster
- Efficient data modelling
- Configuring the application-side driver for Cassandra
- Backing up and restoring data
- Resizing the cluster
- Replacing nodes
- Repairing data for inter-node consistency
We dedicated considerable effort to ensuring Kaidee had the knowledge and confidence to use the tools shipped with Cassandra.
Mastering the Data Modelling Workflow:
Kaidee wanted their teams to be able to develop data models. To facilitate this, we ran through some general data modelling exercises, involving business stakeholders to arrive at solutions that served the business’s needs.