Open Credo

October 23, 2014 | Cassandra

Spring Data Cassandra Overview

Spring Data Cassandra (SDC) is a community project under the Spring Data (SD) umbrella that provides convenient and familiar APIs to work with Apache Cassandra.

WRITTEN BY

Rafal Gancarz

SDC's first stable release (version 1.0.0) was in May 2014, and since then the project has seen a good level of activity, with regular releases over the last few months. It currently supports Cassandra 2.x through the DataStax Java Driver (2.0.x).

Given that SDC builds on top of Spring Data, it is important to have some familiarity with that project. If you are comfortable with Spring Data's basic concepts you can skip the next section; otherwise, read on!

Spring Data overview

Generally speaking, projects in the Spring Data family aim to reduce the amount of boilerplate code required to work with various persistence stores.

One of the core concepts in Spring Data is the repository, which was popularised by the Domain Driven Design (DDD) approach to software development. Spring Data enables developers to define generic repositories backed by any of the supported persistent data stores. Developers are no longer required to implement repositories manually, as the SD module supporting their database of choice can automatically generate repository implementations based on declared or derived queries (find out more about Spring Data).
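To make the repository idea concrete, here is a minimal sketch in plain Java of the kind of code Spring Data saves you from writing by hand. The `SimpleRepository` interface and `InMemoryRepository` class are hypothetical names invented for this illustration; they are not Spring Data types, and the map-backed storage simply stands in for a real data store.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical, simplified repository contract (NOT the Spring Data interface).
interface SimpleRepository<T, ID> {
    T save(T entity);

    Optional<T> findById(ID id);

    List<T> findAll();
}

// Hand-written implementation backed by a map; an assumption for illustration only.
final class InMemoryRepository<T, ID> implements SimpleRepository<T, ID> {

    private final Map<ID, T> store = new LinkedHashMap<>();
    private final Function<T, ID> idExtractor;

    InMemoryRepository(Function<T, ID> idExtractor) {
        this.idExtractor = idExtractor;
    }

    @Override
    public T save(T entity) {
        // Saving an entity with an existing id overwrites the previous value.
        store.put(idExtractor.apply(entity), entity);
        return entity;
    }

    @Override
    public Optional<T> findById(ID id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public List<T> findAll() {
        return new ArrayList<>(store.values());
    }

    public static void main(String[] args) {
        // Strings keyed by their length, purely to exercise the contract.
        SimpleRepository<String, Integer> repository = new InMemoryRepository<>(String::length);
        repository.save("ab");
        repository.save("abc");
        System.out.println(repository.findById(3).isPresent()); // prints true
    }
}
```

With Spring Data, only the interface would remain in your code base; the implementation is generated by the module for your data store.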

Another common pattern implemented in most SD modules is the data access template, which will be familiar to anyone who has used spring-jdbc or spring-orm. These templates typically simplify the execution of queries and operations against their associated data store.

Features of Spring Data Cassandra

Spring Data Cassandra brings features that are now familiar to developers using core Spring as well as other Spring Data modules:

  • JavaConfig and XML based configuration options to configure Cluster and Session objects.
  • Automatic repository implementation with the ability to add custom methods
  • Exception translation to Spring’s DataAccessException hierarchy

Additionally, there are a number of features specific to working with Cassandra:

  • Keyspace and table creation support
  • Support for synchronous and asynchronous operations (with callbacks)
  • CqlTemplate abstraction for CQL query level operations
  • Boilerplate-free POJO-style access to Cassandra through CassandraTemplate
  • Support for CQL Java DSL

Setting the stage

For our example, let’s imagine that we are using Cassandra to store time-series data that represent some generic system events, each consisting of a time based id, a type, a collection of arbitrary tags, as well as a time bucket attribute, which is a common technique used to efficiently store time-series data in Cassandra.

For our purposes, we are assuming that we are interested in querying events by ‘type’ and ‘time bucket’, and therefore we will use both attributes as a composite partition key, whereas the event id will serve as a clustering key (we are opting for descending clustering order to enable efficient retrieval of most recent events).
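As a rough illustration of the time bucket idea, the sketch below derives a bucket value in plain Java. It assumes one bucket per UTC day, formatted as yyyy-MM-dd (the format used in the examples later in this post); a real system might pick hourly or monthly buckets depending on write volume. The `TimeBuckets` class and its methods are hypothetical names, not part of SDC or the DataStax driver.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.UUID;

// Hypothetical helper: derives a daily time bucket for an event.
final class TimeBuckets {

    // Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    private TimeBuckets() {
    }

    // Formats an instant as a UTC day bucket such as "2014-01-01".
    static String bucketFor(Instant instant) {
        return instant.atOffset(ZoneOffset.UTC).toLocalDate().toString();
    }

    // Extracts the creation time embedded in a version 1 (time-based) UUID.
    // UUID.timestamp() throws UnsupportedOperationException for other UUID versions.
    static Instant instantOf(UUID timeBasedId) {
        long hundredNanosSinceUnixEpoch = timeBasedId.timestamp() - UUID_EPOCH_OFFSET;
        return Instant.ofEpochMilli(hundredNanosSinceUnixEpoch / 10_000L);
    }

    // Derives the bucket directly from an event's time-based id.
    static String bucketFor(UUID timeBasedId) {
        return bucketFor(instantOf(timeBasedId));
    }

    public static void main(String[] args) {
        System.out.println(bucketFor(Instant.parse("2014-01-01T10:15:30Z"))); // prints 2014-01-01
    }
}
```

Deriving the bucket from the event id keeps the two values consistent, so an event can never end up in a partition that does not match its own timestamp.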

Setting up Cassandra

I’m assuming you have a running Cassandra server/cluster already. If not, you can follow the Cassandra getting started tutorial to set that up.

Using cqlsh, create a keyspace called events and a table called event:

CREATE KEYSPACE events WITH replication = {'class':'SimpleStrategy', 'replication_factor':1};
USE events;
CREATE TABLE event (
  type text,
  bucket text,
  id timeuuid,
  tags set<text>,
  PRIMARY KEY ((type, bucket), id)
) WITH CLUSTERING ORDER BY (id DESC);
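One way to picture what this schema gives us is a map of partitions, each holding its rows sorted by the clustering key. The sketch below models that in plain Java under two simplifying assumptions: the clustering key is a plain millisecond timestamp rather than a timeuuid, and everything lives in memory. The `EventTableModel` class is a hypothetical name, and this is a mental model only, not a description of how Cassandra stores data on disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of the event table's logical layout.
final class EventTableModel {

    // Partition key (type, bucket) -> rows sorted by timestamp, newest first.
    private final Map<List<String>, NavigableMap<Long, String>> partitions = new HashMap<>();

    void insert(String type, String bucket, long timestamp, String payload) {
        partitions
                .computeIfAbsent(Arrays.asList(type, bucket),
                        key -> new TreeMap<Long, String>(Collections.<Long>reverseOrder()))
                .put(timestamp, payload);
    }

    // Returns up to 'limit' most recent payloads by reading from the head of one partition.
    List<String> latest(String type, String bucket, int limit) {
        List<String> result = new ArrayList<>();
        NavigableMap<Long, String> rows = partitions.get(Arrays.asList(type, bucket));
        if (rows == null) {
            return result;
        }
        for (String payload : rows.values()) {
            if (result.size() == limit) {
                break;
            }
            result.add(payload);
        }
        return result;
    }

    public static void main(String[] args) {
        EventTableModel table = new EventTableModel();
        table.insert("type1", "2014-01-01", 1L, "oldest");
        table.insert("type1", "2014-01-01", 3L, "newest");
        table.insert("type1", "2014-01-01", 2L, "middle");
        System.out.println(table.latest("type1", "2014-01-01", 2)); // prints [newest, middle]
    }
}
```

Because rows within a partition are already kept in descending order, fetching the most recent events only needs the first few entries of a single partition.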

Configuring access to Cassandra in your project

First, you need to include the Spring Data Cassandra dependency in your project, using your preferred build tool. At the time of writing, the latest version was org.springframework.data:spring-data-cassandra:1.1.0.RELEASE.

Now, you need to make sure that you have an appropriate cassandra.properties file on the classpath:

cassandra.contactpoints=127.0.0.1
cassandra.port=9042
cassandra.keyspace=events

In this example, we will use JavaConfig. You need to create a @Configuration class for setting up Spring beans for Cluster and Session instances. Spring Data Cassandra provides an AbstractCassandraConfiguration base class to reduce the configuration code needed:

@Configuration
@PropertySource(value = { "classpath:cassandra.properties" })
@EnableCassandraRepositories(basePackages = { "example" })
public class CassandraConfiguration extends AbstractCassandraConfiguration {

    @Autowired
    private Environment environment;

    @Bean
    public CassandraClusterFactoryBean cluster() {
        CassandraClusterFactoryBean cluster = new CassandraClusterFactoryBean();
        cluster.setContactPoints(environment.getProperty("cassandra.contactpoints"));
        cluster.setPort(Integer.parseInt(environment.getProperty("cassandra.port")));
        return cluster;
    }

    @Override
    protected String getKeyspaceName() {
        return environment.getProperty("cassandra.keyspace");
    }

    @Bean
    public CassandraMappingContext cassandraMapping() throws ClassNotFoundException {
        return new BasicCassandraMappingContext();
    }
}

CQL support with CqlTemplate

Spring Data Cassandra offers two types of data access templates. The lower level one, CqlTemplate (which implements CqlOperations), wraps a Cassandra Session and conveniently exposes operations to execute CQL statements. Usage patterns are very similar to those of JdbcTemplate, so if you have prior experience with JDBC you should feel right at home.

CqlTemplate supports both synchronous and asynchronous statement execution and works well with plain CQL statements, prepared statements and statements built using query building APIs available through the DataStax Java driver for Cassandra.

Here’s an example:

CqlOperations cqlTemplate = ...

cqlTemplate.execute("insert into event (id, type, bucket, tags) values (" + UUIDs.timeBased() + ", 'type1', '2014-01-01', {'tag2', 'tag3'})");

Insert insert1 = QueryBuilder.insertInto("event").value("id", UUIDs.timeBased())
.value("type", "type2").value("bucket", "2014-01-01").value("tags", ImmutableSet.of("tag1"));
cqlTemplate.execute(insert1);

Statement insert2 = cqlTemplate.getSession().prepare("insert into event (id, type, bucket, tags) values (?, ?, ?, ?)").bind(UUIDs.timeBased(), "type2", "2014-01-01", ImmutableSet.of("tag1", "tag2"));
cqlTemplate.execute(insert2);

ResultSet rs1 = cqlTemplate.query("select * from event where type='type2' and bucket='2014-01-01'");

Select select = QueryBuilder.select().from("event").where(QueryBuilder.eq("type", "type1")).and(QueryBuilder.eq("bucket", "2014-01-01")).limit(10);
ResultSet rs2 = cqlTemplate.query(select);

POJO support with CassandraTemplate

If you prefer to work with POJOs, CassandraTemplate offers just that by building on top of the basic CQL capabilities provided by CqlTemplate – it gives you the ability to work with Java objects while SDC takes care of query building and object mapping for you.

To take advantage of this feature your POJOs will have to be annotated to help SDC work out how to map objects to queries and query results back to objects. Continuing our example, let’s create a simple annotated Event class:

@Table
public class Event {

    @PrimaryKeyColumn(name = "id", ordinal = 2, type = PrimaryKeyType.CLUSTERED, ordering = Ordering.DESCENDING)
    private UUID id;
    @PrimaryKeyColumn(name = "type", ordinal = 0, type = PrimaryKeyType.PARTITIONED)
    private String type;
    @PrimaryKeyColumn(name = "bucket", ordinal = 1, type = PrimaryKeyType.PARTITIONED)
    private String bucket;
    @Column
    private Set<String> tags = new HashSet<String>();

    public Event(UUID id, String type, String bucket, Set<String> tags) {
        this.id = id;
        this.type = type;
        this.bucket = bucket;
        this.tags.addAll(tags);
    }

    public UUID getId() {
        return id;
    }

    public String getType() {
        return type;
    }

    public String getBucket() {
        return bucket;
    }

    public Set<String> getTags() {
        return tags;
    }
}

As you can see, SDC provides annotations to express class member to column mappings and conversions. It also supports working with collection columns and composite keys.

Now we can leverage the full power of CassandraTemplate:

CassandraOperations cassandraTemplate = ...

cassandraTemplate.insert(new Event(UUIDs.timeBased(), "type3", "2014-01-01", ImmutableSet.of("tag1", "tag3")));

Event oneEvent = cassandraTemplate.selectOne(select, Event.class);
List<Event> moreEvents = cassandraTemplate.select(select, Event.class);

CassandraTemplate provides the means to execute synchronous and asynchronous operations for insertion, modification and deletion of rows, as well as querying based on plain CQL (with result set mapping) or on select queries built using QueryBuilder.

Cassandra repositories

The last feature we will cover is the support for generic Spring Data repositories backed by Cassandra as a data store. Creating Cassandra-backed repositories is no different from creating their JPA or MongoDB counterparts, so this will look familiar if you already have experience with other SD modules.

First, you need your own repository interface, and the easiest way to go about creating one is to extend CassandraRepository with your desired entity type:

public interface EventRepository extends CassandraRepository<Event> {

    @Query("select * from event where type = ?0 and bucket=?1")
    Iterable<Event> findByTypeAndBucket(String type, String bucket);
}

SDC will generate a concrete implementation for you if you tell it where to find the interface definition. This is why we annotated the CassandraConfiguration class with @EnableCassandraRepositories and specified the base package to use for repository interface scanning.

Note that SDC currently only supports associating parametrised queries with user-defined repository operations (as you can see in EventRepository). The plan is to add support for deriving queries from repository method names in the future, in a similar fashion to the Spring Data JPA module.
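The ?0 and ?1 placeholders in the @Query annotation refer to the method arguments by position. The toy sketch below shows that idea in plain Java by substituting arguments into the query text. It is purely illustrative: the `PositionalQuery` class is a hypothetical name, it only knows how to render strings and values with a sensible toString(), and it makes no claim about how SDC binds parameters internally. For real queries, prefer the binding done by SDC or prepared statements over hand-rolled substitution.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of positional placeholders; not SDC's implementation.
final class PositionalQuery {

    // Matches ?0, ?1, ?12, ... and captures the argument index.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\?(\\d+)");

    private PositionalQuery() {
    }

    static String bind(String query, Object... args) {
        Matcher matcher = PLACEHOLDER.matcher(query);
        StringBuffer bound = new StringBuffer();
        while (matcher.find()) {
            int index = Integer.parseInt(matcher.group(1));
            if (index >= args.length) {
                throw new IllegalArgumentException("No argument for placeholder ?" + index);
            }
            matcher.appendReplacement(bound, Matcher.quoteReplacement(literal(args[index])));
        }
        matcher.appendTail(bound);
        return bound.toString();
    }

    // Renders strings as single-quoted literals, doubling embedded quotes.
    private static String literal(Object value) {
        if (value instanceof String) {
            return "'" + ((String) value).replace("'", "''") + "'";
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(bind("select * from event where type = ?0 and bucket=?1",
                "type1", "2014-01-01"));
        // prints select * from event where type = 'type1' and bucket='2014-01-01'
    }
}
```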

After all this hard work you can start using your newly defined repository:

EventRepository repository = ...

repository.save(new Event(UUIDs.timeBased(), "type1", "2014-01-01", ImmutableSet.of("tag1", "tag2")));

Iterable<Event> moreEvents = repository.findByTypeAndBucket("type1", "2014-01-01");

Summary

All in all, Spring Data Cassandra greatly simplifies working with Cassandra, especially when using abstractions such as CassandraTemplate or the repository support.

Unfortunately, some other, admittedly less essential, capabilities are not yet supported by SDC (although some are provisionally included in the upcoming 2.0.0 release). These include:

  • working with POJO class hierarchies
  • custom field mappings
  • pagination
  • auditing

It would also be great to add some Spring Boot integration for Cassandra to enable auto-configuration of SDC beans and inclusion of Cassandra configuration options in the default application.properties config.
