October 23, 2014 | Cassandra
Spring Data Cassandra (SDC) is a community project under the Spring Data (SD) umbrella that provides convenient and familiar APIs to work with Apache Cassandra.
SDC’s first stable release (version 1.0.0) was in May 2014, and since then it has seen a good level of activity, with a number of regular releases over the last few months. The project currently supports Cassandra 2.x through the DataStax Java Driver (2.0.x).
Given that SDC builds on top of Spring Data, it is important to have some familiarity with the project. If you are comfortable with Spring Data’s basic concepts you can skip the next section; otherwise, read on!
Generally speaking, projects in the Spring Data family aim to reduce the amount of boilerplate code required to work with various persistence stores.
One of the core concepts in Spring Data is the repository, which was popularised back in the day through the Domain Driven Design (DDD) software development approach. Spring Data enables developers to define and implement generic repositories backed by any of the supported persistent data stores. Developers are no longer required to implement repositories manually, as the appropriate SD module supporting their database of choice can automatically generate repository implementations based on declared or derived queries (find out more about Spring Data).
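To make the repository idea concrete, here is a minimal plain-Java sketch of a generic repository contract with a hand-written in-memory implementation. The names SimpleRepository and InMemoryRepository are hypothetical and are not Spring Data types. The sketch only shows the kind of boilerplate that a Spring Data module would otherwise generate for a real data store.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical contract for illustration; NOT a Spring Data interface.
interface SimpleRepository<T, ID> {
    T save(T entity);

    T findOne(ID id);

    List<T> findAll();
}

// Hand-written implementation of the kind a Spring Data module would generate for you.
final class InMemoryRepository<T, ID> implements SimpleRepository<T, ID> {

    private final Map<ID, T> store = new LinkedHashMap<>();
    private final Function<T, ID> idExtractor;

    InMemoryRepository(Function<T, ID> idExtractor) {
        this.idExtractor = idExtractor;
    }

    @Override
    public T save(T entity) {
        // Entities with the same id overwrite each other, like an upsert.
        store.put(idExtractor.apply(entity), entity);
        return entity;
    }

    @Override
    public T findOne(ID id) {
        return store.get(id);
    }

    @Override
    public List<T> findAll() {
        return new ArrayList<>(store.values());
    }
}
```

Calling code depends only on the interface, so the backing store can change without touching it.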
Another common pattern implemented in most SD modules is the data access template, which will be familiar to anyone who’s used spring-jdbc and spring-orm. These templates typically simplify the execution of queries and operations against their associated data store.
Spring Data Cassandra brings features that are now familiar to developers using core Spring as well as other Spring Data modules:
DataAccessException hierarchy
Additionally, there are a number of features specific to working with Cassandra:
CqlTemplate abstraction for CQL query level operations
CassandraTemplate
For our example, let’s imagine that we are using Cassandra to store time-series data that represent some generic system events, each consisting of a time based id, a type, a collection of arbitrary tags, as well as a time bucket attribute, which is a common technique used to efficiently store time-series data in Cassandra.
For our purposes, we are assuming that we are interested in querying events by ‘type’ and ‘time bucket’, and therefore we will use both attributes as a composite partition key, whereas the event id will serve as a clustering key (we are opting for descending clustering order to enable efficient retrieval of most recent events).
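As a rough illustration of the time bucket idea, the sketch below derives a bucket value from an event timestamp. It assumes one bucket per UTC calendar day, which matches the shape of the '2014-01-01' values used later in this post. The post does not prescribe a granularity, and the TimeBuckets helper is hypothetical rather than part of SDC or the DataStax driver.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; not part of Spring Data Cassandra or the DataStax driver.
final class TimeBuckets {

    // Assumption: one bucket per UTC calendar day, formatted like '2014-01-01'.
    private static final DateTimeFormatter DAY_BUCKET =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    private TimeBuckets() {
    }

    // Maps an event timestamp to the bucket component of the partition key.
    static String dailyBucket(Instant eventTime) {
        return DAY_BUCKET.format(eventTime);
    }

    public static void main(String[] args) {
        // Events just before and at midnight UTC land in different partitions.
        System.out.println(dailyBucket(Instant.parse("2014-01-01T23:59:59Z")));
        System.out.println(dailyBucket(Instant.parse("2014-01-02T00:00:00Z")));
    }
}
```

Because the bucket is part of the partition key, all events of one type on one day share a partition, which keeps partitions bounded in size. A finer or coarser bucket would be a trade-off between partition size and the number of partitions a query has to touch.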
I’m assuming you have a running Cassandra server/cluster already. If not, you can follow the Cassandra getting started tutorial to set that up.
Using cqlsh, create a keyspace called events and a table called event:
CREATE KEYSPACE events WITH replication = {'class':'SimpleStrategy', 'replication_factor':1};
USE events;
CREATE TABLE event (
    type text,
    bucket text,
    id timeuuid,
    tags set<text>,
    PRIMARY KEY ((type, bucket), id)
) WITH CLUSTERING ORDER BY (id DESC);
First, you need to include the Spring Data Cassandra dependency in your project, using your preferred build tool. At the time of writing it was org.springframework.data:spring-data-cassandra:1.1.0.RELEASE.
Now, you need to make sure that you have an appropriate cassandra.properties file on the classpath:
cassandra.contactpoints=127.0.0.1
cassandra.port=9042
cassandra.keyspace=events
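The port arrives as text and has to be parsed, so a missing or misspelled key would only surface as a null or a NumberFormatException at startup. The following sketch shows one way to read the three keys with plain java.util.Properties and fail fast with a clearer message. The CassandraSettings class is a hypothetical illustration and is not part of SDC.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical value holder for the three keys used in this post; not an SDC class.
final class CassandraSettings {

    final String contactPoints;
    final int port;
    final String keyspace;

    private CassandraSettings(String contactPoints, int port, String keyspace) {
        this.contactPoints = contactPoints;
        this.port = port;
        this.keyspace = keyspace;
    }

    // Parses properties-format text and validates that every expected key is present.
    static CassandraSettings parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new CassandraSettings(
                required(props, "cassandra.contactpoints"),
                Integer.parseInt(required(props, "cassandra.port")),
                required(props, "cassandra.keyspace"));
    }

    // Fails with a descriptive message instead of returning null for a missing key.
    private static String required(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + key);
        }
        return value.trim();
    }
}
```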
In this example, we will use JavaConfig. You need to create a @Configuration class for setting up Spring beans for Cluster and Session instances. Spring Data Cassandra provides an AbstractCassandraConfiguration base class to reduce the configuration code needed:
@Configuration
@PropertySource(value = { "classpath:cassandra.properties" })
@EnableCassandraRepositories(basePackages = { "example" })
public class CassandraConfiguration extends AbstractCassandraConfiguration {

    @Autowired
    private Environment environment;

    @Bean
    public CassandraClusterFactoryBean cluster() {
        CassandraClusterFactoryBean cluster = new CassandraClusterFactoryBean();
        cluster.setContactPoints(environment.getProperty("cassandra.contactpoints"));
        cluster.setPort(Integer.parseInt(environment.getProperty("cassandra.port")));
        return cluster;
    }

    @Override
    protected String getKeyspaceName() {
        return environment.getProperty("cassandra.keyspace");
    }

    @Bean
    public CassandraMappingContext cassandraMapping() throws ClassNotFoundException {
        return new BasicCassandraMappingContext();
    }
}
Spring Data Cassandra offers two types of data access templates. The lower level one, CqlTemplate (implements CqlOperations), wraps a Cassandra Session and conveniently exposes operations to execute CQL statements. Usage patterns are very similar to those of JdbcTemplate, so if you have prior experience with JDBC you should feel right at home.
CqlTemplate supports both synchronous and asynchronous statement execution and works well with plain CQL statements, prepared statements and statements built using query building APIs available through the DataStax Java driver for Cassandra.
Here’s an example:
CqlOperations cqlTemplate = ...
cqlTemplate.execute("insert into event (id, type, bucket, tags) values (" + UUIDs.timeBased() + ", 'type1', '2014-01-01', {'tag2', 'tag3'})");
Insert insert1 = QueryBuilder.insertInto("event").value("id", UUIDs.timeBased())
.value("type", "type2").value("bucket", "2014-01-01").value("tags", ImmutableSet.of("tag1"));
cqlTemplate.execute(insert1);
Statement insert2 = cqlTemplate.getSession().prepare("insert into event (id, type, bucket, tags) values (?, ?, ?, ?)").bind(UUIDs.timeBased(), "type2", "2014-01-01", ImmutableSet.of("tag1", "tag2"));
cqlTemplate.execute(insert2);
ResultSet rs1 = cqlTemplate.query("select * from event where type='type2' and bucket='2014-01-01'");
Select select = QueryBuilder.select().from("event").where(QueryBuilder.eq("type", "type1")).and(QueryBuilder.eq("bucket", "2014-01-01")).limit(10);
ResultSet rs2 = cqlTemplate.query(select);
If you prefer to work with POJOs, CassandraTemplate offers just that by building on top of the basic CQL capabilities provided by CqlTemplate: it gives you the ability to work with Java objects while SDC takes care of query building and object mapping for you.
To take advantage of this feature your POJOs will have to be annotated to help SDC work out how to map objects to queries and query results back to objects. Continuing our example, let’s create a simple annotated Event class:
@Table
public class Event {

    @PrimaryKeyColumn(name = "id", ordinal = 2, type = PrimaryKeyType.CLUSTERED, ordering = Ordering.DESCENDING)
    private UUID id;

    @PrimaryKeyColumn(name = "type", ordinal = 0, type = PrimaryKeyType.PARTITIONED)
    private String type;

    @PrimaryKeyColumn(name = "bucket", ordinal = 1, type = PrimaryKeyType.PARTITIONED)
    private String bucket;

    @Column
    private Set<String> tags = new HashSet<String>();

    public Event(UUID id, String type, String bucket, Set<String> tags) {
        this.id = id;
        this.type = type;
        this.bucket = bucket;
        this.tags.addAll(tags);
    }

    public UUID getId() {
        return id;
    }

    public String getType() {
        return type;
    }

    public String getBucket() {
        return bucket;
    }

    public Set<String> getTags() {
        return tags;
    }
}
As you can see, SDC provides annotations to express class member to column mappings and conversions. It also supports working with collection columns and composite keys.
Now we can leverage the full power of CassandraTemplate:
CassandraOperations cassandraTemplate = ...
cassandraTemplate.insert(new Event(UUIDs.timeBased(), "type3", "2014-01-01", ImmutableSet.of("tag1", "tag3")));
Event oneEvent = cassandraTemplate.selectOne(select, Event.class);
List<Event> moreEvents = cassandraTemplate.select(select, Event.class);
CassandraTemplate provides means to execute synchronous and asynchronous operations for insertion, modification and deletion of rows, as well as querying based on plain CQL (with result set mapping) or select queries built using QueryBuilder.
The last feature we will cover is the support for generic Spring Data repositories backed by Cassandra as a data store. If you already have experience with other SD modules, you will find that creating Cassandra-backed repositories is no different from creating their JPA or MongoDB counterparts.
First, you need your own repository interface, and the easiest way to go about creating one is to extend CassandraRepository with your desired entity type:
public interface EventRepository extends CassandraRepository<Event> {

    @Query("select * from event where type = ?0 and bucket=?1")
    Iterable<Event> findByTypeAndBucket(String type, String bucket);
}
SDC will generate a concrete implementation for you if you tell it where to find the interface definition. This is why we annotated the CassandraConfiguration class with @EnableCassandraRepositories and specified the base package to use for repository interface scanning.
However, SDC currently only supports associating parametrised queries with user-defined repository operations (as you can see in EventRepository), but in the future the plan is to add support for deriving queries from repository method names, in a similar fashion to the JPA Spring Data module.
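To give a feel for what deriving a query from a method name means, here is a toy plain-Java sketch that turns a finder name into a parametrised CQL string with positional placeholders. It is an illustration only: the QueryDerivationSketch class is hypothetical, it handles nothing beyond "findBy" with "And"-separated properties, and it is not how Spring Data implements query derivation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; NOT Spring Data's query derivation mechanism.
final class QueryDerivationSketch {

    private static final String PREFIX = "findBy";

    private QueryDerivationSketch() {
    }

    // Turns a finder name such as "findByTypeAndBucket" into a parametrised CQL string.
    static String deriveCql(String table, String methodName) {
        if (!methodName.startsWith(PREFIX) || methodName.length() == PREFIX.length()) {
            throw new IllegalArgumentException("Unsupported method name: " + methodName);
        }
        String[] properties = methodName.substring(PREFIX.length()).split("And");
        List<String> predicates = new ArrayList<>();
        for (int i = 0; i < properties.length; i++) {
            if (properties[i].isEmpty()) {
                throw new IllegalArgumentException("Unsupported method name: " + methodName);
            }
            // Lower-case the first letter to get the column name, e.g. "Type" -> "type".
            String column = Character.toLowerCase(properties[i].charAt(0)) + properties[i].substring(1);
            // The placeholder index matches the position of the method parameter.
            predicates.add(column + " = ?" + i);
        }
        return "select * from " + table + " where " + String.join(" and ", predicates);
    }
}
```

For example, deriveCql("event", "findByTypeAndBucket") produces a query equivalent to the one declared by hand in the @Query annotation on EventRepository.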
After all this hard work you can start using your newly defined repository:
EventRepository repository = ...
repository.save(new Event(UUIDs.timeBased(), "type1", "2014-01-01", ImmutableSet.of("tag1", "tag2")));
Iterable<Event> moreEvents = repository.findByTypeAndBucket("type1", "2014-01-01");
All in all, Spring Data Cassandra simplifies working with Cassandra massively, especially when using abstractions such as CassandraTemplate or repository support.
Unfortunately, at the moment some other capabilities, that are admittedly less essential, are not supported by SDC (although some are provisionally included in the upcoming 2.0.0 release). These include:
It would also be great to add some Spring Boot integration for Cassandra to enable auto-configuration of SDC beans and inclusion of Cassandra configuration options in the default application.properties config.
This blog is written exclusively by the OpenCredo team. We do not accept external contributions.