Open Credo

Showing all posts by David Borsos

CockroachDB: First Impressions

June 15, 2017 | Data Engineering

CockroachDB: First Impressions

CockroachDB is a distributed SQL (“NewSQL”) database developed by Cockroach Labs and has recently reached a major milestone: the first production-ready, 1.0 release. We at OpenCredo have been following the progress of CockroachDB for a while, and we think it’s a technology of great potential to become the go-to solution for a having a general-purpose database in the cloud.

Read More Read More

Deploy Spark with an Apache Cassandra cluster

May 2, 2017 | Cassandra, Data Engineering

Deploy Spark with an Apache Cassandra cluster

My recent blogpost I explored a few cases where using Cassandra and Spark together can be useful. My focus was on the functional behaviour of such a stack and what you need to do as a developer to interact with it. However, it did not describe any details about the infrastructure setup that is capable of running such Spark code or any deployment considerations. In this post, I will explore this in more detail and show some practical advice in how to deploy Spark and Apache Cassandra.

Read More Read More

Data Analytics using Cassandra and Spark

March 23, 2017 | Cassandra, Data Analysis, Data Engineering

Data Analytics using Cassandra and Spark

In recent years, Cassandra has become one of the most widely used NoSQL databases: many of our clients use Cassandra for a variety of different purposes. This is no accident as it is a great datastore with nice scalability and performance characteristics.

However, adopting Cassandra as a single, one size fits all database has several downsides. The partitioned/distributed data storage model makes it difficult (and often very inefficient) to do certain types of queries or data analytics that are much more straightforward in a relational database.

Read More Read More

Google Cloud Spanner: our first impressions

March 7, 2017 | Data Analysis, GCP

Google Cloud Spanner: our first impressions

Google has recently made its internal Spanner database available to the wider public, as a hosted solution on Google Cloud. This is a distributed relational/transactional database used inside for various Google projects (including F1, the advertising backend), promising high throughput, low latency and 99.999% availability. As such it is an interesting alternative to many open source or other hosted solutions. This whitepaper gives a good theoretical introduction into Spanner.

Read More Read More

Everything you need to know about Cassandra Materialized Views

February 16, 2017 | Cassandra

Everything you need to know about Cassandra Materialized Views

One of the default Cassandra strategies to deal with more sophisticated queries is to create CQL tables that contain the data in a structure that matches the query itself (denormalization). Cassandra 3.0 introduces a new CQL feature, Materialized Views which captures this concept as a first-class construct.

Read More Read More

Versioning a Microservice System with git

March 2, 2016 | Microservices

Versioning a Microservice System with git

Microservice-style software architectures have many benefits: loose coupling, independent scalability, localised failures, facilitating the usage of polyglot data persistence tools or multiple programming languages.

However, they also introduce other challenges. A major one is the fact that the end-user functionality of the system will ultimately emerge as a composition of multiple services. This significantly increases the complexity of deploying the system. In addition, because we lose the concept of “versions” of the system, it becomes harder to answer questions like “what capabilities are in production?” and “when is a new feature considered ‘done’?”.

Read More Read More

Experiences with Spring Boot

February 24, 2014 | Cloud Native, Microservices

Experiences with Spring Boot

Last year some of us attended the London Spring eXchange where we encountered a new and interesting tool that Pivotal was working on: Spring Boot. Since then we had the opportunity to see what it’s capable of in a live project and we were deeply impressed.

Read More Read More

New features in Cassandra 2.0 – Lightweight Transactions on Update

February 17, 2014 | Cassandra

New features in Cassandra 2.0 – Lightweight Transactions on Update

In our previous posts we gave an overview of Cassandra’s new compare-and-set (lightweight transaction) commands and a more detailed look into the API for using them when inserting new rows into the database.

In this third post, we are going to cover update statements. We recommend reading the previous posts, as there are some details which are the same for inserts and updates which are not repeated here.

Read More Read More

New features in Cassandra 2.0 – Lightweight Transactions on Insert

January 6, 2014 | Cassandra

New features in Cassandra 2.0 – Lightweight Transactions on Insert

The team over at Cucumber Pro recently posted a sneak peek on their blog, demonstrating some key features of their offering.

As more of a technical user of Cucumber, there isn’t much that’s new or ground-breaking for me – almost every feature is already available through your preferred IDE combined with a few plugins.

Read More Read More

New features in Cassandra 2.0 – More on Lightweight Transactions

December 2, 2013 | Cassandra

New features in Cassandra 2.0 – More on Lightweight Transactions

Perhaps the most important of Cassandra’s selling points is its completely distributed architecture and its ability to easily extend the cluster with virtually any number of nodes. Implementing a classical RDBMS-style transaction consisting of “put locks on the database, modify the data, then commit the transaction”-style operations are simply not feasible in such an architecture (i.e. that doesn’t scale well).

Read More Read More

New features in Cassandra 2.0 – Lightweight Transactions and Triggers

November 14, 2013 | Cassandra

New features in Cassandra 2.0 – Lightweight Transactions and Triggers

Cassandra 2.0 was released in early September this year and came with some interesting new features, including “lightweight transactions” and triggers.

Despite the rising interest in the various non-relational databases in recent years, there are still numerous use-cases for which a relational database system is a better choice. The latest major release of Cassandra (version 2.0) provides some interesting features that aim to close this gap, and offers its fast and distributed storage engine enhanced with new options that will make users’ lives easier.

Read More Read More