Open Credo

23 items found: Search results for "analytics" in all categories x

Data Analytics using Cassandra and Spark

March 23, 2017 | Cassandra, Data Analysis, Data Engineering

Data Analytics using Cassandra and Spark

In recent years, Cassandra has become one of the most widely used NoSQL databases: many of our clients use Cassandra for a variety of different purposes. This is no accident as it is a great datastore with nice scalability and performance characteristics.

However, adopting Cassandra as a single, one size fits all database has several downsides. The partitioned/distributed data storage model makes it difficult (and often very inefficient) to do certain types of queries or data analytics that are much more straightforward in a relational database.

Read More Read More

Building a Google analytics dashboard with Python3, Tornado and deploying it on OpenShift (for free)

August 5, 2015 | Data Analysis, Data Engineering

Building a Google analytics dashboard with Python3, Tornado and deploying it on OpenShift (for free)

A few weeks ago, we thought about building a Google analytics dashboard to give us easy access to certain elements of our Google Analytics web traffic. We saw some custom dashboards for bloggers, but nothing quite right for our goal, since we wanted the data on a big screen for everyone in the office to view.

Read More Read More

Watch ‘Elastic Analytics with Spark, Mesos and Docker’

May 13, 2015 | Software Consultancy

Watch ‘Elastic Analytics with Spark, Mesos and Docker’

Listen to Brenden Matthews discuss Elastic Analytics with Spark, Mesos and Docker as filmed at the most recent London Mesos User Group Meetup.

In this talk, Brenden Matthews discusses how he provided elastic analytics to Airbnb and how the Mesosphere DCOS can easily bring the same type of infrastructure to your own environments.

Read More Read More

Join the London Mesos User Group to find out more about Elastic Analytics
Use Case: A Kafka Backup Solution

February 22, 2024 | Blog, Kafka

Use Case: A Kafka Backup Solution

Check out Peter Vegh’s latest blog where he explores a bespoke Kafka backup framework, discussing the approach, architecture and design process to implement and deliver the solution.

Read More Read More

Let’s Flink on EKS: Data Lake Primer

November 22, 2023 | Blog, Data Analysis

Let’s Flink on EKS: Data Lake Primer

Check out the latest blog by Our Senior Consultant Howard Hill where he offers an engineer’s guide to streamlining real-time data using an open-model infrastructure.

 

Read More Read More

Making Sense of Data with RDF* vs. LPG

January 31, 2022 | Blog, Data Engineering

Making Sense of Data with RDF* vs. LPG

There are two camps of Graph database, one side is RDF, where they are strict with their format, and somewhat limited for their extensibility. The other side is LPG, where they can define labels to the relationships. With its recent extension, RDF now allows users to add properties, thus becoming RDF*. In this blog, Ebru explores the structural and performance differences between LPG and RDF*.

Read More Read More

A Pragmatic Introduction to Machine Learning for DevOps Engineers

January 23, 2018 | Data Engineering, DevOps

A Pragmatic Introduction to Machine Learning for DevOps Engineers

Machine Learning is a hot topic these days, as can be seen from search trends. It was the success of Deepmind and AlphaGo in 2016 that really brought machine learning to the attention of the wider community and the world at large.

Read More Read More

Riak, the Dynamo paper and life beyond Basho

August 8, 2017 | Cassandra

Riak, the Dynamo paper and life beyond Basho

Recently, the sad news has emerged that Basho, which developed the Riak distributed database, has gone into receivership. This would appear to present a problem for those who have adopted the commercial version of the Riak database (Riak KV) supported by Basho.

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

Detecting stolen AWS credential usage with Apache Spark – Webinar Recording

May 22, 2017 | Data Analysis

Detecting stolen AWS credential usage with Apache Spark – Webinar Recording

As a final piece of our recent blog series about Apache Spark on 16 May we have presented details of a use-case about using Spark Structured Streaming to generate real-time alerts of suspicious activity in an AWS-based infrastructure.

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

Testing a Spark Application

May 9, 2017 | Cassandra

Testing a Spark Application

Data analytics isn’t a field commonly associated with testing, but there’s no reason we can’t treat it like any other application. Data analytics services are often deployed in production, and production services should be properly tested. This post covers some basic approaches for the testing of Cassandra/Spark code. There will be some code examples, but the focus is on how to structure your code to ensure it is testable!

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

Deploy Spark with an Apache Cassandra cluster

May 2, 2017 | Cassandra, Data Engineering

Deploy Spark with an Apache Cassandra cluster

My recent blogpost I explored a few cases where using Cassandra and Spark together can be useful. My focus was on the functional behaviour of such a stack and what you need to do as a developer to interact with it. However, it did not describe any details about the infrastructure setup that is capable of running such Spark code or any deployment considerations. In this post, I will explore this in more detail and show some practical advice in how to deploy Spark and Apache Cassandra.

Read More Read More

New Blog Series: Spark – The Pragmatic Bits

April 25, 2017 | Cassandra, Data Analysis, Data Engineering

New Blog Series: Spark – The Pragmatic Bits

Apache Spark is a powerful open source processing engine which is fast becoming our technology of choice for data analytic projects here at OpenCredo. For many years now we have been helping our clients to practically implement and take advantage of various big data technologies including the like of Apache Cassandra amongst others.

 

 

Read More Read More

The Three ‘R’s of Distributed Event Processing

January 25, 2017 | Cassandra

The Three ‘R’s of Distributed Event Processing

One of the simplest and best-understood models of computation is the Finite State Machine (FSM). An FSM has fixed range of states it can be in, and is always in one of these states. When an input arrives, this triggers a transition in the FSM from its current state to the next state. There may be several possible transitions to several different states, and which transition is chosen depends on the input.

Read More Read More

Fulfilling the promise of Apache Cassandra performance

August 24, 2016 | Cassandra

Fulfilling the promise of Apache Cassandra performance

At OpenCredo we are seeing an increase in adoption of Apache Cassandra as a leading NoSQL database for managing large data volumes, but we have also seen many clients experiencing difficulty converting their high expectations into operational Cassandra performance. Here we present a high-level technical overview of the major strengths and limitations of Cassandra that we have observed over the last few years while helping our clients resolve the real-world issues that they have experienced.

Read More Read More

The Seven Deadly Sins of Microservices (Redux)

January 8, 2016 | Microservices

The Seven Deadly Sins of Microservices (Redux)

Many of our clients are in the process of investigating or implementing ‘microservices’, and a popular question we often get asked is “what’s the most common mistake you see when moving towards a microservice architecture?”. We’ve seen plenty of good things with this architectural pattern, but we have also seen a few recurring issues and anti-patterns, which I’m keen to share here.

Read More Read More

OpenCredo is now an Amazon Web Services APN Consulting Partner
(Spring) Booting Hazelcast

December 1, 2015 | Software Consultancy

(Spring) Booting Hazelcast

This post introduce some of the basic features of Hazelcast, some of its limitations, how to embed it in a Spring Boot application and write integration testings. This post is intended to be the first of a series about Hazelcast and its integration with Spring (Boot). Let’s start from the basics.

Read More Read More

OpenCredo: First DataStax Partner in the UK to achieve Cassandra Certification
How to Write Your Own Apache Mesos Framework?

February 16, 2015 | Software Consultancy

How to Write Your Own Apache Mesos Framework?

Apache Mesos is often explained as being a kernel for the data-centre; meaning that cluster resources (CPU, RAM, …) are tracked and offered to “user space” programs (i.e. frameworks) to do computations on the cluster.

Read More Read More

OpenCredo welcomes our new Digital Architecture Director, Pete Sheffield.
Webinar series on MongoDB within Financial Services
48 hours – The Pursuit of Proper Performance

February 13, 2012 | Software Consultancy

48 hours – The Pursuit of Proper Performance

Recently we were approached by a client to do some performance testing of the web application they had written. The budget allowed two days for this task. Ok. No problem. Yes, we can. Naturally I had one or another question though…

Read More Read More