Open Credo

40 items found: Search results for "storage" in all categories x

Use Case: A Kafka Backup Solution

February 22, 2024 | Blog, Kafka

Use Case: A Kafka Backup Solution

Check out Peter Vegh’s latest blog where he explores a bespoke Kafka backup framework, discussing the approach, architecture and design process to implement and deliver the solution.

Read More Read More

Let’s Flink on EKS: Data Lake Primer

November 22, 2023 | Blog, Data Analysis

Let’s Flink on EKS: Data Lake Primer

Check out the latest blog by Our Senior Consultant Howard Hill where he offers an engineer’s guide to streamlining real-time data using an open-model infrastructure.

 

Read More Read More

The Why’s, What’s and How’s of Kubernetes Operators

September 27, 2023 | Blog, Kubernetes

The Why’s, What’s and How’s of Kubernetes Operators

Learn to create your first Kubernetes operator by checking out our Senior Consultant Michal Tusnio’s latest blog, “Kubernetes Operators – Whys, Hows and Whats” where he takes you on a journey from zero to operator.

Read More Read More

Kafka: Navigating GDPR Compliance

June 8, 2023 | Blog, Kafka

Kafka: Navigating GDPR Compliance

Check out Greg Nuttall’s latest blog where he looks at the challenges posed by GDPR’s “Right to be Forgotten” in the context of Apache Kafka, and delves deeper into three strategies for overcoming them.

Read More Read More

DZone Repost: Testing Serverless Functions

February 11, 2022 | AWS, Cloud, GCP, Kubernetes, Microservices, Open Source, Software Consultancy

DZone Repost: Testing Serverless Functions

Serverless functions are easy to install and upload, but we can’t ignore the basics. This article looks at different strategies related to testing serverless functions.

Read More Read More

Making Sense of Data with RDF* vs. LPG

January 31, 2022 | Blog, Data Engineering

Making Sense of Data with RDF* vs. LPG

There are two camps of Graph database, one side is RDF, where they are strict with their format, and somewhat limited for their extensibility. The other side is LPG, where they can define labels to the relationships. With its recent extension, RDF now allows users to add properties, thus becoming RDF*. In this blog, Ebru explores the structural and performance differences between LPG and RDF*.

Read More Read More

What you might have missed in Kubernetes 1.22 release

December 5, 2021 | Cloud, Kubernetes

What you might have missed in Kubernetes 1.22 release

Kubernetes’ second release in 2021, version 1.22, has been out for a little while now and with 1.23 on its way, we thought we’d take a look back. Kubernetes 1.22 was a highly comprehensive release with 53 enhancements in all three graduation levels: 13 features have graduated to stable, 24 enhancements reached beta status, and 16 new features have been accepted into the alpha stage. 

The latest version has some noteworthy security features such as running Kubelet without root access, pod security policies, and seccomp. There are also a couple of deprecated and removed APIs. In this blog, we’ll discuss the significant changes in v1.22, as well as how to handle the removed APIs.

Read More Read More

Kafka vs RabbitMQ: The Consumer-Driven Choice

July 20, 2021 | Blog, Data Engineering, Kafka

Kafka vs RabbitMQ: The Consumer-Driven Choice

Message and event-driven systems provide an array of benefits to organisations of all shapes and sizes. At their core, they help decouple producers and consumers so that each can work at their own pace without having to wait for the other – asynchronous processing at its best.

In fact, such systems enable a whole range of messaging patterns, offering varying levels of guarantees surrounding the processing and consumption options for clients. Take for example the publish/subscribe pattern, which enables one message to be broadcast and consumed by multiple consumers; or the competing consumer pattern, which enables a message to be processed once but with multiple concurrent consumers vying for the honour—essentially providing a way to distribute the load. The manner in which these patterns are actually realised however, depends a lot on the technology used, as each has its own approach and unique tradeoffs. 

In this article we will explore how this all applies to RabbitMQ and Apache Kafka, and how these two technologies differ, specifically from a message consumer’s perspective.

Read More Read More

Machine Learning at scale: first impressions of Kubeflow

April 20, 2021 | Data Engineering, Machine Learning, Software Consultancy

Machine Learning at scale: first impressions of Kubeflow

Our recent client was a Fintech who had ambitions to build a Machine Learning platform for real-time decision making. The client had significant Kubernetes proficiency, ran on the cloud, and had a strong preference for using free, open-source software over cloud-native offerings that come with lock-in. Several components were spiked with success (feature preparation with Apache Beam and Seldon for model serving performed particularly strongly). Kubeflow was one of the next technologies on our list of spikes, showing significant promise at the research stage and seemingly a good match for our client’s priorities and skills.

That platform slipped down the client’s priority list before completing the research for Kubeflow, so I wanted to see how that project might have turned out. Would Kubeflow have made the cut?

 

Read More Read More

Decision time with AWS Keyspaces

September 22, 2020 | AWS, Blog, Cassandra, Cloud, DevOps, Open Source

Decision time with AWS Keyspaces

With the upcoming Cassandra 4.0 release, there is a lot to look forward to. Most excitingly, and following a refreshing realignment of the Open Source community around Cassandra, the next release promises to focus on fundamentals: stability, repair, observability, performance and scaling.

We must set this against the fact that Cassandra ranks pretty highly in the Stack Overflow most dreaded databases list and the reality that Cassandra is expensive to configure, operate and maintain. Finding people who have the prerequisite skills to do so is challenging.

Read More Read More

Three Highlights from CloudNative London Conference Day 3

October 8, 2019 | Cloud, Cloud Native, Culture

Three Highlights from CloudNative London Conference Day 3

Following on from the last two blogs by Stuart (who shared highlights for day 1) and Trent (who shared highlights from day 2), I will conclude with mine on CloudNative London 2019 Day 3.

The Cloud Native landscape can be bewildering, and not only for newcomers. As a traveller on the Cloud-native journey, I have sometimes been overwhelmed by the number of products and projects. This is why I took hold of the opportunity to go to Day 3 of the Cloud Native London Conference last month hosted by Skills Matter.

Here are my top highlights from Day 3 of the CloudNative London 2019.

Read More Read More

Kafka Connect – Source Connectors: A detailed guide to connecting to what you love.

July 30, 2019 | Blog, Kafka

Kafka Connect – Source Connectors: A detailed guide to connecting to what you love.

Writing your own Kafka source connectors with Kafka Connect. In this blog, Rufus takes you on a code walk, through the Gold Verified Venafi Connector while pointing out the common pitfalls

Read More Read More

HA Prometheus – The Thanos Evolution

February 5, 2019 | Cloud Native, DevOps, Kubernetes, Microservices, Open Source

HA Prometheus – The Thanos Evolution

While Prometheus has fast become the standard for monitoring in the cloud, making Prometheus highly available can be tricky. This blog post will walk you through how to do this using the open source tool Thanos.

Read More Read More

Heuristics for Identifying Service Boundaries

May 16, 2018 | Microservices

Heuristics for Identifying Service Boundaries

To identify service boundaries, it is not enough to consider (business) domains only. Other forces like organisational communication structures, and – very important – time, strongly suggest that we should include several other criteria in our considerations.

Read More Read More

Google Cloud Functions with Terraform

April 5, 2018 | Cloud, DevOps, Hashicorp, Terraform Provider

Google Cloud Functions with Terraform

Google Cloud Functions is the Google Cloud Platform (GCP) function-as-a-service offering. It allows you to execute your code in response to event triggers – HTTP, PubSub and Storage. While it currently only supports Node.js code for execution, it has proved very useful for running low-frequency operational tasks and other batch jobs in GCP.

Read More Read More

Events and Commands: Two Faces of the Same Coin?

March 14, 2018 | Data Engineering, Microservices

Events and Commands: Two Faces of the Same Coin?

Events are obviously the fundamental building block of event-sourced systems. Commands are equally a common concept in such systems although the distinction between events and commands, if any, is not always clear. There are certainly varying views on what role each one should play.

Read More Read More

Fargate As An Enabler For Serverless Continuous Delivery

February 14, 2018 | Cloud

Fargate As An Enabler For Serverless Continuous Delivery

AWS Announced a few new products for use with containers at RE:Invent 2017 and of particular interest to me was a new Elastic Container Service(ECS) Launch type, called Fargate

Prior to Fargate, when it came to creating a continuous delivery pipeline in AWS, the use of containers through ECS in its standard form, was the closest you could get to an always up, hands off, managed style of setup. Traditionally ECS has allowed you to create a configured pool of “worker” instances, with it then acting as a scheduler, provisioning containers on those instances.

 

Read More Read More

Riak, the Dynamo paper and life beyond Basho

August 8, 2017 | Cassandra

Riak, the Dynamo paper and life beyond Basho

Recently, the sad news has emerged that Basho, which developed the Riak distributed database, has gone into receivership. This would appear to present a problem for those who have adopted the commercial version of the Riak database (Riak KV) supported by Basho.

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

OpenCredo Cloud Report: July 2017

July 11, 2017 | Cloud, Cloud Native

OpenCredo Cloud Report: July 2017

Over the years, OpenCredo’s projects have become increasingly tied to the public cloud. Our skills in delivering cloud infrastructure and cloud native applications have deepened and the range of cloud projects we are able to take on has grown. From enterprise cloud brokers to cloud platform migration in restricted compliance environments, our ability to deliver on the cloud is now a core component of our value proposition.

Read More Read More

CockroachDB: First Impressions

June 15, 2017 | Data Engineering

CockroachDB: First Impressions

CockroachDB is a distributed SQL (“NewSQL”) database developed by Cockroach Labs and has recently reached a major milestone: the first production-ready, 1.0 release. We at OpenCredo have been following the progress of CockroachDB for a while, and we think it’s a technology of great potential to become the go-to solution for a having a general-purpose database in the cloud.

Read More Read More

Data Analytics using Cassandra and Spark

March 23, 2017 | Cassandra, Data Analysis, Data Engineering

Data Analytics using Cassandra and Spark

In recent years, Cassandra has become one of the most widely used NoSQL databases: many of our clients use Cassandra for a variety of different purposes. This is no accident as it is a great datastore with nice scalability and performance characteristics.

However, adopting Cassandra as a single, one size fits all database has several downsides. The partitioned/distributed data storage model makes it difficult (and often very inefficient) to do certain types of queries or data analytics that are much more straightforward in a relational database.

Read More Read More

Programmable Infrastructure Needs Testing Too

March 20, 2017 | DevOps

Programmable Infrastructure Needs Testing Too

DevOps has swept the tech landscape. Now, many are discovering the benefits of programmable infrastructure. I have been lucky to work on many projects where we’ve taken advantage of tools such as Terraform, Ansible, or Chef.

 

Read More Read More

Google Cloud Spanner: our first impressions

March 7, 2017 | Data Analysis, GCP

Google Cloud Spanner: our first impressions

Google has recently made its internal Spanner database available to the wider public, as a hosted solution on Google Cloud. This is a distributed relational/transactional database used inside for various Google projects (including F1, the advertising backend), promising high throughput, low latency and 99.999% availability. As such it is an interesting alternative to many open source or other hosted solutions. This whitepaper gives a good theoretical introduction into Spanner.

Read More Read More

The Three ‘R’s of Distributed Event Processing

January 25, 2017 | Cassandra

The Three ‘R’s of Distributed Event Processing

One of the simplest and best-understood models of computation is the Finite State Machine (FSM). An FSM has fixed range of states it can be in, and is always in one of these states. When an input arrives, this triggers a transition in the FSM from its current state to the next state. There may be several possible transitions to several different states, and which transition is chosen depends on the input.

Read More Read More

Common Problems with Cassandra Tombstones

September 27, 2016 | Cassandra, Data Engineering

Common Problems with Cassandra Tombstones

If there is one thing to understand about Cassandra, it is the fact that it is optimised for writes. In Cassandra everything is a write including logical deletion of data which results in tombstones – special deletion records. We have noticed that lack of understanding of tombstones is often the root cause of production issues our clients experience with Cassandra. We have decided to share a compilation of the most common problems with Cassandra tombstones and some practical advice on solving them.

Read More Read More

How Not To Use Cassandra Like An RDBMS (and what will happen if you do)

September 15, 2016 | Cassandra

How Not To Use Cassandra Like An RDBMS (and what will happen if you do)

Cassandra isn’t a relational database management system, but it has some features that make it look a bit like one. Chief among these is CQL, a query language with an SQL-like syntax. CQL isn’t a bad thing in itself – in fact it’s very convenient – but it can be misleading since it gives developers the illusion that they are working with a familiar data model, when things are really very different under the hood.

Read More Read More

Patterns of Successful Cassandra Data Modelling

September 6, 2016 | Cassandra

Patterns of Successful Cassandra Data Modelling

A growing number of clients are asking OpenCredo for help with using Apache Cassandra and solving specific problems they encounter. Clients have different use cases, requirements, implementation and teams but experience similar issues. We have noticed that Cassandra data modelling problems are the most consistent cause of Cassandra failing to meet their expectations. Data modelling is one of the most complex areas of using Cassandra and has many considerations.

Read More Read More

New Blog Series: Cassandra – What You May Learn the Hard Way

August 26, 2016 | Cassandra

New Blog Series: Cassandra – What You May Learn the Hard Way

At OpenCredo we have been working with Cassandra since 2012 and we are big fans of both open source Apache Cassandra and the capabilities of DataStax Enterprise. Over the years we have collected a great deal of experience throughout the company on how to deliver the benefits of Cassandra in real world projects and have also seen some common pitfalls that businesses have fallen into.

Read More Read More

Fulfilling the promise of Apache Cassandra performance

August 24, 2016 | Cassandra

Fulfilling the promise of Apache Cassandra performance

At OpenCredo we are seeing an increase in adoption of Apache Cassandra as a leading NoSQL database for managing large data volumes, but we have also seen many clients experiencing difficulty converting their high expectations into operational Cassandra performance. Here we present a high-level technical overview of the major strengths and limitations of Cassandra that we have observed over the last few years while helping our clients resolve the real-world issues that they have experienced.

Read More Read More

The Concursus Programming Model: Under the Hood

April 29, 2016 | Software Consultancy

The Concursus Programming Model: Under the Hood

In the previous two posts (part 1, and part 2), we looked at how Concursus uses method mapping to generate events from method calls on proxies, and to dispatch events to matching methods on event handlers and state class instances. This approach provides a concise, convenient client API to the Concursus event system; however the core of the system defines events and event-handling mechanisms without reference to any of the reflection-based machinery used to implement this API. It is perfectly possible (if comparatively cumbersome) to use a Concursus event store to read and write events without using reflection. In this post I’ll show how this is done, continuing with the “lightbulb” example introduced previously.

Read More Read More

Securing Terraform state with Vault

April 2, 2016 | Terraform Provider

Securing Terraform state with Vault

When it comes to automating the creation of infrastructure in cloud providers, Terraform (version at time of writing 0.6.14) has become one of my core go to tools in this space. It provides a fantastic declarative approach to describing the resources you want, and then takes care of making it so for you, keeping track of the state in either a local file or a remote store of some sort. Various bits of sensitive data is often provided as input to terraform.

Read More Read More

What’s new in Akka Persistence 2.4

October 28, 2015 | Software Consultancy

What’s new in Akka Persistence 2.4

Let’s have a quick look at the most interesting changes and new features that are now available to Akka users. As there are many new features to highlight in the new Akka release I will focus on those related to Akka Persistence first and cover other areas in a separate post.

Read More Read More

Join us at the Inaugural ContainerSched Conference

October 16, 2015 | Software Consultancy

Join us at the Inaugural ContainerSched Conference

Interested in Containers and Schedulers?

OpenCredo is helping Skillsmatter with the organisation of the inaugural ContainerSched conference, and we were last night in attendance at CodeNode, working our way to finalising the program alongside the Skillsmatter team. I’m pleased to say that the provisional lineup looks great (speaker acceptance emails are being sent out over the next few days), and so I wanted to share the details of some of the great content we have confirmed already.

Read More Read More

Microservice Platforms: Some Assembly [Still] Required. Part Two

September 20, 2015 | Microservices

Microservice Platforms: Some Assembly [Still] Required. Part Two

Working Locally with Microservices

Over the past five years I have worked within several projects that used a ‘microservice’-based architecture, and one constant issue I have encountered is the absence of standardised patterns for local development and ‘off the shelf’ development tooling that support this. When working with monoliths we have become quite adept at streamlining the development, build, test and deploy cycles. Development tooling to help with these processes is also readily available (and often integrated with our IDEs). For example, many platforms provide ‘hot reloading’ for viewing the effects of code changes in near-real time, automated execution of tests, regular local feedback from continuous integration servers, and tooling to enable the creation of a local environment that mimics the production stack.

Read More Read More

New Tricks With Dynamic Proxies In Java 8 (part 3)

August 18, 2015 | Software Consultancy

New Tricks With Dynamic Proxies In Java 8 (part 3)

In this post, the last in the New Tricks With Dynamic Proxies series (see part 1 and part 2), I’m going to look at using dynamic proxies to create bean-like value objects to represent records. The basic idea here is to have some untyped storage for a collection of property values, such as an array of Objects, and a typed wrapper around that storage which provides a convenient and type-safe access mechanism. A dynamic proxy is used to convert calls on getter and setter methods in the wrapper interface into calls which read and write values in the store.

Read More Read More

ContainerSched 2015 – Call for Papers Now Open!

August 16, 2015 | Kubernetes

ContainerSched 2015 – Call for Papers Now Open!

ContainerSched 2015

ContainerShed 2015Over the last few years there has been exponential growth in the interest of containers and schedulers – technology such as Docker, rkt, Mesos, and Kubernetes are now commonplace within the IT domain, and with the rise of microservices, these technologies are set to become even more popular. Skillsmatter are keen to drive forward the discussions and knowledge sharing within this area of technology, and have created a conference focused exclusively on containers and schedulers: ContainerSched!

Read More Read More

Building a Google analytics dashboard with Python3, Tornado and deploying it on OpenShift (for free)

August 5, 2015 | Data Analysis, Data Engineering

Building a Google analytics dashboard with Python3, Tornado and deploying it on OpenShift (for free)

A few weeks ago, we thought about building a Google analytics dashboard to give us easy access to certain elements of our Google Analytics web traffic. We saw some custom dashboards for bloggers, but nothing quite right for our goal, since we wanted the data on a big screen for everyone in the office to view.

Read More Read More

Call for Thoughts
A Simple Introduction to Complex Event Processing – Stock Ticker End-to-End Sample

February 8, 2012 | Data Analysis, Data Engineering

A Simple Introduction to Complex Event Processing – Stock Ticker End-to-End Sample

Most of the important players in this space are large IT corporations like Oracle and IBM with their commercial (read expensive) offerings.

While most of CEP products offer some great features, it’s license model and close code policy doesn’t allow developers to play with them on pet projects, which would drive adoption and usage of CEP in every day programming.

Read More Read More

Aleksa Vukotic of OpenCredo to talk at upcoming NOSQL Exchange