Open Credo

44 items found: Search results for "overview" in all categories x

October 23, 2014 | Cassandra

Spring Data Cassandra Overview

Spring Data Cassandra (SDC) is a community project under the Spring Data (SD) umbrella that provides convenient and familiar APIs to work with Apache Cassandra.

Read More Read More

Spring Data Hadoop – Objective Overview

June 30, 2013 | Software Consultancy

Spring Data Hadoop – Objective Overview

Spring Data Hadoop (SDH) is a Spring offshoot project that allows the invocation and configuration of Hadoop tasks within a Spring application context. It offers support for Hadoop jobs, HBase, Pig, Hive, Cascading and additionally JSR-223 scripting for job preparation and tidy-up.

It is most suited for use in organisations with existing Spring applications or investment in Spring expertise. Some SDH features replicate functionality of tools in the Hadoop ecosystem that DevOps engineers who maintain a Hadoop cluster will be more familiar with.

 

Read More Read More

Use Case: A Kafka Backup Solution

February 22, 2024 | Blog, Kafka

Use Case: A Kafka Backup Solution

Check out Peter Vegh’s latest blog where he explores a bespoke Kafka backup framework, discussing the approach, architecture and design process to implement and deliver the solution.

Read More Read More

The Importance of Chunking in RAG

February 6, 2024 | AWS, Blog, OPA

The Importance of Chunking in RAG

Check out the latest blog by our Consultant, Tristan Hosken, as he explores Retrieval Augmented Generation (RAG). Tristan provides insights into advantages and disadvantages of RAG through hands-on experiments with AWS’s Bedrock and Azure’s OpenAI service.

Read More Read More

How to use LLMs to Generate Coherent Long-Form Content using Hierarchical Expansion

October 30, 2023 | Blog

How to use LLMs to Generate Coherent Long-Form Content using Hierarchical Expansion

As impressive as they are, Large Language Models (LLMs) face difficulties when creating long-form content, primarily due to token limitations and inconsistencies in the output over time.

Together with Livy.ai, we developed a “Hierarchical Expansion” method to address these challenges and better the quality, flow, and structure of the content produced. Read further to learn more!

Read More Read More

The 2023 Mayor’s Business Climate Challenge (BCC) – Part 2

September 19, 2023 | Blog, Culture, News

The 2023 Mayor’s Business Climate Challenge (BCC) – Part 2

In this blog we share the latest developments in our efforts to create a more sustainable business as part of The 2023 Mayor’s Business Climate Challenge.

Read More Read More

Ingesting Big Data into Neo4j – Part 3

March 9, 2023 | Blog, Data Analysis, Neo4j

Ingesting Big Data into Neo4j – Part 3

Check out the last part of Ebru Cucen and Fahran Wallace’s blog series, in which they discuss their experience ingesting 400 million nodes and a billion relationships into Neo4j and what they discovered along the way.

Read More Read More

Controlling Kafka Data Flows using Open Policy Agent

August 2, 2022 | Blog, Kafka

Controlling Kafka Data Flows using Open Policy Agent

Read Matt Farrow’s blog as he explores the potential for using Open Policy Agent to filter and mask data being sent to and read from Apache Kafka.

Read More Read More

Lunch & Learn: Organisational Transformation – Why it often fails, and tools for success

April 28, 2022 | Software Consultancy

Lunch & Learn: Organisational Transformation – Why it often fails, and tools for success

Watch our latest Lunch & Learn video as Simon Copsey provides an in-depth look at what causes Organisational Transformation failures and what can be done to ensure its success.

Read More Read More

Exploring How Policy-as-Code and OPA Fit into the K8s World

November 4, 2021 | Kubernetes

Exploring How Policy-as-Code and OPA Fit into the K8s World

We always read that ‘security is everyone’s responsibility’. For any organisation, big or small, security should always be the primary concern—not a mere afterthought. In terms of Kubernetes, securing a cluster is challenging because it has so many moving parts and, apart from securing our Kubernetes environment, we also want to control what an end-user can do in our cluster.

To achieve these goals, we can start with the built-in features provided by Kubernetes like Role-Based Access Control (RBAC), Network Policies, Secrets Management, and Pod Security Policies (PSP). But we know these features are not enough. For example, we may want specific policies like ‘all pods must have specific labels’. And even if we have the policies in place, the next big question is how to enforce them on our Kubernetes cluster in an easy and repeatable manner.

In this blog post, we’ll address this challenge and other questions pertaining to OPA and how it can integrate into Kubernetes.

Read More Read More

Kafka vs RabbitMQ: The Consumer-Driven Choice

July 20, 2021 | Blog, Data Engineering, Kafka

Kafka vs RabbitMQ: The Consumer-Driven Choice

Message and event-driven systems provide an array of benefits to organisations of all shapes and sizes. At their core, they help decouple producers and consumers so that each can work at their own pace without having to wait for the other – asynchronous processing at its best.

In fact, such systems enable a whole range of messaging patterns, offering varying levels of guarantees surrounding the processing and consumption options for clients. Take for example the publish/subscribe pattern, which enables one message to be broadcast and consumed by multiple consumers; or the competing consumer pattern, which enables a message to be processed once but with multiple concurrent consumers vying for the honour—essentially providing a way to distribute the load. The manner in which these patterns are actually realised however, depends a lot on the technology used, as each has its own approach and unique tradeoffs. 

In this article we will explore how this all applies to RabbitMQ and Apache Kafka, and how these two technologies differ, specifically from a message consumer’s perspective.

Read More Read More

Machine Learning at scale: first impressions of Kubeflow

April 20, 2021 | Data Engineering, Machine Learning, Software Consultancy

Machine Learning at scale: first impressions of Kubeflow

Our recent client was a Fintech who had ambitions to build a Machine Learning platform for real-time decision making. The client had significant Kubernetes proficiency, ran on the cloud, and had a strong preference for using free, open-source software over cloud-native offerings that come with lock-in. Several components were spiked with success (feature preparation with Apache Beam and Seldon for model serving performed particularly strongly). Kubeflow was one of the next technologies on our list of spikes, showing significant promise at the research stage and seemingly a good match for our client’s priorities and skills.

That platform slipped down the client’s priority list before completing the research for Kubeflow, so I wanted to see how that project might have turned out. Would Kubeflow have made the cut?

 

Read More Read More

Anthos – A Holistic Approach to your Hybrid Cloud initiative

February 17, 2021 | Blog, Cloud, Cloud Native, GCP, Open Source

Anthos – A Holistic Approach to your Hybrid Cloud initiative

Multi-cloud is rapidly becoming the cloud strategy of choice for enterprises looking to modernise their applications.

And the reason is simple – it gives them much more flexibility to host their workloads and data where it suits them best.

In this post, we focus on Google’s application modernisation solution Google Anthos and the role it can play in your cloud transformation strategy.

Read More Read More

Hacking Kubernetes on AWS (EKS) from a Mac

October 29, 2020 | Cloud, Kubernetes, Open Source

Hacking Kubernetes on AWS (EKS) from a Mac

While working with a client recently, we experienced some issues when attempting to make use of NLB external load balancer services when using AWS EKS. I wanted to investigate whether these issues had been fixed in the upstream GitHub Kubernetes repository, or if I could fix it myself, contributing back to the community in the process.

Read More Read More

3 Highlights from CloudNative London 2019 (Day 1)

October 1, 2019 | Cloud, Cloud Native, Culture

3 Highlights from CloudNative London 2019 (Day 1)

One of the benefits we have working at OpenCredo (OC) is the opportunity to both attend and speak (although not on this occasion) at conferences. For some of you, this may be pretty common, but OC was actually the first to offer me this as part of a broader learning and development plan.

Cloud-native development and delivery is a core area of expertise for OC and we are always looking for what’s new and interesting in this space. So when I was offered the chance to go to CloudNative London it seemed like a good place to start. With its diversity in topics and technologies, the conference provided a perfect opportunity to collaborate and hear from others in the industry and what they are doing in this space.

Read More Read More

Securing Kafka using Vault PKI

February 20, 2019 | DevOps, Hashicorp, Kafka, Open Source

Securing Kafka using Vault PKI

Creating and managing a Public Key Infrastructure (PKI) could be a very straightforward task if you use appropriate tools. In this blog post, I’ll cover the steps to easily set up a PKI with Vault from HashiCorp, and use it to secure a Kafka Cluster.

Read More Read More

The Power of the Architecture-driven Organisation

July 13, 2018 | Software Consultancy

The Power of the Architecture-driven Organisation

As a consultant I often find myself in situations that require tricky problem solving, typically of a technical nature. Yet although it is common to approach a consultancy engagement in terms of its technical context, not all problems have a purely technical solution.

Read More Read More

A Pragmatic Introduction to Machine Learning for DevOps Engineers

January 23, 2018 | Data Engineering, DevOps

A Pragmatic Introduction to Machine Learning for DevOps Engineers

Machine Learning is a hot topic these days, as can be seen from search trends. It was the success of Deepmind and AlphaGo in 2016 that really brought machine learning to the attention of the wider community and the world at large.

Read More Read More

Writing a custom JupyterHub Spawner

January 11, 2018 | Data Engineering

Writing a custom JupyterHub Spawner

The last few years have seen Python emerge as a lingua franca for data scientists. Alongside Python we have also witnessed the rise of Jupyter Notebooks, which are now considered a de facto data science productivity tool, especially in the Python community. Jupyter Notebooks started as a university side-project known as iPython in circa 2001 at UC Berkeley.

Read More Read More

Terraform Provider Development

August 9, 2017 | Cloud, DevOps, Terraform Provider

Terraform Provider Development

The recent 0.10.0 release of HashiCorp Terraform, saw a significant change to the way Providers are managed. Specifically, the single open source code repository for Terraform has been divided into core and multiple provider repositories.

 

Read More Read More

QCON London 2017 (Video): Monitoring Serverless Architectures
Testing a Spark Application

May 9, 2017 | Cassandra

Testing a Spark Application

Data analytics isn’t a field commonly associated with testing, but there’s no reason we can’t treat it like any other application. Data analytics services are often deployed in production, and production services should be properly tested. This post covers some basic approaches for the testing of Cassandra/Spark code. There will be some code examples, but the focus is on how to structure your code to ensure it is testable!

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

Deploy Spark with an Apache Cassandra cluster

May 2, 2017 | Cassandra, Data Engineering

Deploy Spark with an Apache Cassandra cluster

My recent blogpost I explored a few cases where using Cassandra and Spark together can be useful. My focus was on the functional behaviour of such a stack and what you need to do as a developer to interact with it. However, it did not describe any details about the infrastructure setup that is capable of running such Spark code or any deployment considerations. In this post, I will explore this in more detail and show some practical advice in how to deploy Spark and Apache Cassandra.

Read More Read More

Data Analytics using Cassandra and Spark

March 23, 2017 | Cassandra, Data Analysis, Data Engineering

Data Analytics using Cassandra and Spark

In recent years, Cassandra has become one of the most widely used NoSQL databases: many of our clients use Cassandra for a variety of different purposes. This is no accident as it is a great datastore with nice scalability and performance characteristics.

However, adopting Cassandra as a single, one size fits all database has several downsides. The partitioned/distributed data storage model makes it difficult (and often very inefficient) to do certain types of queries or data analytics that are much more straightforward in a relational database.

Read More Read More

Let’s Encrypt and Terraform – Getting free certificates for your infrastructure

January 24, 2017 | Cloud

Let’s Encrypt and Terraform – Getting free certificates for your infrastructure

This blog aims to provide an end to end example of how you can automatically request, generate and install a free HTTPS/TLS/SSL certificate from Let’s Encrypt using Terraform. Let’s Encrypt is a free, automated, and open certificate authority (CA) aiming to make it super easy (and free – did I say free!) for people to obtain HTTPS (SSL/TLS) certificates for their websites and infrastructure. Under the hood, Let’s Encrypt implements and leverages an emerging protocol called ACME to make all this magic happen, and it is this ACME protocol that powers the Terraform provider we will be using. For more information on how Let’s Encrypt and the ACME protocol actually work, please see how Let’s Encrypt works.

Read More Read More

Common Problems with Cassandra Tombstones

September 27, 2016 | Cassandra, Data Engineering

Common Problems with Cassandra Tombstones

If there is one thing to understand about Cassandra, it is the fact that it is optimised for writes. In Cassandra everything is a write including logical deletion of data which results in tombstones – special deletion records. We have noticed that lack of understanding of tombstones is often the root cause of production issues our clients experience with Cassandra. We have decided to share a compilation of the most common problems with Cassandra tombstones and some practical advice on solving them.

Read More Read More

How Not To Use Cassandra Like An RDBMS (and what will happen if you do)

September 15, 2016 | Cassandra

How Not To Use Cassandra Like An RDBMS (and what will happen if you do)

Cassandra isn’t a relational database management system, but it has some features that make it look a bit like one. Chief among these is CQL, a query language with an SQL-like syntax. CQL isn’t a bad thing in itself – in fact it’s very convenient – but it can be misleading since it gives developers the illusion that they are working with a familiar data model, when things are really very different under the hood.

Read More Read More

Patterns of Successful Cassandra Data Modelling

September 6, 2016 | Cassandra

Patterns of Successful Cassandra Data Modelling

A growing number of clients are asking OpenCredo for help with using Apache Cassandra and solving specific problems they encounter. Clients have different use cases, requirements, implementation and teams but experience similar issues. We have noticed that Cassandra data modelling problems are the most consistent cause of Cassandra failing to meet their expectations. Data modelling is one of the most complex areas of using Cassandra and has many considerations.

Read More Read More

Fulfilling the promise of Apache Cassandra performance

August 24, 2016 | Cassandra

Fulfilling the promise of Apache Cassandra performance

At OpenCredo we are seeing an increase in adoption of Apache Cassandra as a leading NoSQL database for managing large data volumes, but we have also seen many clients experiencing difficulty converting their high expectations into operational Cassandra performance. Here we present a high-level technical overview of the major strengths and limitations of Cassandra that we have observed over the last few years while helping our clients resolve the real-world issues that they have experienced.

Read More Read More

Microservices Manchester (#micromanchester) Conference Recap

July 8, 2016 | Microservices

Microservices Manchester (#micromanchester) Conference Recap

OpenCredo recently co-organised the first Microservices Manchester event with OliverBernard recruitment, and it was a resounding success. Over 100 people showed up at the Victoria Warehouse near Manchester’s trendy Salford Quays for a day discussing the realities of implementing microservice systems.

Read More Read More

Securing Terraform state with Vault

April 2, 2016 | Terraform Provider

Securing Terraform state with Vault

When it comes to automating the creation of infrastructure in cloud providers, Terraform (version at time of writing 0.6.14) has become one of my core go to tools in this space. It provides a fantastic declarative approach to describing the resources you want, and then takes care of making it so for you, keeping track of the state in either a local file or a remote store of some sort. Various bits of sensitive data is often provided as input to terraform.

Read More Read More

React/Redux boilerplate

March 3, 2016 | Software Consultancy

React/Redux boilerplate

In this post, I’ll be sharing some React/Redux boilerplate code that Vince Martinez and I have been developing recently. It’s primarily aimed at developers who are familiar with the React ecosystem, so if you are new to React and/or Redux, you might like to have a look at Getting Started with React and Getting Started with Redux.

Read More Read More

Kotlin: a new JVM language you should try

March 3, 2016 | Software Consultancy

Kotlin: a new JVM language you should try

JetBrains (the people behind IntelliJ IDEA) have recently announced  the first RC for version 1.0 of Kotlin, a new programming language for the JVM. I say ‘new’, but Kotlin has been in the making for a few years now, and has been used by JetBrains to develop several of their products, including Intellij IDEA. The company open-sourced Kotlin in 2011, and have worked with the community since then to make the language what it is today.

Read More Read More

ContainerSched 2016: The Container & Scheduler Conference – Call for Papers
JavaOne: Debugging Java Applications Running in Docker

November 3, 2015 | Software Consultancy

JavaOne: Debugging Java Applications Running in Docker

My JavaOne experience was rather busy this year, what with three talks presented in a single day! The first of these talks “Debugging Java Apps in Containers: No Heavy Welding Gear Required” was delivered with my regular co-presenter Steve Poole, from IBM, and we shared our combined experiences of working with Java and Docker over the past year.

Read More Read More

Join us at the Inaugural ContainerSched Conference

October 16, 2015 | Software Consultancy

Join us at the Inaugural ContainerSched Conference

Interested in Containers and Schedulers?

OpenCredo is helping Skillsmatter with the organisation of the inaugural ContainerSched conference, and we were last night in attendance at CodeNode, working our way to finalising the program alongside the Skillsmatter team. I’m pleased to say that the provisional lineup looks great (speaker acceptance emails are being sent out over the next few days), and so I wanted to share the details of some of the great content we have confirmed already.

Read More Read More

SaltStack – Using Consul as an External Pillar Source

September 14, 2015 | DevOps, Hashicorp, Open Source

SaltStack – Using Consul as an External Pillar Source

Recently I was working on a project that was using SaltStack for configuration management and Consul for service discovery. It occurred to me that using Consul’s key/value store would be great place to store data needed for my Salt runs, but unfortunately Consul was not supported in SaltStack as an official data store at that point in time. Being an open source project however, this provided an excellent opportunity to contribute back and this blog post looks to provide some details on how this works, as well as a practical demo on how you can take advantage of Consul as an external data store.

Read More Read More

ContainerSched 2015 – Call for Papers Now Open!

August 16, 2015 | Kubernetes

ContainerSched 2015 – Call for Papers Now Open!

ContainerSched 2015

ContainerShed 2015Over the last few years there has been exponential growth in the interest of containers and schedulers – technology such as Docker, rkt, Mesos, and Kubernetes are now commonplace within the IT domain, and with the rise of microservices, these technologies are set to become even more popular. Skillsmatter are keen to drive forward the discussions and knowledge sharing within this area of technology, and have created a conference focused exclusively on containers and schedulers: ContainerSched!

Read More Read More

Boot my (secure)->(gov) cloud

August 10, 2015 | Cloud, Software Consultancy

Boot my (secure)->(gov) cloud

As a company, we at OpenCredo are heavily involved in automation and devOps based work, with a keen focus on making this a seamless experience, especially in cloud based environments. We are currently working within HMRC, a UK government department to help make this a reality as part of a broader cloud broker ecosystem project. In this blog post, I look to provide some initial insight into some of the tools and techniques employed to achieve this for one particular use case namely:
With pretty much zero human intervention, bar initiating a process and providing some inputs, a development team from any location, should be able to run “something”, which, in the end, results in an isolated, secure set of fully configured VM’s being provisioned within a cloud provider (or providers) of choice.

Read More Read More

July 28, 2015 | Microservices

Documenting REST APIs – a tooling review

Recently I co-presented a talk at Goto Amsterdam on lessons learnt whilst developing with a Microservices architecture; one being the importance of defining and documenting your API contracts as early as possible in the development cycle. During the talk I mentioned a few API documentation tools that I’d used and, based on feedback and questions from attendees, I realised that this topic merited a blog post. So, the purpose of this is to introduce 5 tools which help with designing, testing and documenting APIs.

Read More Read More

Asynchronous Cloud bootstrapping with Terraform, Cloud-Init & Puppet

June 23, 2015 | Cloud, DevOps, Terraform Provider

Asynchronous Cloud bootstrapping with Terraform, Cloud-Init & Puppet

Working with OpenCredo clients, I’ve noticed that even if you are one of the few organisations that can boast ‘Infrastructure as Code’, perhaps it’s only true of your VMs, and likely you have ‘bootstrap problems’. What I mean by this, is that you require some cloud-infrastructure to already be in place before your VM automation can go to work.

Read More Read More

Round up from the Kubernetes London meetup / Vol.2
New features in Cassandra 2.0 – Lightweight Transactions on Insert

January 6, 2014 | Cassandra

New features in Cassandra 2.0 – Lightweight Transactions on Insert

The team over at Cucumber Pro recently posted a sneak peek on their blog, demonstrating some key features of their offering.

As more of a technical user of Cucumber, there isn’t much that’s new or ground-breaking for me – almost every feature is already available through your preferred IDE combined with a few plugins.

Read More Read More

A dive into saltstack

January 10, 2013 | DevOps

A dive into saltstack

Recently I have started looking into SaltStack as a solution that does both config management and orchestration. It is a relatively new project started in 2011, but it has a growing fanbase among Sys Admins and DevOps Engineers. In this blog post I will look into Salt as a promising alternative, and comparing it to Puppet as a way of exploring its basic set of features.

Read More Read More