Open Credo

50 items found: Search results for "series" in all categories x

New Blog Series: Spark – The Pragmatic Bits

April 25, 2017 | Cassandra, Data Analysis, Data Engineering

New Blog Series: Spark – The Pragmatic Bits

Apache Spark is a powerful open source processing engine which is fast becoming our technology of choice for data analytic projects here at OpenCredo. For many years now we have been helping our clients to practically implement and take advantage of various big data technologies including the like of Apache Cassandra amongst others.

 

 

Read More Read More

New Blog Series: Cassandra – What You May Learn the Hard Way

August 26, 2016 | Cassandra

New Blog Series: Cassandra – What You May Learn the Hard Way

At OpenCredo we have been working with Cassandra since 2012 and we are big fans of both open source Apache Cassandra and the capabilities of DataStax Enterprise. Over the years we have collected a great deal of experience throughout the company on how to deliver the benefits of Cassandra in real world projects and have also seen some common pitfalls that businesses have fallen into.

Read More Read More

Webinar series on MongoDB within Financial Services
The 2023 Mayor’s Business Climate Challenge (BCC) – Final Part

March 5, 2024 | Blog, Culture, News

The 2023 Mayor’s Business Climate Challenge (BCC) – Final Part

Learn more about our efforts and our progress towards becoming an environmentally friendly company for the Mayor’s Business Climate Challenge (BCC) in 2023 in this final update.

Read More Read More

Ingesting Big Data into Neo4j – Part 3

March 9, 2023 | Blog, Data Analysis, Neo4j

Ingesting Big Data into Neo4j – Part 3

Check out the last part of Ebru Cucen and Fahran Wallace’s blog series, in which they discuss their experience ingesting 400 million nodes and a billion relationships into Neo4j and what they discovered along the way.

Read More Read More

Ingesting Big Data into Neo4j – Part 2

February 16, 2023 | Blog, Data Analysis, Neo4j

Ingesting Big Data into Neo4j – Part 2

Check out Part 2 of Ebru Cucen and Fahran Wallace’s blog series, in which they discuss their experience ingesting 400 million nodes and a billion relationships into Neo4j and what they discovered along the way.

Read More Read More

Ingesting Big Data into Neo4j – Part 1

January 26, 2023 | Blog, Data Analysis, Neo4j

Ingesting Big Data into Neo4j – Part 1

Fahran Wallace and Ebru Cucen’s most recent blog post is part 1 of a three-part series. They investigate how OpenCredo ingested 400 million nodes with a billion relationships into Neo4j.

Read More Read More

PlatformCon 2022: People, Process, and Platform – a community-focused approach

May 26, 2022

PlatformCon 2022: People, Process, and Platform – a community-focused approach

Our CEO/CTO Nicki Watt shares about her upcoming talk at PlatformCon 2022.

Read More Read More

Anthos – A Holistic Approach to your Hybrid Cloud initiative

February 17, 2021 | Blog, Cloud, Cloud Native, GCP, Open Source

Anthos – A Holistic Approach to your Hybrid Cloud initiative

Multi-cloud is rapidly becoming the cloud strategy of choice for enterprises looking to modernise their applications.

And the reason is simple – it gives them much more flexibility to host their workloads and data where it suits them best.

In this post, we focus on Google’s application modernisation solution Google Anthos and the role it can play in your cloud transformation strategy.

Read More Read More

Writing a custom JupyterHub Spawner

January 11, 2018 | Data Engineering

Writing a custom JupyterHub Spawner

The last few years have seen Python emerge as a lingua franca for data scientists. Alongside Python we have also witnessed the rise of Jupyter Notebooks, which are now considered a de facto data science productivity tool, especially in the Python community. Jupyter Notebooks started as a university side-project known as iPython in circa 2001 at UC Berkeley.

Read More Read More

Riak, the Dynamo paper and life beyond Basho

August 8, 2017 | Cassandra

Riak, the Dynamo paper and life beyond Basho

Recently, the sad news has emerged that Basho, which developed the Riak distributed database, has gone into receivership. This would appear to present a problem for those who have adopted the commercial version of the Riak database (Riak KV) supported by Basho.

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

Detecting stolen AWS credential usage with Apache Spark – Webinar Recording

May 22, 2017 | Data Analysis

Detecting stolen AWS credential usage with Apache Spark – Webinar Recording

As a final piece of our recent blog series about Apache Spark on 16 May we have presented details of a use-case about using Spark Structured Streaming to generate real-time alerts of suspicious activity in an AWS-based infrastructure.

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

Testing a Spark Application

May 9, 2017 | Cassandra

Testing a Spark Application

Data analytics isn’t a field commonly associated with testing, but there’s no reason we can’t treat it like any other application. Data analytics services are often deployed in production, and production services should be properly tested. This post covers some basic approaches for the testing of Cassandra/Spark code. There will be some code examples, but the focus is on how to structure your code to ensure it is testable!

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

Deploy Spark with an Apache Cassandra cluster

May 2, 2017 | Cassandra, Data Engineering

Deploy Spark with an Apache Cassandra cluster

My recent blogpost I explored a few cases where using Cassandra and Spark together can be useful. My focus was on the functional behaviour of such a stack and what you need to do as a developer to interact with it. However, it did not describe any details about the infrastructure setup that is capable of running such Spark code or any deployment considerations. In this post, I will explore this in more detail and show some practical advice in how to deploy Spark and Apache Cassandra.

Read More Read More

Data Analytics using Cassandra and Spark

March 23, 2017 | Cassandra, Data Analysis, Data Engineering

Data Analytics using Cassandra and Spark

In recent years, Cassandra has become one of the most widely used NoSQL databases: many of our clients use Cassandra for a variety of different purposes. This is no accident as it is a great datastore with nice scalability and performance characteristics.

However, adopting Cassandra as a single, one size fits all database has several downsides. The partitioned/distributed data storage model makes it difficult (and often very inefficient) to do certain types of queries or data analytics that are much more straightforward in a relational database.

Read More Read More

Automating Your Security Acceptance Tests

March 23, 2017 | Data Engineering, Machine Learning

Automating Your Security Acceptance Tests

On previous blog posts we have provided examples of different types of acceptance tests coverage, UI, API and Performance. One area where automation is often lacking is around validating the security of the application under test. This has been discussed in the post on non functional testing You Are Ignoring Non-functional Testing. With this post we will enhance the automation framework to quickly check for some common security flaws.

Read More Read More

Reactive event processing with Reactor Core: a first look

January 26, 2017 | Data Engineering

Reactive event processing with Reactor Core: a first look

Suppose you are given the task of writing code that fulfils the following contract:

  • You will be given a promise that, at some point in the future, some data – a series of values – will become available.
  • In return, you will supply a promise that, at some point in the future, some data representing the results of processing that data will become available.
  • There may be more values to process than you can fit in memory, or even an infinite series of values.
  • You are allowed to specify what will be done with each individual value, as and when it becomes available; this includes discarding some values.
  • Whenever you want to use some external service to do something with a value, that service can only return you a promise that, at some point in the future, some data representing the result of processing that value will become available.

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

Read More Read More

The Three ‘R’s of Distributed Event Processing

January 25, 2017 | Cassandra

The Three ‘R’s of Distributed Event Processing

One of the simplest and best-understood models of computation is the Finite State Machine (FSM). An FSM has fixed range of states it can be in, and is always in one of these states. When an input arrives, this triggers a transition in the FSM from its current state to the next state. There may be several possible transitions to several different states, and which transition is chosen depends on the input.

Read More Read More

Cassandra – The Good, The Bad and the Ugly Webinar Recording

October 10, 2016 | Cassandra

Cassandra – The Good, The Bad and the Ugly Webinar Recording

In the culmination of our blog series on the topic, on October 6th 2016 OpenCredo Consultants Dominic Fox, Alla Babkina and Guy Richardson, and hosted by Marco Cullen, presented the common design and implementation issues that they have come across in real-world Apache Cassandra deployments.

Read More Read More

Common Problems with Cassandra Tombstones

September 27, 2016 | Cassandra, Data Engineering

Common Problems with Cassandra Tombstones

If there is one thing to understand about Cassandra, it is the fact that it is optimised for writes. In Cassandra everything is a write including logical deletion of data which results in tombstones – special deletion records. We have noticed that lack of understanding of tombstones is often the root cause of production issues our clients experience with Cassandra. We have decided to share a compilation of the most common problems with Cassandra tombstones and some practical advice on solving them.

Read More Read More

How Not To Use Cassandra Like An RDBMS (and what will happen if you do)

September 15, 2016 | Cassandra

How Not To Use Cassandra Like An RDBMS (and what will happen if you do)

Cassandra isn’t a relational database management system, but it has some features that make it look a bit like one. Chief among these is CQL, a query language with an SQL-like syntax. CQL isn’t a bad thing in itself – in fact it’s very convenient – but it can be misleading since it gives developers the illusion that they are working with a familiar data model, when things are really very different under the hood.

Read More Read More

Patterns of Successful Cassandra Data Modelling

September 6, 2016 | Cassandra

Patterns of Successful Cassandra Data Modelling

A growing number of clients are asking OpenCredo for help with using Apache Cassandra and solving specific problems they encounter. Clients have different use cases, requirements, implementation and teams but experience similar issues. We have noticed that Cassandra data modelling problems are the most consistent cause of Cassandra failing to meet their expectations. Data modelling is one of the most complex areas of using Cassandra and has many considerations.

Read More Read More

Kubernetes from scratch to AWS with Terraform and Ansible (part 1)

August 26, 2016 | Kubernetes

Kubernetes from scratch to AWS with Terraform and Ansible (part 1)

This post is the first of a series of three tutorial articles introducing a sample, tutorial project, demonstrating how to provision Kubernetes on AWS from scratch, using Terraform and Ansible.

Read More Read More

Fulfilling the promise of Apache Cassandra performance

August 24, 2016 | Cassandra

Fulfilling the promise of Apache Cassandra performance

At OpenCredo we are seeing an increase in adoption of Apache Cassandra as a leading NoSQL database for managing large data volumes, but we have also seen many clients experiencing difficulty converting their high expectations into operational Cassandra performance. Here we present a high-level technical overview of the major strengths and limitations of Cassandra that we have observed over the last few years while helping our clients resolve the real-world issues that they have experienced.

Read More Read More

Concursus: Event Sourcing for the Internet of Things

May 10, 2016 | Data Engineering, White Paper

Concursus: Event Sourcing for the Internet of Things

In this technical report, we present Concursus, a framework for developing distributed applications using CQRS and event sourcing patterns within a modern, Java 8-centric, programming model. Following a high-level survey of the trends leading towards the adoption of these patterns, we show how Concursus simplifies the task of programming event sourcing applications by providing a concise, intuitive API to systems composed of event processing middleware.

Read More Read More

The Concursus Programming Model: Kotlin

April 29, 2016 | Software Consultancy

The Concursus Programming Model: Kotlin

In this post, I’ll demonstrate an alternative API which uses some of the advanced language features of the new Kotlin language from Jetbrains. As Kotlin is a JVM-based language, it interoperates seamlessly with Concursus’s Java 8 classes; however, it also offers powerful ways to extend their functionality.

Read More Read More

The Concursus Programming Model: State

April 28, 2016 | Software Consultancy

The Concursus Programming Model: State

In a conventional RDBMS-with-ORM system, we are used to thinking of domain objects as mapped to rows in database tables, and of the database as a repository where the current state of every object exists simultaneously, so that what we get when we query for an object is the state that object was in at the time the query was issued. To perform an update, we can start a transaction, retrieve the current state of the object, modify it, save it back again and commit. Transactions move the global state of the system from one consistent state to another, so that the database transaction log represents a single, linear history of updates. We are therefore able to have a very stable, intuitive sense of what it means to talk about the “current state” of any domain object.

Read More Read More

Test Automation Concepts – Automated email testing

March 29, 2016 | Software Consultancy

Test Automation Concepts – Automated email testing

Raise your test coverage with automated email testing

Acceptance test suites generally are used for UI and API testing, and we have covered both these approaches in our Test Automation Quickstart project. However, an application may, for example, send registration or expiration warning emails. Often, tests related to this are left to manual testing, instead of putting them into an automated test suite.

However, there’s no need to check emails manually: it suffers from all the same problems as other manual testing. It’s slow, expensive, and inconsistent. There are many libraries available to interact with email through code – this post will focus on how to use them within an automated test suite.

Read More Read More

Test Automation Concepts – Parallel test execution

March 14, 2016 | Software Consultancy

Test Automation Concepts – Parallel test execution

Test automation provides fast feedback on regressions. In order to achieve this tests need to execute quickly, something which becomes more of a problem as test suites grow. This is especially true of tests which exercise a user interface where the interaction with the system is slower.

A good way to address this is to have your tests execute in parallel rather than consecutively. Given sufficient resources this allows your execution time to remain low almost indefinitely as more scenarios are added to the suite.

Read More Read More

React/Redux boilerplate

March 3, 2016 | Software Consultancy

React/Redux boilerplate

In this post, I’ll be sharing some React/Redux boilerplate code that Vince Martinez and I have been developing recently. It’s primarily aimed at developers who are familiar with the React ecosystem, so if you are new to React and/or Redux, you might like to have a look at Getting Started with React and Getting Started with Redux.

Read More Read More

Kotlin: a new JVM language you should try

March 3, 2016 | Software Consultancy

Kotlin: a new JVM language you should try

JetBrains (the people behind IntelliJ IDEA) have recently announced  the first RC for version 1.0 of Kotlin, a new programming language for the JVM. I say ‘new’, but Kotlin has been in the making for a few years now, and has been used by JetBrains to develop several of their products, including Intellij IDEA. The company open-sourced Kotlin in 2011, and have worked with the community since then to make the language what it is today.

Read More Read More

Hazelcast and Spring-managed Transactions: A Sample Integration

January 26, 2016 | Data Engineering

Hazelcast and Spring-managed Transactions: A Sample Integration

In this second post about Hazelcast and Spring, I’m integrating Hazelcast and Spring-managed transaction for a specific use case: A transactional Queue. More specifically, I want to make the message polling, of my sample chat application, transactional.

Read More Read More

Akka Typed brings type safety to Akka framework

January 18, 2016 | Software Consultancy

Akka Typed brings type safety to Akka framework

Last time in this series I summarised all the Akka Persistence related improvements in Akka 2.4. Since then Akka 2.4.1 has been released with some additional bug fixes and improvements so perhaps now is a perfect time to pick up this mini-series and introduce some other new features included in Akka 2.4.x.

Read More Read More

The Seven Deadly Sins of Microservices (Redux)

January 8, 2016 | Microservices

The Seven Deadly Sins of Microservices (Redux)

Many of our clients are in the process of investigating or implementing ‘microservices’, and a popular question we often get asked is “what’s the most common mistake you see when moving towards a microservice architecture?”. We’ve seen plenty of good things with this architectural pattern, but we have also seen a few recurring issues and anti-patterns, which I’m keen to share here.

Read More Read More

(Spring) Booting Hazelcast

December 1, 2015 | Software Consultancy

(Spring) Booting Hazelcast

This post introduce some of the basic features of Hazelcast, some of its limitations, how to embed it in a Spring Boot application and write integration testings. This post is intended to be the first of a series about Hazelcast and its integration with Spring (Boot). Let’s start from the basics.

Read More Read More

JavaOne: Debugging Java Applications Running in Docker

November 3, 2015 | Software Consultancy

JavaOne: Debugging Java Applications Running in Docker

My JavaOne experience was rather busy this year, what with three talks presented in a single day! The first of these talks “Debugging Java Apps in Containers: No Heavy Welding Gear Required” was delivered with my regular co-presenter Steve Poole, from IBM, and we shared our combined experiences of working with Java and Docker over the past year.

Read More Read More

Introduction to Akka Streams – Getting started

October 1, 2015 | Data Engineering

Introduction to Akka Streams – Getting started

Going reactive

Akka Streams, the new experimental module under Akka project has been finally released in July after some months of development and several milestone and RC versions. In this series I hope to gently introduce concepts from the library and demonstrate how it can be used to address real-life stream processing challenges.

Akka Streams is an implementation of the Reactive Streams specification on top of Akka toolkit that uses actor based concurrency model. Reactive Streams specification has been created by the number of companies interested in asynchronous, non-blocking, event based data processing that can span across system boundaries and technology stacks.

Read More Read More

The Business Behind Microservices Webinar (Video and Slides)

September 24, 2015 | Microservices

The Business Behind Microservices Webinar (Video and Slides)

Unless you’ve been living under a (COBOL-based) rock for the last few years, you will have no doubt heard of the emerging trend of microservices. This approach to developing ‘loosely coupled service-oriented architecture with bounded contexts’ has captured the hearts and minds of many developers. The promise of easier enforcement of good architectural and design principles, such as encapsulation and interface segregation, combined with the availability to experiment with different languages and platforms for each service, is a (developer) match made in heaven.

Read More Read More

Microservice Platforms: Some Assembly [Still] Required. Part Two

September 20, 2015 | Microservices

Microservice Platforms: Some Assembly [Still] Required. Part Two

Working Locally with Microservices

Over the past five years I have worked within several projects that used a ‘microservice’-based architecture, and one constant issue I have encountered is the absence of standardised patterns for local development and ‘off the shelf’ development tooling that support this. When working with monoliths we have become quite adept at streamlining the development, build, test and deploy cycles. Development tooling to help with these processes is also readily available (and often integrated with our IDEs). For example, many platforms provide ‘hot reloading’ for viewing the effects of code changes in near-real time, automated execution of tests, regular local feedback from continuous integration servers, and tooling to enable the creation of a local environment that mimics the production stack.

Read More Read More

Software Circus: Thinking Fast and Slow with Software Development

September 13, 2015 | DevOps

Software Circus: Thinking Fast and Slow with Software Development

Making Good Decisions within Software

Last week I was privileged to be able to present my “Thinking Fast and Slow with Software Development” talk at the inaugural Software Circus conference in Amsterdam. The conference was amazing, and I’ll write more about this later, but in this post I was keen to share the presentation slides and the thinking behind this talk…

Read More Read More

Microservice Platforms: Some Assembly [Still] Required. Part One

August 26, 2015 | Cloud

Microservice Platforms: Some Assembly [Still] Required. Part One

The challenges of building and deploying microservices

Unless you’ve been living under a rock for the last year, you’ll undoubtedly know that microservices are the new hotness. An emerging trend that I’ve observed is that the people who are actually using microservices in production tend to be the larger well-funded companies, such as Netflix, Gilt, Yelp, Hailo etc., and each organisation has their own way of developing, building and deploying.

Read More Read More

New Tricks With Dynamic Proxies In Java 8 (part 3)

August 18, 2015 | Software Consultancy

New Tricks With Dynamic Proxies In Java 8 (part 3)

In this post, the last in the New Tricks With Dynamic Proxies series (see part 1 and part 2), I’m going to look at using dynamic proxies to create bean-like value objects to represent records. The basic idea here is to have some untyped storage for a collection of property values, such as an array of Objects, and a typed wrapper around that storage which provides a convenient and type-safe access mechanism. A dynamic proxy is used to convert calls on getter and setter methods in the wrapper interface into calls which read and write values in the store.

Read More Read More

The Past, Present and Future of Kubernetes

August 7, 2015 | Kubernetes

The Past, Present and Future of Kubernetes

Learning about the benefits of Kubernetes from the Kismatic Team

As part of my writing for InfoQ, I recently had the pleasure of sitting down and chatting with Joseph Jacks and Patrick Reilly from Kismatic Inc, a company offering enterprise Kubernetes support, and asked about their thoughts on the recent Kubernetes v1.0 launch, the history of the project, and how this container orchestration platform may impact the future of microservice deployment.

Read More Read More

New Tricks with Dynamic Proxies in Java 8 (part 2)

July 14, 2015 | Software Consultancy

New Tricks with Dynamic Proxies in Java 8 (part 2)

Building simple proxies

In the previous post I introduced Java dynamic proxies, and sketched out a way they could be used in testing to simplify the generation of custom Hamcrest matchers. In this post, I’m going to dive into some techniques for implementing proxies in Java 8. We’ll start with a simple case, and build up towards something more complex and full-featured.

Read More Read More

OpenCredo and Container Solutions Partner to Deliver Emerging Technologies
Test Automation Framework – Quick start

November 4, 2014 | Software Consultancy

Test Automation Framework – Quick start

When starting a project, teams often spend their time re-inventing the ‘automated testing wheel’.  While every project has it’s own challenges and every team it’s own needs, many things exist as common requirements of a flexible test automation framework.

This post introduces an effective Java test framework that can be used to quickly get started with test automation on a Java project.

Read More Read More

October 23, 2014 | Cassandra

Spring Data Cassandra Overview

Spring Data Cassandra (SDC) is a community project under the Spring Data (SD) umbrella that provides convenient and familiar APIs to work with Apache Cassandra.

Read More Read More

New features in Cassandra 2.0 – Lightweight Transactions on Update

February 17, 2014 | Cassandra

New features in Cassandra 2.0 – Lightweight Transactions on Update

In our previous posts we gave an overview of Cassandra’s new compare-and-set (lightweight transaction) commands and a more detailed look into the API for using them when inserting new rows into the database.

In this third post, we are going to cover update statements. We recommend reading the previous posts, as there are some details which are the same for inserts and updates which are not repeated here.

Read More Read More

Withstanding the test of time

December 18, 2012 | Software Consultancy

Withstanding the test of time

The first thing most people think of when they start a project with the good intentions of test driven development is: write a test first. That’s great, and something I would fully encourage. However, diving in to writing tests without forethought, especially on large projects with a lot of developers can lead to new problems that TDD is not going to solve. With some upfront thinking (but not big upfront design!) a large team can avoid problems later down the line by considering some important and desirable traits of a large and rapidly changing test suite.

Read More Read More

Esper Extensions – Implementing Custom Aggregation Function

March 21, 2012 | Software Consultancy

Esper Extensions – Implementing Custom Aggregation Function

Event processing Language (EPL) enables us to write complex queries to get the most out of our event stream in real time, using SQL-like syntax.

EPL allows us to use full power of aggregation of the high volume event stream to get required results with the minimal latency. In this blog we are going to explore some aspects of numerical aggregation of data with high precision BigDecimal values. We will also demonstrate how you can add you own aggregation function to Esper engine and use them in EPL statements.

Read More Read More