Search results for
"series"

62 items found for "series" in all categories

April 25, 2017 | Cassandra, Data Analysis, Data Engineering

New Blog Series: Spark – The Pragmatic Bits

Apache Spark is a powerful open source processing engine which is fast becoming our technology of choice for data analytics projects here at OpenCredo. For many years now we have been helping our clients to practically implement and take advantage of various big data technologies, including the likes of Apache Cassandra amongst others.

August 26, 2016 | Cassandra

New Blog Series: Cassandra – What You May Learn the Hard Way

At OpenCredo we have been working with Cassandra since 2012 and we are big fans of both open source Apache Cassandra and the capabilities of DataStax Enterprise. Over the years we have collected a great deal of experience throughout the company on how to deliver the benefits of Cassandra in real world projects and have also seen some common pitfalls that businesses have fallen into.

News | October 22, 2012

Webinar series on MongoDB within Financial Services

March 5, 2024 | Blog, Culture, News

The 2023 Mayor’s Business Climate Challenge (BCC) – Final Part

Learn more about our efforts and our progress towards becoming an environmentally friendly company for the Mayor’s Business Climate Challenge (BCC) in 2023 in this final update.

March 9, 2023 | Blog, Data Analysis, Neo4j

Ingesting Big Data into Neo4j – Part 3

Check out the last part of Ebru Cucen and Fahran Wallace’s blog series, in which they discuss their experience ingesting 400 million nodes and a billion relationships into Neo4j and what they discovered along the way.

February 16, 2023 | Blog, Data Analysis, Neo4j

Ingesting Big Data into Neo4j – Part 2

Check out Part 2 of Ebru Cucen and Fahran Wallace’s blog series, in which they discuss their experience ingesting 400 million nodes and a billion relationships into Neo4j and what they discovered along the way.

January 26, 2023 | Blog, Data Analysis, Neo4j

Ingesting Big Data into Neo4j – Part 1

Fahran Wallace and Ebru Cucen’s most recent blog post is part 1 of a three-part series. They investigate how OpenCredo ingested 400 million nodes with a billion relationships into Neo4j.

May 26, 2022

PlatformCon 2022: People, Process, and Platform – a community-focused approach

Our CEO/CTO Nicki Watt shares about her upcoming talk at PlatformCon 2022.

February 17, 2021 | Blog, Cloud, Cloud Native, GCP, Open Source

Anthos – A Holistic Approach to your Hybrid Cloud initiative

Multi-cloud is rapidly becoming the cloud strategy of choice for enterprises looking to modernise their applications.

And the reason is simple – it gives them much more flexibility to host their workloads and data where it suits them best.

In this post, we focus on Google’s application modernisation solution Google Anthos and the role it can play in your cloud transformation strategy.

[Past event] Applied Data Engineering Meet Up #6

Join us for the Applied Data Engineering Meet Up #6 on the 18th of April! For our first meet up of the year we have the CTO of Humio, Kresten Thorup, joining us and speaking about Data Processing and SecOps at Scale.

View All Past Events

January 11, 2018 | Data Engineering

Writing a custom JupyterHub Spawner

The last few years have seen Python emerge as a lingua franca for data scientists. Alongside Python we have also witnessed the rise of Jupyter Notebooks, which are now considered a de facto data science productivity tool, especially in the Python community. Jupyter Notebooks started as a university side-project known as IPython circa 2001 at UC Berkeley.

August 8, 2017 | Cassandra

Riak, the Dynamo paper and life beyond Basho

Recently, the sad news has emerged that Basho, which developed the Riak distributed database, has gone into receivership. This would appear to present a problem for those who have adopted the commercial version of the Riak database (Riak KV) supported by Basho.

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.

May 22, 2017 | Data Analysis

Detecting stolen AWS credential usage with Apache Spark – Webinar Recording

As the final piece of our recent blog series about Apache Spark, on 16 May we presented details of a use case: using Spark Structured Streaming to generate real-time alerts of suspicious activity in an AWS-based infrastructure.


[Past event] Webinar: Detecting stolen AWS credential usage with Apache Spark

Join us as we conclude our recent Apache Spark series with a webinar that will explore the use case of “Detecting stolen AWS credential usage with Spark”


May 9, 2017 | Cassandra

Testing a Spark Application

Data analytics isn’t a field commonly associated with testing, but there’s no reason we can’t treat it like any other application. Data analytics services are often deployed in production, and production services should be properly tested. This post covers some basic approaches for the testing of Cassandra/Spark code. There will be some code examples, but the focus is on how to structure your code to ensure it is testable!


May 2, 2017 | Cassandra, Data Engineering

Deploy Spark with an Apache Cassandra cluster

In my recent blog post I explored a few cases where using Cassandra and Spark together can be useful. My focus was on the functional behaviour of such a stack and what you need to do as a developer to interact with it. However, it did not describe any details about the infrastructure setup that is capable of running such Spark code or any deployment considerations. In this post, I will explore this in more detail and offer some practical advice on how to deploy Spark and Apache Cassandra.

March 23, 2017 | Cassandra, Data Analysis, Data Engineering

Data Analytics using Cassandra and Spark

In recent years, Cassandra has become one of the most widely used NoSQL databases: many of our clients use Cassandra for a variety of different purposes. This is no accident as it is a great datastore with nice scalability and performance characteristics.

However, adopting Cassandra as a single, one size fits all database has several downsides. The partitioned/distributed data storage model makes it difficult (and often very inefficient) to do certain types of queries or data analytics that are much more straightforward in a relational database.

March 23, 2017 | Data Engineering, Machine Learning

Automating Your Security Acceptance Tests

In previous blog posts we have provided examples of different types of acceptance test coverage: UI, API and performance. One area where automation is often lacking is around validating the security of the application under test. This has been discussed in the post on non-functional testing, “You Are Ignoring Non-functional Testing”. With this post we will enhance the automation framework to quickly check for some common security flaws.

[Past event] Voxxed Days Bristol 2017

Join Lorenzo Nicora at Voxxed Days Bristol 2017 for his talk on Event Sourcing and CQRS! Voxxed Days is a series of tech events organised by local community groups where local and international speakers converge at a wide range of locations around the world. This means each event retains a unique regional flavour, whilst being part of the overall Voxxed movement. Topics covered at Voxxed Days fall under the same radar as Voxxed.com, including: Server Side Java, Java Language, Cloud and Big Data, Web & HTML, Mobile, Programming Languages, Architecture & Security, Methodology, Culture and Future Technologies.


January 26, 2017 | Data Engineering

Reactive event processing with Reactor Core: a first look

Suppose you are given the task of writing code that fulfils the following contract:

You will be given a promise that, at some point in the future, some data – a series of values – will become available.
In return, you will supply a promise that, at some point in the future, some data representing the results of processing that data will become available.
There may be more values to process than you can fit in memory, or even an infinite series of values.
You are allowed to specify what will be done with each individual value, as and when it becomes available; this includes discarding some values.
Whenever you want to use some external service to do something with a value, that service can only return you a promise that, at some point in the future, some data representing the result of processing that value will become available.
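The contract above can be sketched in a few lines. This is only an illustration using Python's `asyncio`, not Reactor Core: the names `source`, `external_service` and `process` are hypothetical, an async iterator stands in for the promised series of values, and an awaitable stands in for the promise returned by the external service.

```python
import asyncio
from typing import AsyncIterator


async def source(n: int) -> AsyncIterator[int]:
    """Hypothetical stand-in for the promised series of values."""
    for i in range(n):
        await asyncio.sleep(0)  # values become available over time
        yield i


async def external_service(value: int) -> int:
    """Hypothetical external service: it can only return a promise (awaitable)."""
    await asyncio.sleep(0)
    return value * 10


async def process(values: AsyncIterator[int]) -> AsyncIterator[int]:
    """Handle each value as it arrives, without holding the series in memory."""
    async for value in values:
        if value % 2:  # we are allowed to discard some values (here: odd ones)
            continue
        yield await external_service(value)


async def collect(n: int) -> list:
    """Gather results for a finite series; an infinite one would be consumed lazily."""
    return [result async for result in process(source(n))]


results = asyncio.run(collect(6))
```

Because `process` is itself an async generator, it never needs the whole input at once, which is what makes the "more values than fit in memory" clause workable.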


January 25, 2017 | Cassandra

The Three ‘R’s of Distributed Event Processing

One of the simplest and best-understood models of computation is the Finite State Machine (FSM). An FSM has a fixed range of states it can be in, and is always in one of these states. When an input arrives, this triggers a transition in the FSM from its current state to the next state. There may be several possible transitions to several different states, and which transition is chosen depends on the input.
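As a minimal sketch of that description, an FSM can be written as a lookup table from (current state, input) to next state. The turnstile states and inputs below are hypothetical, chosen only for illustration, and are not taken from the post.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical turnstile: a fixed set of states, with transitions chosen by the input.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("locked", "coin"): "unlocked",
    ("locked", "push"): "locked",
    ("unlocked", "push"): "locked",
    ("unlocked", "coin"): "unlocked",
}


def step(state: str, event: str) -> str:
    """Return the next state for the given current state and input."""
    return TRANSITIONS[(state, event)]


def run(initial: str, events: Iterable[str]) -> str:
    """Feed a sequence of inputs through the machine and return the final state."""
    state = initial
    for event in events:
        state = step(state, event)
    return state


final_state = run("locked", ["coin", "push", "push", "coin"])
```

The machine is always in exactly one state, and the table makes every possible transition explicit; an input with no entry for the current state raises a `KeyError` rather than being silently ignored.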

[Past event] O’Reilly Software Architecture Conference 2016

Join Daniel at O’Reilly’s Software Architecture Conference 2016 for his talk “A Practical Guide for Continuous Delivery with Containers.”


[Past event] O’REILLY: OSCON 2016

Join Daniel Bryant at O’Reilly’s everything-open-source conference, OSCON 2016, for his talk “The Seven (More) Deadly Sins of Microservices.”


October 10, 2016 | Cassandra

Cassandra – The Good, The Bad and the Ugly Webinar Recording

As the culmination of our blog series on the topic, on October 6th 2016 OpenCredo consultants Dominic Fox, Alla Babkina and Guy Richardson, hosted by Marco Cullen, presented the common design and implementation issues that they have come across in real-world Apache Cassandra deployments.

[Past event] Webinar: Cassandra – The Good, the Bad and the Ugly

Hear from OpenCredo’s experts in this live webinar as they work through best practice and common sense approaches to building out a successful Cassandra cluster, gained through experience working with clients in designing and deploying Cassandra across a wide range of business domains. Learn from them how you can make the very best of your cluster in a real world setting. […]


September 27, 2016 | Cassandra, Data Engineering

Common Problems with Cassandra Tombstones

If there is one thing to understand about Cassandra, it is the fact that it is optimised for writes. In Cassandra everything is a write including logical deletion of data which results in tombstones – special deletion records. We have noticed that lack of understanding of tombstones is often the root cause of production issues our clients experience with Cassandra. We have decided to share a compilation of the most common problems with Cassandra tombstones and some practical advice on solving them.

September 15, 2016 | Cassandra

How Not To Use Cassandra Like An RDBMS (and what will happen if you do)

Cassandra isn’t a relational database management system, but it has some features that make it look a bit like one. Chief among these is CQL, a query language with an SQL-like syntax. CQL isn’t a bad thing in itself – in fact it’s very convenient – but it can be misleading since it gives developers the illusion that they are working with a familiar data model, when things are really very different under the hood.

[Past event] ThingMonk 2016

Running from the 12th-14th of September, ThingMonk brings together technologists and designers building core infrastructure for IoT for 2 days of great talks by industry practitioners. Join Tareq and Dominic Fox at ThingMonk 2016 and hear them talk about the event sourcing framework, Concursus!


September 6, 2016 | Cassandra

Patterns of Successful Cassandra Data Modelling

A growing number of clients are asking OpenCredo for help with using Apache Cassandra and solving specific problems they encounter. Clients have different use cases, requirements, implementations and teams but experience similar issues. We have noticed that Cassandra data modelling problems are the most consistent cause of Cassandra failing to meet their expectations. Data modelling is one of the most complex areas of using Cassandra and has many considerations.

August 26, 2016 | Kubernetes

Kubernetes from scratch to AWS with Terraform and Ansible (part 1)

This post is the first in a series of three tutorial articles introducing a sample project that demonstrates how to provision Kubernetes on AWS from scratch, using Terraform and Ansible.

August 24, 2016 | Cassandra

Fulfilling the promise of Apache Cassandra performance

At OpenCredo we are seeing an increase in adoption of Apache Cassandra as a leading NoSQL database for managing large data volumes, but we have also seen many clients experiencing difficulty converting their high expectations into operational Cassandra performance. Here we present a high-level technical overview of the major strengths and limitations of Cassandra that we have observed over the last few years while helping our clients resolve the real-world issues that they have experienced.

May 10, 2016 | Data Engineering, White Paper

Concursus: Event Sourcing for the Internet of Things

In this technical report, we present Concursus, a framework for developing distributed applications using CQRS and event sourcing patterns within a modern, Java 8-centric, programming model. Following a high-level survey of the trends leading towards the adoption of these patterns, we show how Concursus simplifies the task of programming event sourcing applications by providing a concise, intuitive API to systems composed of event processing middleware.

April 29, 2016 | Software Consultancy

The Concursus Programming Model: Kotlin

In this post, I’ll demonstrate an alternative API which uses some of the advanced language features of the new Kotlin language from Jetbrains. As Kotlin is a JVM-based language, it interoperates seamlessly with Concursus’s Java 8 classes; however, it also offers powerful ways to extend their functionality.

April 28, 2016 | Software Consultancy

The Concursus Programming Model: State

In a conventional RDBMS-with-ORM system, we are used to thinking of domain objects as mapped to rows in database tables, and of the database as a repository where the current state of every object exists simultaneously, so that what we get when we query for an object is the state that object was in at the time the query was issued. To perform an update, we can start a transaction, retrieve the current state of the object, modify it, save it back again and commit. Transactions move the global state of the system from one consistent state to another, so that the database transaction log represents a single, linear history of updates. We are therefore able to have a very stable, intuitive sense of what it means to talk about the “current state” of any domain object.

March 29, 2016 | Software Consultancy

Test Automation Concepts – Automated email testing

Raise your test coverage with automated email testing

Acceptance test suites generally are used for UI and API testing, and we have covered both these approaches in our Test Automation Quickstart project. However, an application may, for example, send registration or expiration warning emails. Often, tests related to this are left to manual testing, instead of putting them into an automated test suite.

However, there’s no need to check emails manually: it suffers from all the same problems as other manual testing. It’s slow, expensive, and inconsistent. There are many libraries available to interact with email through code – this post will focus on how to use them within an automated test suite.

March 14, 2016 | Software Consultancy

Test Automation Concepts – Parallel test execution

Test automation provides fast feedback on regressions. In order to achieve this tests need to execute quickly, something which becomes more of a problem as test suites grow. This is especially true of tests which exercise a user interface where the interaction with the system is slower.

A good way to address this is to have your tests execute in parallel rather than consecutively. Given sufficient resources this allows your execution time to remain low almost indefinitely as more scenarios are added to the suite.

March 3, 2016 | Software Consultancy

React/Redux boilerplate

In this post, I’ll be sharing some React/Redux boilerplate code that Vince Martinez and I have been developing recently. It’s primarily aimed at developers who are familiar with the React ecosystem, so if you are new to React and/or Redux, you might like to have a look at Getting Started with React and Getting Started with Redux.

March 3, 2016 | Software Consultancy

Kotlin: a new JVM language you should try

JetBrains (the people behind IntelliJ IDEA) have recently announced the first RC for version 1.0 of Kotlin, a new programming language for the JVM. I say ‘new’, but Kotlin has been in the making for a few years now, and has been used by JetBrains to develop several of their products, including IntelliJ IDEA. The company open-sourced Kotlin in 2011, and have worked with the community since then to make the language what it is today.

January 26, 2016 | Data Engineering

Hazelcast and Spring-managed Transactions: A Sample Integration

In this second post about Hazelcast and Spring, I’m integrating Hazelcast and Spring-managed transactions for a specific use case: a transactional queue. More specifically, I want to make the message polling of my sample chat application transactional.

January 18, 2016 | Software Consultancy

Akka Typed brings type safety to Akka framework

Last time in this series I summarised all the Akka Persistence related improvements in Akka 2.4. Since then Akka 2.4.1 has been released with some additional bug fixes and improvements so perhaps now is a perfect time to pick up this mini-series and introduce some other new features included in Akka 2.4.x.

January 8, 2016 | Microservices

The Seven Deadly Sins of Microservices (Redux)

Many of our clients are in the process of investigating or implementing ‘microservices’, and a popular question we often get asked is “what’s the most common mistake you see when moving towards a microservice architecture?”. We’ve seen plenty of good things with this architectural pattern, but we have also seen a few recurring issues and anti-patterns, which I’m keen to share here.

December 1, 2015 | Software Consultancy

(Spring) Booting Hazelcast

This post introduces some of the basic features of Hazelcast, some of its limitations, how to embed it in a Spring Boot application and how to write integration tests. This post is intended to be the first of a series about Hazelcast and its integration with Spring (Boot). Let’s start from the basics.

November 3, 2015 | Software Consultancy

JavaOne: Debugging Java Applications Running in Docker

My JavaOne experience was rather busy this year, what with three talks presented in a single day! The first of these talks “Debugging Java Apps in Containers: No Heavy Welding Gear Required” was delivered with my regular co-presenter Steve Poole, from IBM, and we shared our combined experiences of working with Java and Docker over the past year.

October 1, 2015 | Data Engineering

Introduction to Akka Streams – Getting started

Going reactive

Akka Streams, the new experimental module under the Akka project, was finally released in July after some months of development and several milestone and RC versions. In this series I hope to gently introduce concepts from the library and demonstrate how it can be used to address real-life stream processing challenges.

Akka Streams is an implementation of the Reactive Streams specification on top of the Akka toolkit, which uses an actor-based concurrency model. The Reactive Streams specification was created by a number of companies interested in asynchronous, non-blocking, event-based data processing that can span system boundaries and technology stacks.

September 24, 2015 | Microservices

The Business Behind Microservices Webinar (Video and Slides)

Unless you’ve been living under a (COBOL-based) rock for the last few years, you will have no doubt heard of the emerging trend of microservices. This approach to developing ‘loosely coupled service-oriented architecture with bounded contexts’ has captured the hearts and minds of many developers. The promise of easier enforcement of good architectural and design principles, such as encapsulation and interface segregation, combined with the ability to experiment with different languages and platforms for each service, is a (developer) match made in heaven.

September 20, 2015 | Microservices

Microservice Platforms: Some Assembly [Still] Required. Part Two

Working Locally with Microservices

Over the past five years I have worked within several projects that used a ‘microservice’-based architecture, and one constant issue I have encountered is the absence of standardised patterns for local development and ‘off the shelf’ development tooling that support this. When working with monoliths we have become quite adept at streamlining the development, build, test and deploy cycles. Development tooling to help with these processes is also readily available (and often integrated with our IDEs). For example, many platforms provide ‘hot reloading’ for viewing the effects of code changes in near-real time, automated execution of tests, regular local feedback from continuous integration servers, and tooling to enable the creation of a local environment that mimics the production stack.

September 13, 2015 | DevOps

Software Circus: Thinking Fast and Slow with Software Development

Making Good Decisions within Software

Last week I was privileged to be able to present my “Thinking Fast and Slow with Software Development” talk at the inaugural Software Circus conference in Amsterdam. The conference was amazing, and I’ll write more about this later, but in this post I was keen to share the presentation slides and the thinking behind this talk…

August 26, 2015 | Cloud

Microservice Platforms: Some Assembly [Still] Required. Part One

The challenges of building and deploying microservices

Unless you’ve been living under a rock for the last year, you’ll undoubtedly know that microservices are the new hotness. An emerging trend that I’ve observed is that the people who are actually using microservices in production tend to be the larger well-funded companies, such as Netflix, Gilt, Yelp, Hailo etc., and each organisation has their own way of developing, building and deploying.

August 18, 2015 | Software Consultancy

New Tricks With Dynamic Proxies In Java 8 (part 3)

In this post, the last in the New Tricks With Dynamic Proxies series (see part 1 and part 2), I’m going to look at using dynamic proxies to create bean-like value objects to represent records. The basic idea here is to have some untyped storage for a collection of property values, such as an array of Objects, and a typed wrapper around that storage which provides a convenient and type-safe access mechanism. A dynamic proxy is used to convert calls on getter and setter methods in the wrapper interface into calls which read and write values in the store.

August 7, 2015 | Kubernetes

The Past, Present and Future of Kubernetes

Learning about the benefits of Kubernetes from the Kismatic Team

As part of my writing for InfoQ, I recently had the pleasure of sitting down and chatting with Joseph Jacks and Patrick Reilly from Kismatic Inc, a company offering enterprise Kubernetes support, and asked about their thoughts on the recent Kubernetes v1.0 launch, the history of the project, and how this container orchestration platform may impact the future of microservice deployment.

July 14, 2015 | Software Consultancy

New Tricks with Dynamic Proxies in Java 8 (part 2)

Building simple proxies

In the previous post I introduced Java dynamic proxies, and sketched out a way they could be used in testing to simplify the generation of custom Hamcrest matchers. In this post, I’m going to dive into some techniques for implementing proxies in Java 8. We’ll start with a simple case, and build up towards something more complex and full-featured.

News | March 30, 2015

OpenCredo and Container Solutions Partner to Deliver Emerging Technologies

November 4, 2014 | Software Consultancy

Test Automation Framework – Quick start

When starting a project, teams often spend their time re-inventing the ‘automated testing wheel’. While every project has its own challenges and every team its own needs, many things exist as common requirements of a flexible test automation framework.

This post introduces an effective Java test framework that can be used to quickly get started with test automation on a Java project.

October 23, 2014 | Cassandra

Spring Data Cassandra Overview

Spring Data Cassandra (SDC) is a community project under the Spring Data (SD) umbrella that provides convenient and familiar APIs to work with Apache Cassandra.

February 17, 2014 | Cassandra

New features in Cassandra 2.0 – Lightweight Transactions on Update

In our previous posts we gave an overview of Cassandra’s new compare-and-set (lightweight transaction) commands and a more detailed look into the API for using them when inserting new rows into the database.

In this third post, we are going to cover update statements. We recommend reading the previous posts, as there are some details which are the same for inserts and updates which are not repeated here.

December 18, 2012 | Software Consultancy

Withstanding the test of time

The first thing most people think of when they start a project with the good intentions of test driven development is: write a test first. That’s great, and something I would fully encourage. However, diving in to writing tests without forethought, especially on large projects with a lot of developers can lead to new problems that TDD is not going to solve. With some upfront thinking (but not big upfront design!) a large team can avoid problems later down the line by considering some important and desirable traits of a large and rapidly changing test suite.

March 21, 2012 | Software Consultancy

Esper Extensions – Implementing Custom Aggregation Function

Event Processing Language (EPL) enables us to write complex queries to get the most out of our event stream in real time, using SQL-like syntax.

EPL allows us to use the full power of aggregation over a high-volume event stream to get the required results with minimal latency. In this blog we are going to explore some aspects of numerical aggregation of data with high-precision BigDecimal values. We will also demonstrate how you can add your own aggregation functions to the Esper engine and use them in EPL statements.
