49 items found: Search results for "series" in all categories x
April 25, 2017 | Cassandra, Data Analysis, Data Engineering
Apache Spark is a powerful open source processing engine which is fast becoming our technology of choice for data analytic projects here at OpenCredo. For many years now we have been helping our clients to practically implement and take advantage of various big data technologies including the like of Apache Cassandra amongst others.
August 26, 2016 | Cassandra
At OpenCredo we have been working with Cassandra since 2012 and we are big fans of both open source Apache Cassandra and the capabilities of DataStax Enterprise. Over the years we have collected a great deal of experience throughout the company on how to deliver the benefits of Cassandra in real world projects and have also seen some common pitfalls that businesses have fallen into.
March 9, 2023 | Blog, Data Analysis, Neo4j
Check out the last part of Ebru Cucen and Fahran Wallace’s blog series, in which they discuss their experience ingesting 400 million nodes and a billion relationships into Neo4j and what they discovered along the way.
February 16, 2023 | Blog, Data Analysis, Neo4j
Check out Part 2 of Ebru Cucen and Fahran Wallace’s blog series, in which they discuss their experience ingesting 400 million nodes and a billion relationships into Neo4j and what they discovered along the way.
January 26, 2023 | Blog, Data Analysis, Neo4j
Fahran Wallace and Ebru Cucen’s most recent blog post is part 1 of a three-part series. They investigate how OpenCredo ingested 400 million nodes with a billion relationships into Neo4j.
May 26, 2022
Our CEO/CTO Nicki Watt shares about her upcoming talk at PlatformCon 2022.
February 17, 2021 | Blog, Cloud, Cloud Native, GCP, Open Source
Multi-cloud is rapidly becoming the cloud strategy of choice for enterprises looking to modernise their applications.
And the reason is simple – it gives them much more flexibility to host their workloads and data where it suits them best.
In this post, we focus on Google’s application modernisation solution Google Anthos and the role it can play in your cloud transformation strategy.
January 11, 2018 | Data Engineering
The last few years have seen Python emerge as a lingua franca for data scientists. Alongside Python we have also witnessed the rise of Jupyter Notebooks, which are now considered a de facto data science productivity tool, especially in the Python community. Jupyter Notebooks started as a university side-project known as iPython in circa 2001 at UC Berkeley.
August 8, 2017 | Cassandra
Recently, the sad news has emerged that Basho, which developed the Riak distributed database, has gone into receivership. This would appear to present a problem for those who have adopted the commercial version of the Riak database (Riak KV) supported by Basho.
This blog is written exclusively by the OpenCredo team. We do not accept external contributions.
May 22, 2017 | Data Analysis
As a final piece of our recent blog series about Apache Spark on 16 May we have presented details of a use-case about using Spark Structured Streaming to generate real-time alerts of suspicious activity in an AWS-based infrastructure.
This blog is written exclusively by the OpenCredo team. We do not accept external contributions.
May 9, 2017 | Cassandra
Data analytics isn’t a field commonly associated with testing, but there’s no reason we can’t treat it like any other application. Data analytics services are often deployed in production, and production services should be properly tested. This post covers some basic approaches for the testing of Cassandra/Spark code. There will be some code examples, but the focus is on how to structure your code to ensure it is testable!
This blog is written exclusively by the OpenCredo team. We do not accept external contributions.
May 2, 2017 | Cassandra, Data Engineering
My recent blogpost I explored a few cases where using Cassandra and Spark together can be useful. My focus was on the functional behaviour of such a stack and what you need to do as a developer to interact with it. However, it did not describe any details about the infrastructure setup that is capable of running such Spark code or any deployment considerations. In this post, I will explore this in more detail and show some practical advice in how to deploy Spark and Apache Cassandra.
March 23, 2017 | Cassandra, Data Analysis, Data Engineering
In recent years, Cassandra has become one of the most widely used NoSQL databases: many of our clients use Cassandra for a variety of different purposes. This is no accident as it is a great datastore with nice scalability and performance characteristics.
However, adopting Cassandra as a single, one size fits all database has several downsides. The partitioned/distributed data storage model makes it difficult (and often very inefficient) to do certain types of queries or data analytics that are much more straightforward in a relational database.
March 23, 2017 | Data Engineering, Machine Learning
On previous blog posts we have provided examples of different types of acceptance tests coverage, UI, API and Performance. One area where automation is often lacking is around validating the security of the application under test. This has been discussed in the post on non functional testing You Are Ignoring Non-functional Testing. With this post we will enhance the automation framework to quickly check for some common security flaws.
January 26, 2017 | Data Engineering
Suppose you are given the task of writing code that fulfils the following contract:
This blog is written exclusively by the OpenCredo team. We do not accept external contributions.
January 25, 2017 | Cassandra
One of the simplest and best-understood models of computation is the Finite State Machine (FSM). An FSM has fixed range of states it can be in, and is always in one of these states. When an input arrives, this triggers a transition in the FSM from its current state to the next state. There may be several possible transitions to several different states, and which transition is chosen depends on the input.
October 10, 2016 | Cassandra
In the culmination of our blog series on the topic, on October 6th 2016 OpenCredo Consultants Dominic Fox, Alla Babkina and Guy Richardson, and hosted by Marco Cullen, presented the common design and implementation issues that they have come across in real-world Apache Cassandra deployments.
September 27, 2016 | Cassandra, Data Engineering
If there is one thing to understand about Cassandra, it is the fact that it is optimised for writes. In Cassandra everything is a write including logical deletion of data which results in tombstones – special deletion records. We have noticed that lack of understanding of tombstones is often the root cause of production issues our clients experience with Cassandra. We have decided to share a compilation of the most common problems with Cassandra tombstones and some practical advice on solving them.
September 15, 2016 | Cassandra
Cassandra isn’t a relational database management system, but it has some features that make it look a bit like one. Chief among these is CQL, a query language with an SQL-like syntax. CQL isn’t a bad thing in itself – in fact it’s very convenient – but it can be misleading since it gives developers the illusion that they are working with a familiar data model, when things are really very different under the hood.
September 6, 2016 | Cassandra
A growing number of clients are asking OpenCredo for help with using Apache Cassandra and solving specific problems they encounter. Clients have different use cases, requirements, implementation and teams but experience similar issues. We have noticed that Cassandra data modelling problems are the most consistent cause of Cassandra failing to meet their expectations. Data modelling is one of the most complex areas of using Cassandra and has many considerations.
August 26, 2016 | Kubernetes
This post is the first of a series of three tutorial articles introducing a sample, tutorial project, demonstrating how to provision Kubernetes on AWS from scratch, using Terraform and Ansible.
August 24, 2016 | Cassandra
At OpenCredo we are seeing an increase in adoption of Apache Cassandra as a leading NoSQL database for managing large data volumes, but we have also seen many clients experiencing difficulty converting their high expectations into operational Cassandra performance. Here we present a high-level technical overview of the major strengths and limitations of Cassandra that we have observed over the last few years while helping our clients resolve the real-world issues that they have experienced.
May 10, 2016 | Data Engineering, White Paper
In this technical report, we present Concursus, a framework for developing distributed applications using CQRS and event sourcing patterns within a modern, Java 8-centric, programming model. Following a high-level survey of the trends leading towards the adoption of these patterns, we show how Concursus simplifies the task of programming event sourcing applications by providing a concise, intuitive API to systems composed of event processing middleware.
April 29, 2016 | Software Consultancy
In this post, I’ll demonstrate an alternative API which uses some of the advanced language features of the new Kotlin language from Jetbrains. As Kotlin is a JVM-based language, it interoperates seamlessly with Concursus’s Java 8 classes; however, it also offers powerful ways to extend their functionality.
April 28, 2016 | Software Consultancy
In a conventional RDBMS-with-ORM system, we are used to thinking of domain objects as mapped to rows in database tables, and of the database as a repository where the current state of every object exists simultaneously, so that what we get when we query for an object is the state that object was in at the time the query was issued. To perform an update, we can start a transaction, retrieve the current state of the object, modify it, save it back again and commit. Transactions move the global state of the system from one consistent state to another, so that the database transaction log represents a single, linear history of updates. We are therefore able to have a very stable, intuitive sense of what it means to talk about the “current state” of any domain object.
March 29, 2016 | Software Consultancy
Acceptance test suites generally are used for UI and API testing, and we have covered both these approaches in our Test Automation Quickstart project. However, an application may, for example, send registration or expiration warning emails. Often, tests related to this are left to manual testing, instead of putting them into an automated test suite.
However, there’s no need to check emails manually: it suffers from all the same problems as other manual testing. It’s slow, expensive, and inconsistent. There are many libraries available to interact with email through code – this post will focus on how to use them within an automated test suite.
March 14, 2016 | Software Consultancy
Test automation provides fast feedback on regressions. In order to achieve this tests need to execute quickly, something which becomes more of a problem as test suites grow. This is especially true of tests which exercise a user interface where the interaction with the system is slower.
A good way to address this is to have your tests execute in parallel rather than consecutively. Given sufficient resources this allows your execution time to remain low almost indefinitely as more scenarios are added to the suite.
March 3, 2016 | Software Consultancy
In this post, I’ll be sharing some React/Redux boilerplate code that Vince Martinez and I have been developing recently. It’s primarily aimed at developers who are familiar with the React ecosystem, so if you are new to React and/or Redux, you might like to have a look at Getting Started with React and Getting Started with Redux.
March 3, 2016 | Software Consultancy
JetBrains (the people behind IntelliJ IDEA) have recently announced the first RC for version 1.0 of Kotlin, a new programming language for the JVM. I say ‘new’, but Kotlin has been in the making for a few years now, and has been used by JetBrains to develop several of their products, including Intellij IDEA. The company open-sourced Kotlin in 2011, and have worked with the community since then to make the language what it is today.
January 26, 2016 | Data Engineering
In this second post about Hazelcast and Spring, I’m integrating Hazelcast and Spring-managed transaction for a specific use case: A transactional Queue. More specifically, I want to make the message polling, of my sample chat application, transactional.
January 18, 2016 | Software Consultancy
Last time in this series I summarised all the Akka Persistence related improvements in Akka 2.4. Since then Akka 2.4.1 has been released with some additional bug fixes and improvements so perhaps now is a perfect time to pick up this mini-series and introduce some other new features included in Akka 2.4.x.
January 8, 2016 | Microservices
Many of our clients are in the process of investigating or implementing ‘microservices’, and a popular question we often get asked is “what’s the most common mistake you see when moving towards a microservice architecture?”. We’ve seen plenty of good things with this architectural pattern, but we have also seen a few recurring issues and anti-patterns, which I’m keen to share here.
December 1, 2015 | Software Consultancy
This post introduce some of the basic features of Hazelcast, some of its limitations, how to embed it in a Spring Boot application and write integration testings. This post is intended to be the first of a series about Hazelcast and its integration with Spring (Boot). Let’s start from the basics.
November 3, 2015 | Software Consultancy
My JavaOne experience was rather busy this year, what with three talks presented in a single day! The first of these talks “Debugging Java Apps in Containers: No Heavy Welding Gear Required” was delivered with my regular co-presenter Steve Poole, from IBM, and we shared our combined experiences of working with Java and Docker over the past year.
October 1, 2015 | Data Engineering
Akka Streams, the new experimental module under Akka project has been finally released in July after some months of development and several milestone and RC versions. In this series I hope to gently introduce concepts from the library and demonstrate how it can be used to address real-life stream processing challenges.
Akka Streams is an implementation of the Reactive Streams specification on top of Akka toolkit that uses actor based concurrency model. Reactive Streams specification has been created by the number of companies interested in asynchronous, non-blocking, event based data processing that can span across system boundaries and technology stacks.
September 24, 2015 | Microservices
Unless you’ve been living under a (COBOL-based) rock for the last few years, you will have no doubt heard of the emerging trend of microservices. This approach to developing ‘loosely coupled service-oriented architecture with bounded contexts’ has captured the hearts and minds of many developers. The promise of easier enforcement of good architectural and design principles, such as encapsulation and interface segregation, combined with the availability to experiment with different languages and platforms for each service, is a (developer) match made in heaven.
September 20, 2015 | Microservices
Over the past five years I have worked within several projects that used a ‘microservice’-based architecture, and one constant issue I have encountered is the absence of standardised patterns for local development and ‘off the shelf’ development tooling that support this. When working with monoliths we have become quite adept at streamlining the development, build, test and deploy cycles. Development tooling to help with these processes is also readily available (and often integrated with our IDEs). For example, many platforms provide ‘hot reloading’ for viewing the effects of code changes in near-real time, automated execution of tests, regular local feedback from continuous integration servers, and tooling to enable the creation of a local environment that mimics the production stack.
September 13, 2015 | DevOps
Last week I was privileged to be able to present my “Thinking Fast and Slow with Software Development” talk at the inaugural Software Circus conference in Amsterdam. The conference was amazing, and I’ll write more about this later, but in this post I was keen to share the presentation slides and the thinking behind this talk…
August 26, 2015 | Cloud
Unless you’ve been living under a rock for the last year, you’ll undoubtedly know that microservices are the new hotness. An emerging trend that I’ve observed is that the people who are actually using microservices in production tend to be the larger well-funded companies, such as Netflix, Gilt, Yelp, Hailo etc., and each organisation has their own way of developing, building and deploying.
August 18, 2015 | Software Consultancy
In this post, the last in the New Tricks With Dynamic Proxies series (see part 1 and part 2), I’m going to look at using dynamic proxies to create bean-like value objects to represent records. The basic idea here is to have some untyped storage for a collection of property values, such as an array of Object
s, and a typed wrapper around that storage which provides a convenient and type-safe access mechanism. A dynamic proxy is used to convert calls on getter and setter methods in the wrapper interface into calls which read and write values in the store.
August 7, 2015 | Kubernetes
Learning about the benefits of Kubernetes from the Kismatic Team
As part of my writing for InfoQ, I recently had the pleasure of sitting down and chatting with Joseph Jacks and Patrick Reilly from Kismatic Inc, a company offering enterprise Kubernetes support, and asked about their thoughts on the recent Kubernetes v1.0 launch, the history of the project, and how this container orchestration platform may impact the future of microservice deployment.
July 14, 2015 | Software Consultancy
Building simple proxies
In the previous post I introduced Java dynamic proxies, and sketched out a way they could be used in testing to simplify the generation of custom Hamcrest matchers. In this post, I’m going to dive into some techniques for implementing proxies in Java 8. We’ll start with a simple case, and build up towards something more complex and full-featured.
November 4, 2014 | Software Consultancy
When starting a project, teams often spend their time re-inventing the ‘automated testing wheel’. While every project has it’s own challenges and every team it’s own needs, many things exist as common requirements of a flexible test automation framework.
This post introduces an effective Java test framework that can be used to quickly get started with test automation on a Java project.
October 23, 2014 | Cassandra
Spring Data Cassandra (SDC) is a community project under the Spring Data (SD) umbrella that provides convenient and familiar APIs to work with Apache Cassandra.
February 17, 2014 | Cassandra
In our previous posts we gave an overview of Cassandra’s new compare-and-set (lightweight transaction) commands and a more detailed look into the API for using them when inserting new rows into the database.
In this third post, we are going to cover update statements. We recommend reading the previous posts, as there are some details which are the same for inserts and updates which are not repeated here.
December 18, 2012 | Software Consultancy
The first thing most people think of when they start a project with the good intentions of test driven development is: write a test first. That’s great, and something I would fully encourage. However, diving in to writing tests without forethought, especially on large projects with a lot of developers can lead to new problems that TDD is not going to solve. With some upfront thinking (but not big upfront design!) a large team can avoid problems later down the line by considering some important and desirable traits of a large and rapidly changing test suite.
March 21, 2012 | Software Consultancy
Event processing Language (EPL) enables us to write complex queries to get the most out of our event stream in real time, using SQL-like syntax.
EPL allows us to use full power of aggregation of the high volume event stream to get required results with the minimal latency. In this blog we are going to explore some aspects of numerical aggregation of data with high precision BigDecimal values. We will also demonstrate how you can add you own aggregation function to Esper engine and use them in EPL statements.