18 items found: Search results for "python" in all categories x
August 5, 2015 | Data Analysis, Data Engineering
A few weeks ago, we thought about building a Google analytics dashboard to give us easy access to certain elements of our Google Analytics web traffic. We saw some custom dashboards for bloggers, but nothing quite right for our goal, since we wanted the data on a big screen for everyone in the office to view.
February 16, 2023 | Blog, Data Analysis, Neo4j
Check out Part 2 of Ebru Cucen and Fahran Wallace’s blog series, in which they discuss their experience ingesting 400 million nodes and a billion relationships into Neo4j and what they discovered along the way.
April 20, 2021 | Data Engineering, Machine Learning, Software Consultancy
Our recent client was a Fintech who had ambitions to build a Machine Learning platform for real-time decision making. The client had significant Kubernetes proficiency, ran on the cloud, and had a strong preference for using free, open-source software over cloud-native offerings that come with lock-in. Several components were spiked with success (feature preparation with Apache Beam and Seldon for model serving performed particularly strongly). Kubeflow was one of the next technologies on our list of spikes, showing significant promise at the research stage and seemingly a good match for our client’s priorities and skills.
That platform slipped down the client’s priority list before completing the research for Kubeflow, so I wanted to see how that project might have turned out. Would Kubeflow have made the cut?
September 22, 2020 | AWS, Blog, Cassandra, Cloud, DevOps, Open Source
With the upcoming Cassandra 4.0 release, there is a lot to look forward to. Most excitingly, and following a refreshing realignment of the Open Source community around Cassandra, the next release promises to focus on fundamentals: stability, repair, observability, performance and scaling.
We must set this against the fact that Cassandra ranks pretty highly in the Stack Overflow most dreaded databases list and the reality that Cassandra is expensive to configure, operate and maintain. Finding people who have the prerequisite skills to do so is challenging.
February 14, 2018 | Cloud
AWS Announced a few new products for use with containers at RE:Invent 2017 and of particular interest to me was a new Elastic Container Service(ECS) Launch type, called Fargate
Prior to Fargate, when it came to creating a continuous delivery pipeline in AWS, the use of containers through ECS in its standard form, was the closest you could get to an always up, hands off, managed style of setup. Traditionally ECS has allowed you to create a configured pool of “worker” instances, with it then acting as a scheduler, provisioning containers on those instances.
January 23, 2018 | Data Engineering, DevOps
Machine Learning is a hot topic these days, as can be seen from search trends. It was the success of Deepmind and AlphaGo in 2016 that really brought machine learning to the attention of the wider community and the world at large.
January 11, 2018 | Data Engineering
The last few years have seen Python emerge as a lingua franca for data scientists. Alongside Python we have also witnessed the rise of Jupyter Notebooks, which are now considered a de facto data science productivity tool, especially in the Python community. Jupyter Notebooks started as a university side-project known as iPython in circa 2001 at UC Berkeley.
May 2, 2017 | Cassandra, Data Engineering
My recent blogpost I explored a few cases where using Cassandra and Spark together can be useful. My focus was on the functional behaviour of such a stack and what you need to do as a developer to interact with it. However, it did not describe any details about the infrastructure setup that is capable of running such Spark code or any deployment considerations. In this post, I will explore this in more detail and show some practical advice in how to deploy Spark and Apache Cassandra.
March 23, 2017 | Cassandra, Data Analysis, Data Engineering
In recent years, Cassandra has become one of the most widely used NoSQL databases: many of our clients use Cassandra for a variety of different purposes. This is no accident as it is a great datastore with nice scalability and performance characteristics.
However, adopting Cassandra as a single, one size fits all database has several downsides. The partitioned/distributed data storage model makes it difficult (and often very inefficient) to do certain types of queries or data analytics that are much more straightforward in a relational database.
January 23, 2017 | Data Analysis
More often than not, people who write Go have some sort of opinion on its error handling model. Depending on your experience with other languages, you may be used to different approaches. That’s why I’ve decided to write this article, as despite being relatively opinionated, I think drawing on my experiences can be useful in the debate. The main issues I wanted to cover are that it is difficult to force good error handling practice, that errors don’t have stack traces, and that error handling itself is too verbose.
August 26, 2016 | Kubernetes
This post is the second of a series of three tutorial articles introducing a sample, tutorial project, demonstrating how to provision Kubernetes on AWS from scratch, using Terraform and Ansible. To understand the goal of the project, you’d better start from the first part.
August 26, 2016 | Kubernetes
This post is the first of a series of three tutorial articles introducing a sample, tutorial project, demonstrating how to provision Kubernetes on AWS from scratch, using Terraform and Ansible.
May 31, 2016 | Kubernetes
Do you ever wake up and think to yourself: oh geez, Kubernetes is awesome, but I wish I could browse and edit my services and replication controllers using the file system? No? Well, in any case, this is now possible.
September 20, 2015 | Microservices
Over the past five years I have worked within several projects that used a ‘microservice’-based architecture, and one constant issue I have encountered is the absence of standardised patterns for local development and ‘off the shelf’ development tooling that support this. When working with monoliths we have become quite adept at streamlining the development, build, test and deploy cycles. Development tooling to help with these processes is also readily available (and often integrated with our IDEs). For example, many platforms provide ‘hot reloading’ for viewing the effects of code changes in near-real time, automated execution of tests, regular local feedback from continuous integration servers, and tooling to enable the creation of a local environment that mimics the production stack.
September 14, 2015 | DevOps, Hashicorp, Open Source
Recently I was working on a project that was using SaltStack for configuration management and Consul for service discovery. It occurred to me that using Consul’s key/value store would be great place to store data needed for my Salt runs, but unfortunately Consul was not supported in SaltStack as an official data store at that point in time. Being an open source project however, this provided an excellent opportunity to contribute back and this blog post looks to provide some details on how this works, as well as a practical demo on how you can take advantage of Consul as an external data store.
August 5, 2015 | Cloud, GCP, Kubernetes
Why OpenCredo partnered with Google
Recently OpenCredo chose to partner with Google in order to share knowledge and resources around the Google Cloud Platform offerings. Our clients come in many shapes and sizes, but typically all of them realise three disruptive truths of the modern IT industry: the (economic) value of cloud; the competitive advantage of continuous delivery; and the potential of hypothesis and data-driven product development to increase innovation (as popularised by the Lean Startup / Lean Enterprise motto of ‘build, measure, learn’).
July 28, 2015 | Microservices
Recently I co-presented a talk at Goto Amsterdam on lessons learnt whilst developing with a Microservices architecture; one being the importance of defining and documenting your API contracts as early as possible in the development cycle. During the talk I mentioned a few API documentation tools that I’d used and, based on feedback and questions from attendees, I realised that this topic merited a blog post. So, the purpose of this is to introduce 5 tools which help with designing, testing and documenting APIs.
January 10, 2013 | DevOps
Recently I have started looking into SaltStack as a solution that does both config management and orchestration. It is a relatively new project started in 2011, but it has a growing fanbase among Sys Admins and DevOps Engineers. In this blog post I will look into Salt as a promising alternative, and comparing it to Puppet as a way of exploring its basic set of features.