Open Credo

15 items found: Search results for "data science" in all categories x

Yow! London – Searching for Research Fraud in OpenAlex with Graph Data Science (Recording)

September 19, 2023 | Blog, Data Analysis

Yow! London – Searching for Research Fraud in OpenAlex with Graph Data Science (Recording)

Check out our Lead Consultant Ebru Cucen and Sage Publishing Data Scientist Adam Day co-present on “Research Fraud Detection in OpenAlex with Graph Data Science” at YOW! London 2022.

Read More Read More

Data Science on Steroids: Productionised Machine Learning as a Value Driver for Business

July 31, 2018 | Machine Learning

Data Science on Steroids: Productionised Machine Learning as a Value Driver for Business

Machine Learning, alongside a mature Data Science, will help to bring IT and business closer together. By leveraging data for actionable insights, IT will increasingly drive business value. Agile and DevOps practices enable the continuous delivery of business value through productionised machine learning models and software delivery.

Read More Read More

OpenCredo at Data Science London Meetup
Seeing Through The Modern Data Stack

December 7, 2022 | Blog, Data Analysis

Seeing Through The Modern Data Stack

Learn more about data processing and analytics, which are essential systems in modern enterprises but are frequently overlooked aspects of modern data architectures, by reading Mateus Pimenta’s latest blog.

Read More Read More

Making Sense of Data with RDF* vs. LPG

January 31, 2022 | Blog, Data Engineering

Making Sense of Data with RDF* vs. LPG

There are two camps of Graph database, one side is RDF, where they are strict with their format, and somewhat limited for their extensibility. The other side is LPG, where they can define labels to the relationships. With its recent extension, RDF now allows users to add properties, thus becoming RDF*. In this blog, Ebru explores the structural and performance differences between LPG and RDF*.

Read More Read More

It’s 2020, can I have my ML models now?

April 2, 2020 | Machine Learning

It’s 2020, can I have my ML models now?

Recent years have seen many companies consolidate all their data into a data lake/warehouse of some sort. Once it’s all consolidated, what next?

Many companies consolidate data with a field of dreams mindset – “build it and they will come”, however a comprehensive data strategy is needed if the ultimate goals of an organisation are to be realised: monetisation through Machine Learning and AI is an oft-cited goal. Unfortunately, before one rushes into the enticing world of machine learning, one should lay more mundane foundations. Indeed, in data science, estimates vary between 50% to 80% of the time taken is devoted to so-called data-wrangling. Further, Google estimates ML projects produce 5% ML code and 95% “glue code”. If this is the reality we face, what foundations are required before one can dive headlong into ML?

Read More Read More

Writing a custom JupyterHub Spawner

January 11, 2018 | Data Engineering

Writing a custom JupyterHub Spawner

The last few years have seen Python emerge as a lingua franca for data scientists. Alongside Python we have also witnessed the rise of Jupyter Notebooks, which are now considered a de facto data science productivity tool, especially in the Python community. Jupyter Notebooks started as a university side-project known as iPython in circa 2001 at UC Berkeley.

Read More Read More

The Seven Deadly Sins of Microservices (Redux)

January 8, 2016 | Microservices

The Seven Deadly Sins of Microservices (Redux)

Many of our clients are in the process of investigating or implementing ‘microservices’, and a popular question we often get asked is “what’s the most common mistake you see when moving towards a microservice architecture?”. We’ve seen plenty of good things with this architectural pattern, but we have also seen a few recurring issues and anti-patterns, which I’m keen to share here.

Read More Read More

OpenCredo is now an Amazon Web Services APN Consulting Partner
Machine Learning at scale: first impressions of Kubeflow

April 20, 2021 | Data Engineering, Machine Learning, Software Consultancy

Machine Learning at scale: first impressions of Kubeflow

Our recent client was a Fintech who had ambitions to build a Machine Learning platform for real-time decision making. The client had significant Kubernetes proficiency, ran on the cloud, and had a strong preference for using free, open-source software over cloud-native offerings that come with lock-in. Several components were spiked with success (feature preparation with Apache Beam and Seldon for model serving performed particularly strongly). Kubeflow was one of the next technologies on our list of spikes, showing significant promise at the research stage and seemingly a good match for our client’s priorities and skills.

That platform slipped down the client’s priority list before completing the research for Kubeflow, so I wanted to see how that project might have turned out. Would Kubeflow have made the cut?

 

Read More Read More

What is Continuous Verification?

October 15, 2020

What is Continuous Verification?

Continuous Verification is a term that is starting to pop up from time-to-time… but what does it mean? Well… according to Nora Jones and Casey Rosenthal, authors of O’Reilly’s Chaos Engineering books,

 

“Continuous verification (CV) is a discipline of proactive experimentation, implemented as tooling that verifies system behaviors. This stands in contrast to prior common practices in software quality assurance, which favor reactive testing, implemented as methodologies that validate known properties of software. This isn’t to say that prior common practices are invalid or should be deprecated. Alerting, testing, code reviews, monitoring, SRE practices, and the like—these are all great practices and should be encouraged”

 

Over the course of this post, we will unpack this statement: to understand what is behind it and what it might mean for your development process.

Read More Read More

Automation of complex IT systems

May 14, 2020 | Blog, DevOps

Automation of complex IT systems

At the time of this post, the UK is making steps to exit from an unprecedented lockdown measures for the Coronavirus. Much of the UK workforce are still making efforts to work-from-home with mainly key workers operating – at risk – in public. Many industries have shut down completely. Consequently, many businesses are reflecting on what happens next and how do we better mitigate future pandemic events?

Read More Read More

A Pragmatic Introduction to Machine Learning for DevOps Engineers

January 23, 2018 | Data Engineering, DevOps

A Pragmatic Introduction to Machine Learning for DevOps Engineers

Machine Learning is a hot topic these days, as can be seen from search trends. It was the success of Deepmind and AlphaGo in 2016 that really brought machine learning to the attention of the wider community and the world at large.

Read More Read More

From Java to Go, and Back Again

October 13, 2016 | Data Analysis

From Java to Go, and Back Again

In Lisp, you don’t just write your program down toward the language, you also build the language up toward your program. As you’re writing a program you may think “I wish Lisp had such-and-such an operator.” So you go and write it. Afterward you realize that using the new operator would simplify the design of another part of the program, and so on. Language and program evolve together…In the end your program will look as if the language had been designed for it. And when language and program fit one another well, you end up with code which is clear, small, and efficient – Paul Graham, Programming Bottom-Up

Read More Read More

Model Matters: Graphs, Neo4j and the Future

February 25, 2013 | Neo4j

Model Matters: Graphs, Neo4j and the Future

As part of our work, we often help our customers choose the right datastore for a project. There are usually a number of considerations involved in that process, such as performance, scalability, the expected size of the data set, and the suitability of the data model to the problem at hand.

This blog post is about my experience with graph database technologies, specifically Neo4j. I would like to share some thoughts on when Neo4j is a good fit but also what challenges Neo4j faces now and in the near future.

Read More Read More