Open Credo

May 26, 2022 | Data Analysis, Data Engineering

Devoxx UK 2022 – Tracing Your Data’s DNA

As data becomes ubiquitous and deeply interconnected, tracing where that data comes from, and who or which system produced it (its lineage), will create bigger problems and opportunities for us on the horizon. Watch the recording of James Bowkett’s talk from Devoxx UK, ‘Tracing Your Data’s DNA’.

WRITTEN BY

James Bowkett

Lead Consultant

Watch the recording of James Bowkett’s talk ‘Tracing Your Data’s DNA’ from Devoxx UK 2022.

Tracing Your Data’s DNA – James Bowkett, Technical Delivery Director

As data becomes ubiquitous and deeply interconnected, tracing where that data comes from, and who or which system produced it (its lineage), will create bigger problems and opportunities for us on the horizon:

  • How can we trust this document or row of data? What is its lineage? Where did it come from?
  • If there is a problem with a piece of data, how do we recalculate and publish just the affected data and not the entire dataset?
  • How can we apply modern engineering practices – such as blue-green deployments – to our data estate and data pipelines?
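
The second question above, recalculating only the affected data, can be illustrated with a small sketch. It assumes lineage is stored as a directed graph of record identifiers in which an edge means "this record was derived from that one". The `LineageGraph` class, its method names and the record identifiers are all hypothetical and are not taken from the talk.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Hypothetical in-memory lineage graph keyed by record identifier."""

    def __init__(self):
        self._downstream = defaultdict(set)  # source id -> ids derived from it
        self._upstream = defaultdict(set)    # derived id -> ids it was built from

    def add_derivation(self, source_id, derived_id):
        """Record that derived_id was computed from source_id."""
        self._downstream[source_id].add(derived_id)
        self._upstream[derived_id].add(source_id)

    def _walk(self, edges, start):
        # Breadth-first traversal that collects everything reachable from start.
        seen = set()
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in edges.get(current, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return seen

    def affected_by(self, record_id):
        """Every record transitively derived from record_id."""
        return self._walk(self._downstream, record_id)

    def sources_of(self, record_id):
        """Every record that record_id transitively depends on."""
        return self._walk(self._upstream, record_id)


graph = LineageGraph()
graph.add_derivation("orders:17", "daily_totals:2022-05-26")
graph.add_derivation("orders:18", "daily_totals:2022-05-26")
graph.add_derivation("daily_totals:2022-05-26", "monthly_report:2022-05")
graph.add_derivation("orders:99", "daily_totals:2022-06-01")

# If orders:17 turns out to be wrong, only these records need recomputing.
print(sorted(graph.affected_by("orders:17")))
# ['daily_totals:2022-05-26', 'monthly_report:2022-05']
```

The same graph answers the first question in the other direction: `graph.sources_of("monthly_report:2022-05")` returns the rows the report was built from.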

Furthermore, as our data estate becomes ever more business-critical, it will be important to secure that data from its source system all the way through the estate, using techniques such as field- and row-level security (also known as cell-based security).
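
As a rough illustration of combining field-level and row-level security, the sketch below filters rows with a predicate and then strips any field the caller may not see. `AccessPolicy`, `apply_policy` and the sample data are hypothetical names chosen for this example. A real data platform would normally enforce such rules in the storage or query layer, not in application code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, Iterable, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy: which rows and which fields a caller may read."""
    row_filter: Callable[[Row], bool]   # row-level rule
    visible_fields: FrozenSet[str]      # field-level rule


def apply_policy(rows: Iterable[Row], policy: AccessPolicy) -> List[Row]:
    """Keep only permitted rows, then keep only permitted fields in each row."""
    return [
        {name: value for name, value in row.items() if name in policy.visible_fields}
        for row in rows
        if policy.row_filter(row)
    ]


customers = [
    {"id": 1, "region": "UK", "email": "a@example.com"},
    {"id": 2, "region": "DE", "email": "b@example.com"},
]

# Assumed rule: a UK analyst sees UK rows only, and never the email field.
uk_analyst = AccessPolicy(
    row_filter=lambda row: row["region"] == "UK",
    visible_fields=frozenset({"id", "region"}),
)

print(apply_policy(customers, uk_analyst))
# [{'id': 1, 'region': 'UK'}]
```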

In his talk, James uses live demos and coding examples to explore techniques for building the data lineage graph of individual rows or documents using Change Data Capture (CDC) in source systems.
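
The demos from the talk are not reproduced here. As a minimal sketch of the general idea, assume each CDC event carries its source system, table, key and position in the change log. Every derived record can then carry the set of change events it was built from, and that set propagates through later pipeline stages. The `ChangeEvent` shape, `DerivedRecord` and `derive` are hypothetical and not tied to any particular CDC tool.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, Union


@dataclass
class ChangeEvent:
    """Hypothetical CDC event emitted when a row changes in a source system."""
    source: str            # originating system
    table: str
    key: str               # primary key of the changed row
    sequence: int          # position in the source's change log
    after: Dict[str, Any]  # row state after the change

    @property
    def ref(self) -> str:
        # Stable reference to this exact version of the row.
        return f"{self.source}/{self.table}/{self.key}@{self.sequence}"


@dataclass
class DerivedRecord:
    """A pipeline output together with the change events it came from."""
    value: Dict[str, Any]
    lineage: FrozenSet[str]


def source_refs(item: Union[ChangeEvent, DerivedRecord]) -> FrozenSet[str]:
    """Lineage contributed by one input, whether raw event or derived record."""
    if isinstance(item, ChangeEvent):
        return frozenset({item.ref})
    return item.lineage


def derive(value: Dict[str, Any], *inputs: Union[ChangeEvent, DerivedRecord]) -> DerivedRecord:
    """Build a derived record whose lineage is the union of its inputs' lineage."""
    lineage = frozenset().union(*(source_refs(i) for i in inputs))
    return DerivedRecord(value=value, lineage=lineage)


order = ChangeEvent("shop-db", "orders", "17", 1042, {"customer_id": "c1", "total": 30})
customer = ChangeEvent("crm", "customers", "c1", 88, {"name": "Ada"})

enriched = derive({"customer": "Ada", "total": 30}, order, customer)
report = derive({"revenue": 30}, enriched)

print(sorted(report.lineage))
# ['crm/customers/c1@88', 'shop-db/orders/17@1042']
```

Because the references include the change-log position, the lineage identifies which version of each source row fed into the report, not just which row.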

 

This blog is written exclusively by the OpenCredo team. We do not accept external contributions.
