February 1, 2023 | Blog, Data Analysis, Neo4j
Watch the recording of James Bowkett’s talk, ‘Tracing Your Data’s DNA’, from NODES 2022 – Neo4j Online Developer Education Summit 2022.
As data becomes ubiquitous and deeply interconnected, tracing its lineage – where it came from, and which person or system produced it – will create bigger problems and opportunities for us on the horizon.
Furthermore, as our data estate becomes ever more business-critical, it will be important to secure that data from its source system all the way through the estate, using techniques such as field- and row-level security (also known as cell-based security).
In this talk, James uses live demos and coding examples to explore techniques for building the data lineage graph of individual rows or documents using Change Data Capture (CDC) in source systems. He stores the lineage graph in a graph database to start with, then explores how other types of database could be used instead. The result is a lineage catalogue that can be queried for all manner of use cases, such as incremental data batch operations, blue-green deployments and “cell-based security” of data fields.
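To make the idea of a row-level lineage graph more concrete, here is a minimal in-memory sketch in Python. It is not the code from the talk: the `ChangeEvent` shape, the `derived_from` field, the system and table names, and the `pii` label are all hypothetical. In particular, it assumes that whichever pipeline writes a row also records which rows it was built from, which a CDC feed on its own would not necessarily tell you. A graph database would take the place of the dictionary used here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical, simplified change event for one row."""
    system: str                 # system that emitted the change
    table: str                  # table (or collection) holding the row
    key: str                    # primary key of the changed row
    derived_from: tuple = ()    # assumed: ids of the rows this row was built from


def node_id(system, table, key):
    """Build a stable identifier for one row in one system."""
    return f"{system}.{table}:{key}"


class LineageGraph:
    """Stores 'row X was derived from row Y' edges, one node per row."""

    def __init__(self):
        self.parents = defaultdict(set)  # child id -> ids of its direct sources

    def apply(self, event):
        """Record one change event and return the id of the row it describes."""
        child = node_id(event.system, event.table, event.key)
        self.parents[child].update(event.derived_from)
        return child

    def upstream(self, node):
        """Return every row that `node` transitively derives from."""
        seen, stack = set(), [node]
        while stack:
            for parent in self.parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


def effective_labels(graph, node, labels):
    """A row inherits the security labels of itself and all of its sources."""
    result = set(labels.get(node, ()))
    for source in graph.upstream(node):
        result |= labels.get(source, set())
    return result


graph = LineageGraph()
crm = graph.apply(ChangeEvent("crm", "customers", "42"))
billing = graph.apply(ChangeEvent("billing", "invoices", "9001"))
report = graph.apply(
    ChangeEvent("warehouse", "revenue_by_customer", "42", (crm, billing))
)
dashboard = graph.apply(ChangeEvent("bi", "dashboard_rows", "42", (report,)))

labels = {crm: {"pii"}}  # only the CRM row is labelled at source

print(sorted(graph.upstream(dashboard)))
print(effective_labels(graph, dashboard, labels))
```

The same `upstream` traversal serves more than one of the use cases mentioned above: an incremental batch job can use it to find which derived rows are affected by a change, and a security layer can use `effective_labels` to decide whether a given reader may see a derived row or field.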
This blog is written exclusively by the OpenCredo team. We do not accept external contributions.