Data Set Encapsulation for Data Lake (Data Mesh)

165 views
Feb 5, 2021
17:00

In this video I talk about how you should consider structuring your data lake. Data lake structure is often described as layers (Bronze, Silver, Gold or Raw, Standardised, Modelled, Curated), but I find it more useful to treat data sets as encapsulated things which stand alone. Just because you have two raw data sets, there is no reason they would have the same requirements, so don't bind yourself to a structure which will limit choice without good reason. This encapsulation is particularly useful when doing DataOps, Agile and CI/CD with your data lake platform. The ideas discussed here align well with the data mesh concept explained at https://martinfowler.com/articles/data-monolith-to-mesh.html

0:00 - Introduction
1:29 - Why encapsulation?
6:16 - Datasets
6:37 - Data Contracts
8:13 - Performance
8:46 - Availability
9:27 - Compliance
10:29 - Lifecycle Management
11:12 - Data Transformations
12:52 - Access Control
14:12 - Transitional Datasets and layers
15:15 - Recap on why we encapsulate
16:47 - Wrap-up

For all of my other demos, go to https://davedoesdemos.com or go straight to the GitHub page at https://github.com/davedoesdemos/DemoIndex/blob/master/README.md. Also please subscribe to the channel to make sure the latest demos show up in your playlist!

