Open-Source Spotlight - Great Expectations (Data Quality Platform) - James Campbell
Great Expectations Cofounder James Campbell presents a demo of the open-source data quality platform Great Expectations.

00:00 Introducing James Campbell and Great Expectations
02:13 Demo: Notebook setup
03:00 NYC taxi data example - exploratory analysis flow
04:00 Use out-of-band knowledge to build Expectations
05:15 Use the Mostly parameter to turn row-level Expectations to batch-level
05:30 Build Data Docs from Expectations and Validation Results
06:50 Use a Data Assistant to Suggest Expectations based on previous batches
09:20 Run a Checkpoint to validate new batches of data
12:30 Add validation actions in Checkpoint configurations
13:25 Demo: Onboarding Data Assistant creates a more comprehensive picture of the dataset
16:04 Workflow for using Great Expectations in a pipeline
17:03 Great Expectations accesses data from any source during Expectation Validation
17:50 Expectations on Pandas or SQL backends
19:20 Great Expectations Cloud to support configuration management and users
20:27 The team and contributors behind Great Expectations
21:00 Join the Slack community
22:46 Advice: Work on the things you love

Links:
- Great Expectations website: https://greatexpectations.io

MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html