This tutorial explores Autoloader in Databricks. I demonstrate how to build a simple Autoloader pipeline and explain the schema evolution options that Autoloader provides.
Chapters:
00:00 - Introduction
03:06 - Overview of Autoloader
10:19 - Demo: Building Your First Autoloader Pipeline
14:03 - Schema Evolution overview
19:15 - Exploring 'failOnNewColumns' option
20:23 - Exploring 'rescue' option
22:52 - Schema Inference overview
23:45 - Inferring data types
25:03 - Using schema hints
25:34 - File discovery and notification options
27:49 - Using directory mode
28:10 - File notification mode
32:45 - Other Autoloader configurations
Please take a look at this video to learn the basics of Spark Structured Streaming: https://youtu.be/hpjsWfPjJyI
Please subscribe: https://www.youtube.com/channel/UC8d958MxE2t1dr27QNqoOhA
Download the demo and exercise notebooks here:
https://github.com/fazizov/youtube/blob/main/Data%20engineering%20with%20Databricks/Autoloader/Autoloader.dbc