This conceptual video walks through key Databricks data engineering concepts for getting data into Databricks with Lakeflow Connect: fully managed connectors for common SaaS apps (such as Meta Ads, Salesforce, and ServiceNow) and databases (such as MySQL, MS SQL Server, and PostgreSQL), and the more code-based standard connectors (e.g., SFTP and Apache Kafka). It also covers manually uploading raw files to a Databricks volume before creating downstream objects. The episode then reviews table and view types (managed tables, streaming tables, views, and materialized views) and points to the Spark Declarative Pipelines guidance on when to use streaming tables, materialized views, and temporary views. Finally, it contrasts building ETL pipelines with Lakeflow Spark Declarative Pipelines (SDP) versus standard Apache Spark and recaps the Lakeflow components: Connect for ingestion, SDP for transformations, and Jobs for orchestration.
Lakeflow SDP best practices:
https://docs.databricks.com/aws/en/ldp/best-practices
#databricks
00:00 Intro and Roadmap
01:06 Lakeflow Connect Overview
01:52 Manual File Uploads
02:07 Fully Managed Connectors
02:46 Standard Connectors
03:07 Tables and Views Explained
04:42 SDP Dataset Guidance
05:49 ETL Pipelines SDP vs Spark
07:30 Lakeflow Parts Recap
07:48 Wrap Up and Next Videos