😱 Faced a REAL Databricks Data Engineering interview question!
In this video, we solve a **classic but tricky scenario-based interview problem**:
👉 Find employees who logged in for **3 consecutive days** using:
✅ Databricks SQL
✅ PySpark (Spark SQL / DataFrame API)
🔥 What you’ll learn:
• How interviewers think while asking consecutive days problems
• Window functions (ROW_NUMBER, DATEDIFF) explained simply
• SQL vs PySpark approach comparison
• Common mistakes candidates make in Databricks interviews
• How to optimize logic for real-world data engineering scenarios
This problem is frequently asked for:
🔹 Databricks Data Engineer
🔹 Azure / AWS Data Engineer
🔹 PySpark Developer
🔹 Big Data & Analytics roles
If you're preparing for **Databricks, Big 4, FAANG, Accenture, Genpact** interviews — this video is a must-watch.
📌 Interview Tip: Logic matters more than syntax!
🔔 Subscribe to **The Data Engineer Edge** for:
• Real interview questions
• Scenario-based SQL & PySpark problems
• Databricks-focused interview prep
• Data Engineering career growth (2026–2035)
#Databricks #DataEngineering #SQLInterview #PySpark #RealInterviewQuestion
💬 Comment **“CONSECUTIVE”** for more content like this!
PySpark code to create the sample table:
from pyspark.sql.functions import col, to_date, row_number, expr
from pyspark.sql.window import Window

# Sample login data: (emp_id, login_date) with dates as M/d/yyyy strings
data = [
    (1, '1/20/2025'),
    (1, '1/21/2025'),
    (1, '1/22/2025'),
    (1, '1/24/2025'),
    (2, '1/15/2025'),
    (2, '1/16/2025'),
    (2, '1/18/2025'),
    (3, '1/15/2025'),
    (3, '1/16/2025'),
    (3, '1/17/2025'),
    (3, '1/18/2025')
]
columns = ["emp_id", "login_date"]
df = spark.createDataFrame(data, columns)

# Cast the string column to a real date so window ordering and date
# arithmetic behave correctly
df = df.withColumn("login_date", to_date(col("login_date"), "M/d/yyyy"))
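The core "gaps-and-islands" trick the video builds on (subtract a per-employee ROW_NUMBER-style offset from each login date, so a run of consecutive days collapses to one constant group key) can be sketched in plain Python against the same sample data. The function name and `min_days` parameter here are illustrative, not from the video:

```python
from datetime import date, timedelta
from itertools import groupby

# Mirrors the sample table above: (emp_id, login_date)
logins = [
    (1, date(2025, 1, 20)), (1, date(2025, 1, 21)),
    (1, date(2025, 1, 22)), (1, date(2025, 1, 24)),
    (2, date(2025, 1, 15)), (2, date(2025, 1, 16)),
    (2, date(2025, 1, 18)),
    (3, date(2025, 1, 15)), (3, date(2025, 1, 16)),
    (3, date(2025, 1, 17)), (3, date(2025, 1, 18)),
]

def employees_with_streak(rows, min_days=3):
    """Return emp_ids that logged in on at least `min_days` consecutive days."""
    result = set()
    rows = sorted(set(rows))  # dedupe, then order by (emp_id, login_date)
    for emp_id, grp in groupby(rows, key=lambda r: r[0]):
        dates = [d for _, d in grp]
        # Gaps-and-islands: date minus its 0-based position is constant
        # within a run of consecutive days.
        keys = [d - timedelta(days=i) for i, d in enumerate(dates)]
        for _, island in groupby(keys):
            if len(list(island)) >= min_days:
                result.add(emp_id)
    return sorted(result)

print(employees_with_streak(logins))  # → [1, 3]
```

Emp 1 qualifies via Jan 20–22, emp 3 via Jan 15–18, while emp 2's gap on Jan 17 breaks the streak — the same answer the ROW_NUMBER + DATEDIFF approach produces in SQL or PySpark.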