Data Engineering Mock Interview | Myntra | Part 1

Name: Data Engineering Mock Interview | Myntra | Part 1
Uploaded: Premiered Jan 11, 2024
Duration: 2045 s

The Big Data Show, 119K subscribers

7.6K views

Data Engineering Mock Interview - Part 1

This is a summary of a data engineering mock interview between Kuldeep Pal and Vipul, a senior software engineer at Myntra. The full interview runs over 2 hours and is released in multiple parts. It covers a wide range of data engineering topics, including designing and implementing scalable, high-performance batch-processing architectures and working with widely used tools such as Kafka, Apache Spark, Airflow, and AWS. Whether you're starting your career in data engineering or looking to enhance your skills, this interview is a valuable resource for gaining insight into the real-time data processing and engineering field.

If you're interested in booking a mock interview, you can visit the URL below.
🔅 To book a mock interview - https://topmate.io/ankur_ranjan/15155

You can also follow the interviewer and the interviewee on LinkedIn:
🔅 Kuldeep (Interviewer) - https://www.linkedin.com/in/kuldeep27396/
🔅 Vipul (Interviewee) - https://www.linkedin.com/in/vipul-singhal-21a831125/

This part covers the architecture of #Spark, the difference between RDD, DataFrame, and Dataset, Spark optimization, shuffling in Spark, and more. It also includes examples of wide and narrow transformations and the reasons behind having groupByKey, reduceByKey, and sortByKey. It closes with a scenario-based question, a discussion of the trade-off between Spark SQL and DataFrame, and a DSA question.

If you're interested in data engineering, big data, or a career switch, this interview is a must-watch. Don't miss out on the secrets of success in this exciting and dynamic field! Watch the next video for the remaining content.

Join me on social media:
🔅 LinkedIn - https://www.linkedin.com/in/thebigdatashow
🔅 Instagram - https://www.instagram.com/ranjan_anku/

Chapters:
00:00 - Introduction
01:45 - Architecture of Spark
03:35 - How is Spark an in-memory computing engine?
06:15 - What is the difference between RDD, DataFrame, and Dataset, and why do we use them?
07:28 - What are cache and persist, and what is the difference between them?
09:30 - Spark Optimization
12:23 - Shuffling in Spark
14:18 - Examples of Wide and Narrow Transformations
15:00 - Why do we have groupByKey, reduceByKey, and sortByKey?
15:50 - Spark Scenario-Based Question - 1
30:59 - Trade-off between Spark SQL and DataFrame
32:41 - DSA Question - 1

#interview #dataengineering #bigdata #apachespark #careerswitch #job #mockinterview
