Data Engineering Mock Interview | Myntra | Part 1
This is a summary of a data engineering mock interview in which Kuldeep Pal interviews Vipul, a senior software engineer at Myntra. The full interview runs over 2 hours and is released in multiple parts. It covers a wide range of data engineering topics, including designing and implementing scalable, high-performance batch-processing architectures and working with popular tools and platforms such as Kafka, Apache Spark, Airflow, and AWS. Whether you're starting your career in data engineering or looking to enhance your skills, this interview is a valuable resource for insights into the real-time data processing and engineering field.

To book a mock interview: https://topmate.io/ankur_ranjan/15155

Follow the interviewer and the interviewee on LinkedIn:
Kuldeep (Interviewer) - https://www.linkedin.com/in/kuldeep27396/
Vipul (Interviewee) - https://www.linkedin.com/in/vipul-singhal-21a831125/

This part covers the architecture of #Spark, the difference between RDD, DataFrame, and Dataset, Spark optimization, shuffling in Spark, and more. It also includes examples of wide and narrow transformations and the reasons for having groupByKey, reduceByKey, and sortByKey, followed by a scenario-based Spark question and a DSA question. If you're interested in data engineering, big data, or a career switch, this interview is a must-watch. Watch the next video for the remaining content.

Join me on Social Media:
LinkedIn - https://www.linkedin.com/in/thebigdatashow
Instagram - https://www.instagram.com/ranjan_anku/

Chapters:
00:00 - Introduction
01:45 - Architecture of Spark
03:35 - How is Spark an in-memory computing engine?
06:15 - What is the difference between RDD, DataFrame, and Dataset, and why do we use them?
07:28 - What are cache and persist, and what is the difference between them?
09:30 - Spark Optimization
12:23 - Shuffling in Spark
14:18 - Examples of Wide and Narrow Transformations
15:00 - Why do we have groupByKey, reduceByKey, and sortByKey?
15:50 - Spark Scenario-Based Question - 1
30:59 - Trade-off between Spark SQL and DataFrame
32:41 - DSA Question - 1

#interview #dataengineering #bigdata #apachespark #careerswitch #job #mockinterview