Data Engineering #mockinterview | Myntra | Part 2

Name: Data Engineering #mockinterview | Myntra | Part 2
Uploaded: Premiered Jan 12, 2024
Duration: 5096 s

The Big Data Show (119K subscribers)

17.7K views


1:24:56

Data Engineering Mock Interview - Part 2

This is the next part of the data engineering mock interview between Kuldeep Pal and Vipul, a senior software engineer at Myntra. The full interview is over 2 hours long and is released in multiple parts. Please check the previous video for the earlier part of this interview.

The interview covers a wide range of data engineering topics, including designing and implementing scalable, high-performance batch-processing architectures and working with popular data processing tools such as Kafka, Apache Spark, Airflow, and AWS. Whether you are starting your career in data engineering or looking to enhance your skills, this interview is a valuable resource for gaining insight into real-time data processing and the data engineering field.

If you're interested in booking a mock interview, you can visit the URL below.
🔅 To book a mock interview - https://topmate.io/ankur_ranjan/15155

You can also follow the interviewer and the interviewee on LinkedIn using the links below.
🔅 Kuldeep (Interviewer) - https://www.linkedin.com/in/kuldeep27396/
🔅 Vipul (Interviewee) - https://www.linkedin.com/in/vipul-singhal-21a831125/

The interview covers various topics, including the architecture of #Spark, the difference between RDD, DataFrame, and Dataset, Spark optimization, shuffling in Spark, and more. It also includes examples of wide and narrow transformations and the reasons for having groupByKey, reduceByKey, and sortByKey. The interview ends with a scenario-based question.

This interview is a must-watch if you're interested in data engineering, big data, or a career switch. Do watch the next video for the remaining content.

𝗝𝗼𝗶𝗻 𝗺𝗲 𝗼𝗻 𝗦𝗼𝗰𝗶𝗮𝗹 𝗠𝗲𝗱𝗶𝗮:
🔅 LinkedIn - https://www.linkedin.com/in/thebigdatashow
🔅 Instagram - https://www.instagram.com/ranjan_anku/

Chapters:
00:00 - DSA Problem 1
11:07 - DSA Problem 2
19:16 - SQL Problem 1
21:40 - Data Modeling and OLTP vs. OLAP. Which databases have you used under OLTP and OLAP?
24:20 - Design WhatsApp / Instagram from scratch
48:07 - Kafka Utilization in the Project
48:56 - There are lots of message queues in the market, such as RabbitMQ, Kinesis, Kafka, and so on. On what basis did you choose Kafka for your events?
50:48 - Architecture of Kafka
54:19 - Retention mechanism in Kafka, and how many types of retention can we configure?
55:44 - Role of ZooKeeper in Kafka
57:14 - Can Spark use ZooKeeper or not?
57:50 - In Spark, we have resource management and memory management. How are they handled?
59:06 - Can we use Kubernetes as a resource manager?
59:19 - SQL Problem 2
1:11:10 - Deduplicate the data without deleting any data.

#interview #dataengineering #bigdata #apachespark #careerswitch #job #mockinterview
