Data Engineering #mockinterview | Myntra | Part 2
Data Engineering Mock Interview - Part 2

This is the next part of the data engineering mock interview between Kuldeep Pal and Vipul, a senior software engineer at Myntra. The full interview runs over 2 hours and is released in multiple parts. See the previous video for the previous part of this interview.

The interview covers a wide range of data engineering topics, including designing and implementing scalable, high-performance batch-processing architectures and working with popular tools and platforms such as Kafka, Apache Spark, Airflow, and AWS. Whether you are starting your career in data engineering or looking to enhance your skills, this interview is a valuable resource for gaining insight into real-time data processing and data engineering.

To book a mock interview - https://topmate.io/ankur_ranjan/15155

You can also follow the interviewer and the interviewee on LinkedIn:
Kuldeep (Interviewer) - https://www.linkedin.com/in/kuldeep27396/
Vipul (Interviewee) - https://www.linkedin.com/in/vipul-singhal-21a831125/

The interview covers various topics, including the architecture of #Spark, the differences between RDD, DataFrame, and Dataset, Spark optimization, shuffling in Spark, and more. It also includes examples of wide and narrow transformations and the reasons for having groupByKey, reduceByKey, and sortByKey. The interview ends with a scenario-based question.

This interview is a must-watch if you're interested in data engineering, big data, or a career switch. Watch the next video for the remaining content.
Join me on Social Media:
LinkedIn - https://www.linkedin.com/in/thebigdatashow
Instagram - https://www.instagram.com/ranjan_anku/

Chapters:
00:00 - DSA Problem 1
11:07 - DSA Problem 2
19:16 - SQL Problem 1
21:40 - Data modeling and OLTP vs. OLAP. Which databases have you used for OLTP and OLAP?
24:20 - Design WhatsApp / Instagram from scratch
48:07 - Kafka utilization in the project
48:56 - There are many message queues on the market, such as RabbitMQ, Kinesis, and Kafka. On what basis did you choose Kafka for your events?
50:48 - Architecture of Kafka
54:19 - Retention mechanism in Kafka: what types of retention are available?
55:44 - Role of ZooKeeper in Kafka
57:14 - Can Spark use ZooKeeper or not?
57:50 - Spark has resource management and memory management. How are they handled?
59:06 - Can we use Kubernetes as a resource manager?
59:19 - SQL Problem 2
1:11:10 - Deduplicate the data without deleting any data

#interview #dataengineering #bigdata #apachespark #careerswitch #job #mockinterview