Apache Spark - Difference between DataSet, DataFrame and RDD

7.1K views
Jul 17, 2020
15:54

In this video, I explore the three sets of APIs available in Apache Spark 2.2 and beyond: RDDs, DataFrames, and Datasets. I explain why and when you should use each, outline their performance and optimization benefits, and enumerate the scenarios in which to use DataFrames and Datasets instead of RDDs.

