Data Analytics Computing: Train Test Split and Overfitting

69 views
Mar 19, 2023
32:28

In this video, we will go over a relevant topic in predictive modeling and machine learning: overfitting, as well as a solution for it, the Train Test Split.

Dataset used for this video: https://www.kaggle.com/datasets/mirzahasnine/loan-data-set?resource=download
Code used in this video: https://github.com/Aurelius2500/Train-Test-Split/blob/main/Train%20Test%20Split%20Computing.py
A similar coverage of how we estimate models and then evaluate them on different data can be found in An Introduction to Statistical Learning, Chapter 2: https://www.statlearning.com/
Basics of reading data into Python: https://www.w3schools.com/python/pandas/pandas_csv.asp
How to install Anaconda and Spyder on your computer: https://docs.anaconda.com/anaconda/install/index.html

Chapters:
0:00 Introduction and Data
5:15 Fitting a Model on All Data
10:00 The (Random) Train Test Split
14:45 Using a predefined Train Test Split
17:16 How the Train Test Split Affects the Model
22:50 Implications for More Complex Models and Overfitting
26:40 Train Test Split for Classification
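The idea behind the train test split can be sketched in a few lines of plain Python. This is not the code from the linked repository: it uses synthetic data instead of the loan dataset, and every name in it (`train_test_split`, `fit_line`, `fit_memorizer`, `mse`) is a hypothetical helper written for this illustration. It fits a simple line and a model that memorizes the training points, then compares their errors on data the models did and did not see.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(train):
    """Least-squares fit of y = a + b * x; returns a prediction function."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def fit_memorizer(train):
    """Overly flexible model: predict the y of the closest training x."""
    def predict(x):
        return min(train, key=lambda point: abs(point[0] - x))[1]
    return predict


def mse(model, rows):
    """Mean squared error of model on rows of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


# Synthetic data (an assumption for this sketch): a line plus random noise.
noise = random.Random(42)
data = [(i / 10, 2.0 * (i / 10) + 1.0 + noise.gauss(0, 3.0)) for i in range(200)]

train, test = train_test_split(data, test_fraction=0.25, seed=1)
line = fit_line(train)
memorizer = fit_memorizer(train)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name, "train MSE:", mse(model, train), "test MSE:", mse(model, test))
```

The memorizer reproduces every training point exactly, so its training error is zero, yet it still makes errors on the held-out rows. That gap between training and test error is the symptom of overfitting that the split is meant to expose, and it would stay hidden if the model were only evaluated on the data it was fitted on.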