This video covers 4 popular methods for treating unbalanced data in a classification dataset. For the sake of simplicity, only binary classification is considered.
The breakdown of this video is as follows:
Introduction 00:00
Why correct for unbalanced data 01:45
Binary classification 03:58
Describing 4 techniques for unbalanced data 04:26
Notebook setup 08:42
Oversampling experiment 13:38
Undersampling experiment 16:31
Synthetic data experiment 19:23
Class weights experiment 23:43
Conclusions 25:32
The best way to keep up to date with my video and blog content is to sign up for my monthly newsletter! Please visit https://insidelearningmachines.com/newsletter/ to register.
You can download the notebook used here from my GitHub: https://github.com/insidelearningmachines/Blog/blob/main/Notebook%20XXXVIII%204%20techniques%20for%20unbalanced%20data.ipynb
This video is based on an article on my blog. You can find that article here: https://insidelearningmachines.com/unbalanced_data/
The homepage of my blog is: https://insidelearningmachines.com
Other social media includes:
Twitter: https://twitter.com/inside_machines
Facebook: https://www.facebook.com/Inside-Learning-Machines-112215488183517
#machinelearning #datascience #classification #insidelearningmachines