In this week's #TidyTuesday video, I go over the Palmer Penguins dataset. I then briefly describe each of the #TidyModels packages and their main functions. I use the rsample package to split the data into training and test sets and to create validation folds with k-fold cross-validation. I preprocess the data with the recipes package, including KNN imputation of missing values. I then specify two models with the parsnip package: an XGBoost model and a k-nearest neighbors model. After that, I use the dials package to create a grid of hyperparameters for each model and the tune package to tune them. Finally, I use the workflows package to train the models and the yardstick package to evaluate them. After modeling, I build an R Shiny app with shinydashboard and deploy the trained model for real-time predictions.
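For readers who want to see the shape of that pipeline before watching, here is a minimal sketch in R. It is not the code from the video (that is in the GitHub link below). The outcome (`species`), the split proportion, the number of folds, the grid ranges, and all object names are my assumptions for illustration. To keep it short, only the KNN workflow is tuned here; the XGBoost spec is defined and would be tuned with the same `tune_grid()` call.

```r
library(tidymodels)      # attaches rsample, recipes, parsnip, dials, tune, workflows, yardstick
library(palmerpenguins)  # provides the `penguins` data frame

set.seed(123)

# rsample: train/test split plus cross-validation folds on the training set.
# Assumption: species is the outcome; 80/20 split; 5 folds.
penguin_split <- initial_split(penguins, prop = 0.8, strata = species)
penguin_train <- training(penguin_split)
penguin_folds <- vfold_cv(penguin_train, v = 5, strata = species)

# recipes: impute missing predictors with KNN, scale numerics, dummy-code factors.
penguin_rec <- recipe(species ~ ., data = penguin_train) %>%
  step_impute_knn(all_predictors()) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors())

# parsnip: two model specifications with hyperparameters marked for tuning.
knn_spec <- nearest_neighbor(neighbors = tune()) %>%
  set_engine("kknn") %>%
  set_mode("classification")

xgb_spec <- boost_tree(trees = 200, tree_depth = tune(), learn_rate = tune()) %>%
  set_engine("xgboost") %>%
  set_mode("classification")

# dials: regular grids of candidate hyperparameter values (ranges are assumptions).
knn_grid <- grid_regular(neighbors(range = c(1, 15)), levels = 5)
xgb_grid <- grid_regular(tree_depth(), learn_rate(), levels = 3)

# workflows: bundle the recipe with each model.
knn_wf <- workflow() %>% add_recipe(penguin_rec) %>% add_model(knn_spec)
xgb_wf <- workflow() %>% add_recipe(penguin_rec) %>% add_model(xgb_spec)

# tune + yardstick: evaluate every KNN grid point on the folds by accuracy.
# The XGBoost workflow would be tuned the same way with xgb_wf and xgb_grid.
knn_res <- tune_grid(
  knn_wf,
  resamples = penguin_folds,
  grid = knn_grid,
  metrics = metric_set(accuracy)
)

# Refit the best KNN configuration on the training set and score it on the test set.
best_knn <- select_best(knn_res, metric = "accuracy")
final_fit <- finalize_workflow(knn_wf, best_knn) %>%
  last_fit(penguin_split, metrics = metric_set(accuracy))

collect_metrics(final_fit)
```

Running this requires the tidymodels, palmerpenguins, and kknn packages (plus xgboost if you go on to tune `xgb_wf`). The fitted workflow inside `final_fit` can be pulled out with `extract_workflow()` and saved for use in a Shiny app.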
Connect with me on LinkedIn: https://www.linkedin.com/in/andrew-couch/
Q&A Submission Form: https://forms.gle/6EzU4GCR9VnJx8gg7
Code for this video: https://github.com/andrew-couch/Tidy-Tuesday/tree/master/Season%201/Apps/TidyTuesdayPenguinApp
TidyTuesday: https://github.com/rfordatascience/tidytuesday