Understanding machine learning algorithms can be challenging. Here we focus on one specific aspect: how different algorithms handle interaction effects, i.e. situations where the effect of one variable depends on the level of another variable.
We start with simple linear regression and show how to model interaction effects in that framework, then move on to more flexible methods: first GAMs (Generalized Additive Models), then some non-parametric models: KNN (k-nearest neighbors), a single decision tree (using rpart), and random forest / bagging.
Interaction effects are shown graphically. We use the well-known Boston dataset from the MASS package (shipped with R) and focus on simple models using only two predictors: lstat (lower status of the population in percent) and whether a suburb borders the Charles River (chas).
R code is shown using the caret package for cross-validation. Visualizations are created with ggplot2.
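A minimal sketch of the workflow described above, assuming the MASS, caret, and ggplot2 packages are installed: a linear model with an lstat-by-chas interaction term, cross-validated with caret, and an interaction plot with separate regression lines per chas level. (The exact formulas and settings in the video may differ; this is only an illustration.)

```r
library(MASS)    # Boston dataset
library(caret)   # cross-validation via train() / trainControl()
library(ggplot2) # visualization

set.seed(42)
boston <- Boston
# Recode chas as a factor so the interaction is interpretable
boston$chas <- factor(boston$chas, labels = c("No", "Yes"))

# lstat * chas expands to main effects plus their interaction;
# 10-fold cross-validation via caret
fit <- train(medv ~ lstat * chas,
             data = boston,
             method = "lm",
             trControl = trainControl(method = "cv", number = 10))
print(fit)

# Visualize the interaction: one regression line per chas level.
# Diverging slopes indicate an interaction effect.
ggplot(boston, aes(x = lstat, y = medv, color = chas)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "lm", se = FALSE)
```

If the two lines in the plot are roughly parallel, there is little evidence of an interaction; clearly different slopes suggest that the effect of lstat on medv depends on chas.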
Contact me, e.g. to discuss (online) R workshops / trainings / webinars:
LinkedIn: https://www.linkedin.com/in/wolfriepl/
Twitter: https://twitter.com/StatistikInDD
Xing: https://www.xing.com/profile/Wolf_Riepl
Facebook: https://www.facebook.com/statistikdresden/
https://statistik-dresden.de/kontakt
R Workshops: https://statistik-dresden.de/r-schulungen
Blog (German, translate option): https://statistik-dresden.de/statistik-blog
Playlist: Music chart history
https://www.youtube.com/playlist?list=PL4ZUlAlk7QidRlzHEiHX09htXMAbxTpjW