
Use RandomForest to Select Feature Predictors in Regression

Jul 6, 2025

We can use RandomForest to optimize the set of predictors in a regression analysis. Optimizing predictors matters because too many features can lead to missing data, require more records for accuracy, risk overfitting, and increase variability. Too few predictors can lead to underfitting and increased error. RandomForest is one of several models we'll look at; we'll also look at RFECV, exhaustive search, forward selection, and backward elimination.

RandomForest uses a set of decision trees to predict the outcome. Each tree has access to a different subset of the data, and the most popular outcome (or, for regression, the average) is the selected prediction. RandomForest is an ensemble model, since it is a collection of output from other models, in this case decision trees.

Good news! Data do not need to be transformed for a RandomForest analysis. For example, we can use RandomForest on our IRIS dataset, blueberry dataset, and election dataset. RandomForestClassifier is used for categorical modeling; RandomForestRegressor is used for continuous modeling. RandomForest requires that we use a train:test split.

With RandomForest, we can look at the importance of each feature and graph the importances. Next, we can eliminate features that we don't want to consider, including those with cross correlation.
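The workflow above can be sketched with scikit-learn. This is a minimal example, not the exact code from the video: it uses a synthetic regression dataset (`make_regression`) as a stand-in for the blueberry data, fits a `RandomForestRegressor` after a train:test split, and ranks the features by `feature_importances_` so the weakest ones can be dropped. The importance threshold (the mean importance) is an assumption chosen for illustration.

```python
# Sketch: ranking predictors with RandomForestRegressor (synthetic data
# stands in for the blueberry dataset mentioned in the video).
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic data: 10 features, only 4 of which carry real signal.
X, y = make_regression(n_samples=500, n_features=10, n_informative=4,
                       noise=10.0, random_state=42)

# RandomForest requires a train:test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

rf = RandomForestRegressor(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)

# Importance of each feature; these values could be plotted as a bar chart.
importances = rf.feature_importances_
ranked = sorted(enumerate(importances), key=lambda t: t[1], reverse=True)
for idx, imp in ranked:
    print(f"feature_{idx}: {imp:.3f}")

# Keep only features at or above the mean importance (an illustrative
# threshold); the rest are candidates for elimination.
keep = [i for i, imp in enumerate(importances) if imp >= importances.mean()]
print("kept features:", keep)
```

Eliminating cross-correlated features is a separate step: of two highly correlated predictors, the forest often splits importance between them, so it is common to inspect a correlation matrix and drop one of each correlated pair before re-fitting.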
