Optimizing Regression Predictors with RFECV (Recursive Feature Elimination with Cross-Validation)
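RFECV stands for Recursive Feature Elimination with Cross-Validation. It starts with all features and then recursively eliminates the least relevant ones.
Plain recursive feature elimination stops when a specified number of features is reached; RFECV instead uses cross-validation to score each candidate feature count and keeps the best-scoring one.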
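After fitting, we can read the results from the RFECV object (below, RFECV stands for the fitted instance):
RFECV.n_features_: The optimal number of features.
RFECV.support_: A Boolean array marking the selected features. This makes it easy to keep only the selected columns: features = X[:, RFECV.support_] for a NumPy array, or features = df.loc[:, RFECV.support_] for a pandas DataFrame.
y_predicted = RFECV.predict(X): Use the fitted model, restricted to the selected features, to predict values.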
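The attributes above can be tried out with a minimal sketch like the following. The synthetic dataset from make_regression, the LinearRegression estimator, and the variable names are assumptions chosen for illustration, not part of the original material.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption for illustration): 10 features, 3 informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=0.1, random_state=0)

# Remove one feature per step and score each feature count with 5-fold CV.
selector = RFECV(estimator=LinearRegression(), step=1, cv=5)
selector.fit(X, y)

n_selected = selector.n_features_   # number of features kept
mask = selector.support_            # Boolean array, one entry per column of X
X_selected = X[:, mask]             # keep only the selected columns
y_predicted = selector.predict(X)   # predicts using the selected features only
```

Note that predict takes the full feature matrix; the selector applies the mask internally before calling the underlying estimator.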
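Cross-validation: we train a model on one subset of the data and then test it on another subset. We often use k-fold cross-validation. We partition the data into k equal subsets (folds), train the model on k-1 of them, and then test on the remaining one. We repeat this k times, so that each fold serves as the test set exactly once, and average the scores. This gives a more reliable estimate of performance on unseen data and makes overfitting easier to detect.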
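The k-fold procedure can be written out by hand, as in this rough sketch. The random data, the choice of k = 5, and the LinearRegression model are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Synthetic data (assumption for illustration): a linear target with small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in kfold.split(X):
    # Train on k-1 folds, then score (R^2) on the single held-out fold.
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

mean_score = float(np.mean(scores))  # average over the k held-out folds
```

RFECV runs a loop of this kind internally for every candidate number of features, which is why its cv parameter accepts either an integer or a splitter such as KFold.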
We can tell RFECV which scoring metric to use by passing it in the scoring parameter. The scikit-learn documentation (scikit-learn.org) lists the available scoring metrics.
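As a small sketch of passing a metric, the example below uses the "neg_mean_squared_error" scoring string, which suits regression. The dataset and estimator are again assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption for illustration): 8 features, 3 informative.
X, y = make_regression(n_samples=150, n_features=8, n_informative=3,
                       noise=1.0, random_state=1)

# scikit-learn scorers follow "higher is better", so the MSE is negated.
selector = RFECV(estimator=LinearRegression(), cv=5,
                 scoring="neg_mean_squared_error")
selector.fit(X, y)

n_selected = selector.n_features_
```

If scoring is left unset, RFECV falls back to the estimator's own score method, which is R^2 for scikit-learn regressors.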
We need to scale the data to properly perform an RFECV analysis. We can use StandardScaler for this.
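One likely reason scaling matters: with a linear estimator, features are ranked by the size of their coefficients, and those sizes depend on each feature's units. The sketch below standardizes the features before fitting RFECV. The data, the column scales, and the estimator are assumptions for illustration.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption for illustration): columns on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([1.0, 100.0, 0.01, 10.0])
y = 3.0 * X[:, 0] + 0.05 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Standardize each column to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Run the feature elimination on the scaled data.
selector = RFECV(estimator=LinearRegression(), cv=5)
selector.fit(X_scaled, y)
X_selected = X_scaled[:, selector.support_]
```

New data must be passed through scaler.transform before calling selector.predict, so that it is on the same scale as the training data.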