Autodiscovery of Predictors in SPM
Autodiscovery leverages the stability of multi-tree ensembles to rank variables by importance and thereby select a subset of predictors for modeling. In SPM® v8.2 and earlier, Autodiscovery runs a very simple TreeNet model, fit on the training data only and grown to 200 trees. The variable importance ranking from this background model is then used to reduce the full list of available predictors to those that rank highest. Autodiscovery is fast and easy because there are no control parameters to set, but it is only a mechanism for quickly testing whether a substantial reduction in the number of predictors can improve model performance.
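The rank-then-select idea can be sketched in a few lines of Python. This is an illustration only, not SPM's implementation: plain gradient boosting with one-split trees (stumps) stands in for TreeNet, and the function names, the learning rate, the decile split points, the synthetic data, and the choice to keep two predictors are all assumptions made for the example. Only the 200-tree, training-data-only setup comes from the description above.

```python
import random

def candidate_cuts(X, n_cuts=9):
    """Pre-compute a few split points per predictor (roughly the deciles)."""
    n = len(X)
    cuts = []
    for j in range(len(X[0])):
        col = sorted(row[j] for row in X)
        cuts.append([col[c * n // (n_cuts + 1)] for c in range(1, n_cuts + 1)])
    return cuts

def best_stump(X, resid, cuts):
    """Find the single split that most reduces squared error on the residuals."""
    n, total = len(resid), sum(resid)
    best = (0.0, None, None, 0.0, 0.0)  # gain, predictor, cut, left mean, right mean
    for j, feature_cuts in enumerate(cuts):
        for t in feature_cuts:
            left = [r for row, r in zip(X, resid) if row[j] < t]
            n_left, n_right = len(left), n - len(left)
            if n_left == 0 or n_right == 0:
                continue
            s_left = sum(left)
            s_right = total - s_left
            gain = s_left**2 / n_left + s_right**2 / n_right - total**2 / n
            if gain > best[0]:
                best = (gain, j, t, s_left / n_left, s_right / n_right)
    return best

def rank_by_boosting(X, y, n_trees=200, learn_rate=0.1):
    """Boost stumps on the training data; importance = summed split gain."""
    cuts = candidate_cuts(X)
    pred = [sum(y) / len(y)] * len(y)
    importance = [0.0] * len(X[0])
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        gain, j, t, left_mean, right_mean = best_stump(X, resid, cuts)
        if j is None:  # no split improves the fit any further
            break
        importance[j] += gain
        pred = [pi + learn_rate * (left_mean if row[j] < t else right_mean)
                for row, pi in zip(X, pred)]
    return importance

def autodiscover(names, importance, keep):
    """Keep the `keep` predictors with the highest importance."""
    ranked = sorted(zip(names, importance), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:keep]]

# Synthetic training data (assumed): only x0 and x1 drive the target.
rng = random.Random(0)
names = [f"x{j}" for j in range(6)]
X = [[rng.random() for _ in names] for _ in range(120)]
y = [3 * row[0] + 2 * row[1] + rng.gauss(0, 0.1) for row in X]

importance = rank_by_boosting(X, y)
selected = autodiscover(names, importance, keep=2)
print(selected)
```

A final model would then be built on `selected` only and compared against the model that uses all predictors.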
In most serious modeling projects we would supplement Autodiscovery with more intensive variable selection mechanisms, such as those built into AUTOMATE SHAVING, where the cycle of model, rank, select, and model again is repeated, possibly a very large number of times.
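A minimal sketch of such a repeated cycle follows, assuming one common variant in which the least important predictor is dropped after each fit and every candidate subset is scored on held-out data. It does not reproduce AUTOMATE SHAVING itself. The helper names (`solve`, `fit_and_rank`, `shave`, `make_data`) are hypothetical, and an ordinary least-squares fit with absolute coefficients as importance (reasonable here only because all synthetic predictors share one scale) stands in for the real model.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_and_rank(names, train, test):
    """Stand-in model: least squares with intercept. Returns (test MSE, importance)."""
    X_tr, y_tr = train
    rows = [[1.0] + [X_tr[n][i] for n in names] for i in range(len(y_tr))]
    k = len(names) + 1
    A = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    g = [sum(r[a] * yi for r, yi in zip(rows, y_tr)) for a in range(k)]
    coef = solve(A, g)
    X_te, y_te = test
    errors = [yi - coef[0] - sum(c * X_te[n][i] for c, n in zip(coef[1:], names))
              for i, yi in enumerate(y_te)]
    mse = sum(e * e for e in errors) / len(errors)
    return mse, {n: abs(c) for n, c in zip(names, coef[1:])}

def shave(names, train, test, min_keep=1):
    """Model, rank, drop the weakest predictor, and model again until min_keep remain."""
    current, history = list(names), []
    while True:
        score, importance = fit_and_rank(current, train, test)
        history.append((score, list(current)))
        if len(current) <= min_keep:
            return history
        current.remove(min(current, key=lambda n: importance[n]))

def make_data(rng, n):
    """Synthetic data (assumed): only x0 and x1 drive the target."""
    X = {f"x{j}": [rng.random() for _ in range(n)] for j in range(6)}
    y = [3 * X["x0"][i] + 2 * X["x1"][i] + rng.gauss(0, 0.1) for i in range(n)]
    return X, y

rng = random.Random(1)
train, test = make_data(rng, 100), make_data(rng, 100)
history = shave([f"x{j}" for j in range(6)], train, test)
best_score, best_set = min(history, key=lambda h: h[0])
print(best_set)
```

Each pass refits on the surviving predictors, so importance is re-estimated every cycle rather than taken once from a single background model, which is the main difference from the one-shot selection that Autodiscovery performs.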