- Automatic predictor selection from any number of candidates
- The analyst does not need to do any variable selection or data reduction.
- The best predictors are automatically identified.
- Ability to handle data without preprocessing
- Data do not need to be rescaled, transformed, or modified.
- RandomForests is resistant to outliers.
- Missing values are handled automatically.
- Resistance to overtraining
- Numerous trees are generated based on two forms of randomization.
- Growing a large number of RandomForests trees does not create a risk of overfitting.
- Each tree is an independent, random experiment.
- Self-testing using “out-of-bag” data
- Self-testing is based on an extension of cross-validation.
- Self-tests provide highly reliable assessments of the model.
- Cluster identification
- RandomForests can be used to generate tree-based clusters.
- Predictor variables defining clusters are chosen automatically.
- Visualization
- RandomForests offers graphics that yield new insights into data.
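
The two forms of randomization and the out-of-bag self-test listed above can be illustrated with a minimal sketch. This is not the RandomForests product's implementation: it is a toy forest of one-split trees ("stumps") in plain Python, and every name in it (`fit_stump`, `fit_forest`, `oob_error`) and the synthetic data are assumptions made for illustration. Each tree is grown on a bootstrap sample of the rows (first randomization) and may only split on a random subset of the predictors (second randomization). Rows left out of a tree's bootstrap sample are its out-of-bag rows, and each row is scored only by trees that never saw it.

```python
import random
from collections import Counter

def fit_stump(rows, labels, features):
    """Find the best single threshold split among the allowed features."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority class
        maj = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), maj, maj)
    return best[1:]

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def fit_forest(rows, labels, n_trees=60, n_features=2, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        # Randomization 1: bootstrap sample of rows (drawn with replacement).
        idx = [rng.randrange(n) for _ in range(n)]
        # Randomization 2: random subset of predictors allowed for the split.
        feats = rng.sample(range(len(rows[0])), n_features)
        stump = fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        oob = set(range(n)) - set(idx)  # rows this tree never saw
        forest.append((stump, oob))
    return forest

def oob_error(forest, rows, labels):
    """Score each row using only the trees for which it was out-of-bag."""
    wrong = scored = 0
    for i, (row, y) in enumerate(zip(rows, labels)):
        votes = Counter(predict_stump(s, row) for s, oob in forest if i in oob)
        if votes:
            scored += 1
            wrong += votes.most_common(1)[0][0] != y
    return wrong / scored if scored else float("nan")

# Synthetic data (an assumption): predictor 0 is informative, 1 and 2 are noise.
data_rng = random.Random(1)
rows = [(data_rng.random(), data_rng.random(), data_rng.random()) for _ in range(120)]
labels = [int(r[0] > 0.5) for r in rows]

forest = fit_forest(rows, labels)
err = oob_error(forest, rows, labels)
print(f"out-of-bag error estimate: {err:.3f}")
```

No separate test set is held out here: the error estimate comes entirely from rows each tree did not train on, which is the sense in which the model is self-testing. The data are never rescaled, and trees that happen to draw only noise predictors are outvoted by the rest of the forest.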

