RandomForests
Overview
Random Forests, Leo Breiman’s latest data mining technology, is based on learning ensembles of CART trees. By judiciously injecting randomness into the tree-building process and then combining hundreds of these trees, RF delivers high-performance predictive models and a variety of novel exploratory data analysis results. RF also incorporates new metric-free CLUSTER analyses that automatically select the variables used to define each cluster, so that different clusters may be defined by different variables.
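The idea of injecting randomness and then combining many trees can be illustrated with a minimal sketch. This is not the RandomForests implementation: it assumes one-split "stumps" in place of full CART trees, a single randomly chosen variable per tree, and simple majority voting. The function names (fit_stump, fit_forest, predict) and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Fit a one-split tree on one feature, minimizing training errors."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # An empty right side falls back to the left side's label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def fit_forest(rows, labels, n_trees=200, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        # Randomness 1: resample the training rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        # Randomness 2: restrict the split search to one random variable.
        feature = rng.randrange(n_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Combine the trees by majority vote."""
    votes = Counter(left if row[f] <= t else right
                    for f, t, left, right in forest)
    return votes.most_common(1)[0][0]

# Toy data (made up for this sketch): class "a" is low on both variables.
rows = [(1.0, 10.0), (2.0, 12.0), (3.0, 11.0), (4.0, 13.0),
        (7.0, 20.0), (8.0, 22.0), (9.0, 21.0), (10.0, 23.0)]
labels = ["a"] * 4 + ["b"] * 4

forest = fit_forest(rows, labels)
low_prediction = predict(forest, (1.0, 10.0))
high_prediction = predict(forest, (11.0, 24.0))
```

Any single stump here can be wrong, for example when its bootstrap sample happens to contain only one class, but the vote across hundreds of stumps is stable. That is the intuition behind combining many randomized trees, shown in a deliberately simplified form.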
Content and instructional methods
Attendees will see examples of analyses of real-world data. PowerPoint slides and live modeling runs will support the learning process.
Course Outline:
- The RandomForests Algorithm
- Key Innovations
- RF versus CART
- Class Weights
- Randomness in Split Selection
- Measuring Variable Importance
- Proximity Measure
- Scaling Coordinates
- Outlier Detection
- Missing Value Imputation

