By Phone or Online

Access the help you need to use our software from representatives who are knowledgeable in data mining and predictive analytics

Frequently Asked Questions for Random Forests
RandomForests® is a bagging tool that leverages the power of multiple alternative analyses, randomization strategies, and ensemble learning to produce accurate models, insightful variable importance ranking, and laser-sharp reporting on a record-by-record basis for deep data understanding. Its strengths are spotting outliers and anomalies in data, displaying proximity clusters, predicting future outcomes, identifying important predictors, discovering data patterns, replacing missing values with imputations, and providing insightful graphics.
Much of the insight provided by RandomForests is generated by methods applied after the trees are grown. These include new technology for identifying clusters or segments in data as well as new methods for ranking the importance of variables. The method was developed by Leo Breiman and Adele Cutler of the University of California, Berkeley, and is licensed exclusively to Salford Systems. Ongoing research is being undertaken by Salford Systems in collaboration with Professor Adele Cutler, the surviving co-author of RandomForests.
RandomForests is a collection of many CART trees that are not influenced by each other when constructed. The predictions of the individual trees are combined, by voting for classification and by averaging for regression, to determine the overall prediction of the forest. This algorithm is best suited for the analysis of complex data structures embedded in small to moderate data sets containing fewer than 10,000 rows but potentially millions of columns.

Scoring RandomForests models

Applying Models to New Data

Occasionally users ask us how to make use of a model they have just built and, specifically, how to generate predictions from a model. In this note we discuss RandomForests models, although the general ideas are relevant for any SPM-generated model.

What is RandomForests®?

RandomForests is a newly developed data analysis tool for data mining and predictive modeling. It generates and combines decision trees into predictive models and displays data patterns with a high degree of accuracy. The method was developed by Leo Breiman and Adele Cutler of the University of California, Berkeley, and is licensed exclusively to Salford Systems.

Quick Overview of Unsupervised Learning in Salford SPM

The SPM (Salford Predictive Modeler) software suite offers several tools for clustering and segmentation, including CART, RandomForests, and a classical statistical module, CLUSTER. In this article we illustrate the use of these tools with the well-known Boston Housing data set (pertaining to 1970s housing prices and neighborhood characteristics in the greater Boston area).

How does RandomForests work?

RandomForests is a collection of many CART® trees that are not influenced by each other when constructed. Two forms of randomization occur in RandomForests, one by tree and one by node. At the tree level, randomization takes place via observations: each tree is grown on its own random sample of the training records. At the node level, randomization occurs by searching for the best split among only a randomly selected subset of predictors. Each tree is grown to a maximal size and left unpruned. This process is repeated until a user-defined number of trees has been created, a collection called a random forest. Once the forest is built, the predictions from the individual trees are combined: the overall prediction is determined by voting for classification and by averaging for regression.
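The procedure described above can be illustrated with a small, self-contained Python sketch. It is not the Salford Systems implementation: the function names (`grow_tree`, `grow_forest`, `predict_forest`), the parameters `n_trees` and `n_try`, the toy data, the use of Gini impurity as the splitting criterion, and sampling observations with replacement are all assumptions made for illustration. It covers classification only; a regression forest would average the tree outputs instead of voting.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow_tree(rows, labels, n_try, rng):
    """Grow one unpruned tree. Leaves are labels; internal nodes are tuples."""
    if len(set(labels)) == 1:
        return labels[0]                                  # pure node: stop
    best = None
    # Node-level randomization: only n_try randomly chosen predictors compete.
    for j in rng.sample(range(len(rows[0])), n_try):
        vals = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for t in ((a + b) / 2 for a, b in zip(vals, vals[1:])):
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:                                      # no split possible
        return Counter(labels).most_common(1)[0][0]
    _, j, t, left, right = best
    return (j, t,
            grow_tree([rows[i] for i in left], [labels[i] for i in left], n_try, rng),
            grow_tree([rows[i] for i in right], [labels[i] for i in right], n_try, rng))

def predict_tree(tree, row):
    """Follow splits until a leaf label is reached (labels must not be tuples)."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def grow_forest(rows, labels, n_trees=25, n_try=1, seed=0):
    """Grow n_trees independent trees, each on its own random sample of rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Tree-level randomization: sample observations (here, with replacement).
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Classification: every tree casts one vote; the majority class wins."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups described by two predictors.
rows = [(1.0, 1.2), (1.5, 0.8), (0.9, 1.1), (1.2, 1.4),
        (5.0, 5.2), (5.5, 4.8), (4.9, 5.1), (5.2, 5.4)]
labels = ["a"] * 4 + ["b"] * 4

forest = grow_forest(rows, labels)
print(predict_forest(forest, (1.1, 1.0)), predict_forest(forest, (5.1, 5.0)))
```

Because every tree sees a different sample of observations and a different subset of predictors at each node, the trees disagree with one another in small ways, and the vote smooths out the errors of any single tree.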

What are RandomForests’ strengths?

RandomForests specializes in classification and regression problems. Its strengths are spotting outliers and anomalies in data, displaying proximity clusters, predicting future outcomes, identifying important predictors, discovering data patterns, replacing missing values with imputations, and providing insightful graphics. Additionally, it can provide clustering and density estimation.

Get In Touch With Us

Contact Us

9685 Via Excelencia, Suite 208, San Diego, CA 92126
Ph: 619-543-8880
Fax: 619-543-8888
info (at) salford-systems (dot) com