
Introduction to RandomForests (2004)

RandomForests is a randomized collection of CART trees designed for predictive accuracy in datasets containing many predictors and a small number of records. RandomForests was developed by Leo Breiman and Adele Cutler of the University of California, Berkeley.




The SPM Salford Predictive Modeler® software suite is a highly accurate and ultra-fast platform for creating predictive, descriptive, and analytical models from databases of any size, complexity, or organization. The SPM® software suite has automation that accelerates the process of model building by conducting substantial portions of the model exploration and refinement process for the analyst. While the analyst is always in full control, we optionally anticipate the analyst's next best steps and package a complete set of results from alternative modeling strategies for easy review. Do in one day what normally requires a week or more using other systems.

The Salford Predictive Modeler® software suite includes:

CART®:
This definitive classification tree was developed by world-renowned statisticians, including Doctors Jerome Friedman and Leo Breiman. CART is one of the most well-known data mining algorithms and is designed for both non-technical and technical users.
MARS®:
Ideal for users who prefer results in a form similar to traditional regression while capturing essential non-linearities and interactions.
TreeNet®:
TreeNet is Salford's most flexible and powerful data mining tool, capable of consistently generating extremely accurate models. It has been responsible for the majority of Salford's modeling competition awards and demonstrates remarkable performance. The algorithm, which handles both regression and classification, typically generates thousands of small decision trees built in a sequential error-correcting process that converges on a final model.
Random Forests®:
Random Forests' features include prediction, cluster and segment discovery, anomaly detection, and multivariate class description. The method was developed by Leo Breiman and Adele Cutler, both of the University of California, Berkeley.
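The sequential error-correcting process mentioned for TreeNet above can be illustrated with a minimal sketch. This is not Salford's implementation: it assumes squared-error regression on a single predictor, uses one-split trees (stumps) as the "small decision trees", and every name in it (fit_stump, boost, learning_rate, and so on) is hypothetical. Each new stump is fitted to the residual errors left by the trees built before it.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that best fits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # only one distinct x value: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=200, learning_rate=0.1):
    """Build stumps one after another, each correcting the current errors."""
    base = sum(ys) / len(ys)              # start from the mean response
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)  # fit the errors, not the raw target
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data (made up for illustration): y = x^2 on a small grid.
xs = [i / 10 for i in range(40)]
ys = [x * x for x in xs]
few = boost(xs, ys, n_trees=5)
many = boost(xs, ys, n_trees=200)
print(mse(few, xs, ys), mse(many, xs, ys))
```

In this sketch the training error shrinks as trees are added, because each stump only has to explain what the earlier ones missed. The small learning_rate makes each correction deliberately partial, which is why many trees are needed.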

New Components & Features available in version 8.0!

Generalized Path Seeker is Jerry Friedman's approach to regularized regression. This technology offers high-speed lasso for extreme data set configurations with upwards of 100,000 predictors and possibly very few rows. Such sets are commonplace in gene research and text mining. This is both supremely fast and efficient.
RuleLearner is a powerful post–processing technique that selects the most influential subset of nodes, thus reducing model complexity. RuleLearner allows the modeler to take advantage of the increased accuracy of very complicated TreeNet and Random Forests models, while still yielding the simplicity of CART models.


Random Forests Supported Filetypes


The RandomForests® data-translation engine supports data conversions for more than 80 file formats, including popular statistical-analysis packages such as SAS® and SPSS®, databases such as Oracle and Informix, and spreadsheets such as Microsoft Excel and Lotus 1-2-3.


Random Forests®


Breiman and Cutler’s Random Forests:
Random Forests is a bagging tool that leverages the power of multiple alternative analyses, randomization strategies, and ensemble learning to produce accurate models, insightful variable importance ranking, and laser-sharp reporting on a record-by-record basis for deep data understanding. Its strengths are spotting outliers and anomalies in data, displaying proximity clusters, predicting future outcomes, identifying important predictors, discovering data patterns, replacing missing values with imputations, and providing insightful graphics.
Cluster and Segment:
Much of the insight provided by Random Forests is generated by methods applied after the trees are grown, including new technology for identifying clusters or segments in data as well as new methods for ranking the importance of variables. The method was developed by Leo Breiman and Adele Cutler of the University of California, Berkeley, and is licensed exclusively to Salford Systems. Ongoing research is being undertaken by Salford Systems in collaboration with Professor Adele Cutler, the surviving co-author of Random Forests.
Suited for Wide Datasets:
Random Forests is a collection of many CART trees, each constructed independently of the others. The forest's overall prediction combines the predictions of the individual trees: a vote for classification, an average for regression. Random Forests is best suited for the analysis of complex data structures embedded in small to moderate data sets containing fewer than 10,000 rows but potentially millions of columns.
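The idea of independently built trees whose predictions are combined can be shown in a minimal sketch. It is an illustration under stated assumptions, not the Salford product's algorithm: one-split stumps stand in for full CART trees, the data are made up, and all names (grow_forest, n_features, predict_forest, and so on) are hypothetical. Each tree sees its own bootstrap sample of the records and a random subset of the predictors, and the forest classifies by majority vote.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, features):
    """Pick the (feature, threshold) among `features` with the fewest errors."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not right:
                continue  # a split must leave records on both sides
            left_label, right_label = majority(left), majority(right)
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no usable split: predict the majority class everywhere
        label = majority(labels)
        return features[0], float("inf"), label, label
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_label, right_label = stump
    return left_label if row[f] <= threshold else right_label


def grow_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Grow each tree independently on a bootstrap sample and random features."""
    rng = random.Random(seed)
    n = len(rows)
    all_features = list(range(len(rows[0])))
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]    # draw rows with replacement
        features = rng.sample(all_features, n_features)  # random predictor subset
        forest.append(fit_stump([rows[i] for i in sample],
                                [labels[i] for i in sample],
                                features))
    return forest


def predict_forest(forest, row):
    """Combine the trees by majority vote."""
    return majority([predict_stump(stump, row) for stump in forest])


# Toy data (made up for illustration): two well-separated classes.
rows = [[0, 1], [1, 0], [1, 1], [2, 1], [1, 2],
        [9, 10], [10, 9], [10, 10], [8, 9], [9, 8]]
labels = ["a"] * 5 + ["b"] * 5
forest = grow_forest(rows, labels)
print(predict_forest(forest, [0, 0]), predict_forest(forest, [10, 10]))
```

Because no tree depends on any other, the trees in such a scheme could be grown in any order or in parallel. For a regression forest the vote in predict_forest would be replaced by an average of the trees' numeric predictions.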





SPM® 8 Product Versions

The best of the best. For the modeler who must have access to the leading-edge technology available and the fastest run times, including major advances in ensemble modeling, interaction detection, and automation. ULTRA also provides advance access to new features as they become available in frequent upgrades.
For the modeler who needs cutting-edge data mining technology, including extensive automation of workflows typical for experienced data analysts and dozens of extensions to the Salford data mining engines.
A true predictive modeling workbench designed for the professional data miner. Includes a variety of supporting conventional statistical modeling tools, a programming language, reporting services, and a modest selection of workflow automation options.
Literally the basics. Salford Systems' award-winning data mining engines without extensions, automation, or the surrounding statistical services, programming language, and sophisticated reporting. Designed for small budgets while still delivering our world-famous engines.



Get In Touch With Us

Contact Us

9685 Via Excelencia, Suite 208, San Diego, CA 92126
Ph: 619-543-8880
Fax: 619-543-8888
info (at) salford-systems (dot) com