Salford Predictive Modeler™ Two-Day Training

AGENDA

Two-Day Salford Predictive Modeler Training
Hosted by Salford Systems

It is optional to bring your own laptop, software and data sets.
*If you would like evaluation software during the training, you must request it by e-mail and have it installed prior to the first day of training.

Day 1

9am – 10am Introduction to Predictive Modeling with Decision Trees Using CART

Discover the power of tree-structured data mining during this popular introductory seminar, geared toward statisticians and IT audiences who are interested in understanding the conceptual basis of decision tree technology -- what it is, why it works, how it has been used, and how it can help you make better business decisions.

  • Decision tree fundamentals
  • Decision tree applications
  • How to build and interpret CART models
10am – 10:15am Break
10:15am – 11:15am An Introduction to Salford Predictive Modeler

Explore SPM's unique modeling automation capabilities while running multiple data sets on both the GUI and non-GUI interfaces, and learn the advantages and disadvantages of each. We'll introduce the CART component of SPM, and explain:

  • Parts of the display
  • Variable importance
  • Summary reporting
  • Surrogates and competitors
  • How the utility handles missing values
11:15am – 11:30am Break
11:30am – 12:30pm

Introduction to the Powerful Use of Batteries in SPM

Major Battery Functions:

  • Battery target and handling missing values
  • Importance of the prior probabilities control in CART
  • Battery priors
  • Uses for hotspot detection
12:30pm – 1:30pm Lunch
1:30pm – 2:30pm

Introduction to Multivariate Adaptive Regression Splines (MARS)

Understand regression using MARS, its advantages and disadvantages, how it moves beyond the piecewise-constant solutions of regression trees, and how it builds on the regression component in CART.

Introduction to the core concepts of MARS:

  • Adaptive Modeling
  • Smooths, splines and knots
  • Basis function
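To make the terms in this list concrete, here is a minimal Python sketch of the hinge-shaped basis functions that MARS models are built from. The knot location, the coefficients, and the function names (`hinge`, `mars_predict`) are hypothetical values chosen for illustration; they are not output from SPM.

```python
def hinge(x, knot, direction=+1):
    """MARS-style basis function.

    direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x).
    """
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate intercept + sum of coef * hinge(x, knot, direction)."""
    return intercept + sum(coef * hinge(x, knot, d) for coef, knot, d in terms)


# Hypothetical model with a single knot at x = 3:
# slope 0.5 to the left of the knot, slope 2.0 to the right of it.
terms = [(2.0, 3.0, +1), (-0.5, 3.0, -1)]

for x in (1.0, 3.0, 5.0):
    print(x, mars_predict(x, 10.0, terms))
```

The pair of mirrored hinges shares one knot, so the fitted curve is continuous but changes slope there. This is what distinguishes a spline-style fit from the step-shaped predictions of a single regression tree.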
2:30pm – 2:45pm Break
2:45pm – 3:45pm

MARS in Action

Develop more accurate regression models for problems such as predicting credit card holder balances, insurance claim losses, and customer catalog orders.

Guide to reading the MARS output:

  • Build a MARS model in SPM
  • Understand the MARS interface
  • Control Parameters
  • How MARS handles categorical predictors
  • How MARS handles binary responses
3:45pm – 4pm Break
4pm – 5pm

Introduction to Ensemble-Based Modeling Techniques

RandomForests®, created by Leo Breiman and Adele Cutler, is based on learning ensembles of CART trees. By judiciously injecting randomness into the tree-building process and then combining hundreds of these trees, RF delivers high-performance predictive models and a variety of novel exploratory data analysis results. RF also incorporates new metric-free cluster analyses that automatically select the variables used to define each cluster, with potentially different variables defining each cluster.

Day 2

9am – 10am

Introduction to Boosting Using Decision Trees

TreeNet stochastic gradient boosting is Stanford University Professor Jerome Friedman's latest advance in data mining methodology. In TreeNet, classification and regression models are built up gradually through a potentially large collection of small trees, each of which improves on its predecessors through an error-correcting strategy. Although each tree may have only one split, the full model can be extraordinarily accurate. The final model takes the form of a series expansion in which every term is a (small) tree.

TreeNet improves over conventional boosting in that:

  • It is relatively impervious to errors in the target, such as mislabeling
  • It is strongly resistant to overfitting
  • It generalizes well to future data
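The idea of a model built up from many one-split trees, each correcting the errors of its predecessors, can be sketched in a few lines of Python. This is a generic gradient-boosting illustration under simplifying assumptions (one predictor, squared-error loss, no random subsampling of rows), not TreeNet's actual implementation. All function names, the learning rate, and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the one-split tree (stump) that best fits the residuals."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def stump_value(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=200, learn_rate=0.1):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learn_rate * stump_value(stump, x)
                 for p, x in zip(preds, xs)]
    return {"base": base, "learn_rate": learn_rate, "stumps": stumps}


def predict(model, x):
    """The model is a series expansion: a constant plus one term per stump."""
    return model["base"] + model["learn_rate"] * sum(
        stump_value(s, x) for s in model["stumps"])


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (assumed for illustration): y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]

baseline = boost(xs, ys, n_trees=0)
boosted = boost(xs, ys, n_trees=200)
print("MSE with 0 trees:  ", mse(baseline, xs, ys))
print("MSE with 200 trees:", mse(boosted, xs, ys))
```

The small learning rate means each stump contributes only a fraction of its fit, which is why a large collection of trees is needed. Stochastic gradient boosting additionally fits each tree on a random subsample of the rows, which this sketch leaves out.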
10:00am – 10:15am Break
10:15am – 11:15am

TreeNet in Action

Explore SPM's unique modeling automation capabilities while running multiple data sets on both the GUI and non-GUI interfaces, and learn the advantages and disadvantages of each. We'll work with the TreeNet component of SPM, and explain:

  • Building models in SPM
  • Setting control parameters
  • Interpreting output
  • Variable importance
  • Introduction to battery shaving in SPM using TreeNet
11:15am – 11:30am Break
11:30am – 12:30pm

Interpreting TreeNet Models and Interaction Detection

TreeNet's interaction detection component finds and reports interactions among predictors, and the Interaction Control Language (ICL) lets you control which interactions a model may use.

You will understand:

  • How to shape the structure of interactions
  • How to impose different interactions in TreeNet models
  • Dependency plots and how they are utilized
12:30pm – 1:30pm Lunch
1:30pm – 2:30pm

Modern Approaches To Regularized Regression

Generalized Path Seeker (GPS) is the most recent advance in regularized regression. This technology offers high-speed LASSO-style regression for extreme data set configurations with upwards of 100,000 predictors and possibly very few rows. Such data sets are commonplace in gene research, and GPS is designed to handle them quickly and efficiently.

  • Application using examples in data sets
  • How GPS is implemented in SPM
  • Command line operation of SPM
  • Comments on parallel processing
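For readers new to LASSO-style regression, the following Python sketch shows the basic mechanism on a data set with more predictors than rows. It uses textbook cyclic coordinate descent with soft thresholding, which is an assumption made for illustration and is not the GPS algorithm. The function names, the penalty value, and the toy data are all hypothetical, and no intercept is fitted.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within +/-penalty become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(rows, targets, penalty, n_sweeps=50):
    """Minimise (1/(2n)) * sum((y - Xb)^2) + penalty * sum(|b|)."""
    n, p = len(rows), len(rows[0])
    coefs = [0.0] * p
    residuals = list(targets)  # equals y - Xb while every coefficient is 0
    for _ in range(n_sweeps):
        for j in range(p):
            column = [row[j] for row in rows]
            norm = sum(v * v for v in column) / n
            if norm == 0.0:
                continue
            # Correlation of column j with the residual, after adding
            # predictor j's own current contribution back in.
            rho = sum(v * (r + v * coefs[j])
                      for v, r in zip(column, residuals)) / n
            new_coef = soft_threshold(rho, penalty) / norm
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residuals = [r - v * delta for r, v in zip(residuals, column)]
                coefs[j] = new_coef
    return coefs


# Toy data (assumed): 4 rows, 6 predictors, target depends on predictor 0 only.
rows = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 0.0],
    [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0],
    [1.0, -1.0, -1.0, 1.0, 0.0, 0.0],
    [-1.0, -1.0, 1.0, 1.0, 0.0, -1.0],
]
targets = [2.0 * row[0] for row in rows]

coefs = lasso_coordinate_descent(rows, targets, penalty=0.5)
print(coefs)
```

The penalty shrinks the one useful coefficient and sets the others exactly to zero, which is how this family of methods performs variable selection when predictors far outnumber rows.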
2:30pm – 2:45pm Break
2:45pm – 3:45pm

Linking Engines

Explore how to:

  • Combine TreeNet’s power of transformation and variable selection with GPS
  • Identify the most influential trees in GPS with ISLE (Importance Sampled Learning Ensembles)
  • Use RuleFit to identify the most influential nodes and rules in a TreeNet model
3:45pm – 4pm Break
4pm – 5pm

Loose Ends and Application

Q&A with the experts for further discussion, plus time to apply SPM to your own data sets.