
Why does the tree change when non-splitting variables are dropped?

If a variable does not enter the tree as a primary node splitter, it may still play an important role in the tree as a surrogate splitter. If you have turned off the display of surrogate splitters, you will not see how these variables affect the tree, but CART still uses them internally when applying the tree to data. The Variable Importance Table produced by CART ranks the variables in the tree by their importance, a statistic that measures how strongly a variable acts as a primary or surrogate splitter.
Suppose a variable enters the tree as the top surrogate splitter in many nodes, but never as the primary splitter. If this variable is removed from the list of potential predictor variables and the tree is rebuilt, the result will probably be a very different tree. It certainly will be if the data contain missing values for the primary node-splitting variables, because surrogates decide which child node receives a case whose primary splitter value is missing.
Steinberg, Dan and Colla, Phillip. CART—Classification and Regression Trees. San Diego, CA: Salford Systems, 1997.
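The role of a surrogate can be illustrated with a minimal Python sketch. This is not CART's implementation: the toy data, the variable names (`income`, `age`), the thresholds, and the fallback direction are all assumptions chosen for illustration. It shows a surrogate split that agrees with the primary split on complete cases and then routes cases whose primary value is missing.

```python
# Hypothetical toy data; None marks a missing value.
rows = [
    {"income": 20, "age": 25},
    {"income": 30, "age": 30},
    {"income": 60, "age": 45},
    {"income": 80, "age": 50},
    {"income": None, "age": 55},
    {"income": None, "age": 22},
]

def goes_left(row, var, threshold):
    """Direction for the split `var <= threshold`; None if the value is missing."""
    value = row[var]
    return None if value is None else value <= threshold

def agreement(rows, primary, surrogate):
    """Share of complete cases that both splits send in the same direction."""
    pairs = [(goes_left(r, *primary), goes_left(r, *surrogate)) for r in rows]
    pairs = [(p, s) for p, s in pairs if p is not None and s is not None]
    return sum(p == s for p, s in pairs) / len(pairs)

def route(row, primary, surrogates, default="left"):
    """Try the primary split, then each surrogate in order, then a default."""
    for var, threshold in [primary, *surrogates]:
        direction = goes_left(row, var, threshold)
        if direction is not None:
            return "left" if direction else "right"
    return default  # assumed fallback when every splitter is missing

primary = ("income", 40)   # assumed primary splitter
surrogate = ("age", 35)    # assumed top surrogate

with_surrogate = [route(r, primary, [surrogate]) for r in rows]
without_surrogate = [route(r, primary, []) for r in rows]
print(agreement(rows, primary, surrogate))
print(with_surrogate)
print(without_surrogate)
```

In this sketch the fifth case goes right when `age` is available as a surrogate and left when it is not. A change like that alters the data reaching each child node, and so every split grown beneath it.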
Another possible explanation lies in the way CART grows trees. Normally, CART first grows a maximal tree and then tests it, either through cross-validation or with a separate test sample. If a split does not hold up to testing, it is removed from the model. Thus, if a model splits one or more times on a particular variable, but none of these splits holds up to testing, the variable will not appear as a primary splitter in the final model. However, if the variable is dropped, the splits involving that variable in the maximal tree might be replaced by others, and those replacements may survive testing and appear in the final tree.
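The grow-then-test idea can be sketched in Python as follows. This is a simplification under stated assumptions: it uses a reduced-error style of pruning against a test sample rather than CART's own pruning procedure, and the tree, the variable names (`x1`, `x2`), and the test rows are hypothetical.

```python
# Hypothetical maximal tree: internal nodes carry a split and a majority class.
maximal = {
    "var": "x1", "thr": 5, "majority": 1,
    "left": {"predict": 0},
    "right": {
        "var": "x2", "thr": 3, "majority": 1,
        "left": {"predict": 1},
        "right": {"predict": 0},
    },
}

# Hypothetical test sample; "y" is the true class.
test_rows = [
    {"x1": 2, "x2": 1, "y": 0},
    {"x1": 4, "x2": 6, "y": 0},
    {"x1": 7, "x2": 2, "y": 1},
    {"x1": 8, "x2": 5, "y": 1},
    {"x1": 9, "x2": 4, "y": 1},
]

def predict(node, row):
    """Follow splits until a leaf is reached."""
    while "var" in node:
        node = node["left"] if row[node["var"]] <= node["thr"] else node["right"]
    return node["predict"]

def errors(node, rows):
    """Count misclassified rows."""
    return sum(predict(node, r) != r["y"] for r in rows)

def prune(node, rows):
    """Bottom-up: collapse a split into a leaf unless it lowers test error."""
    if "var" not in node:
        return node
    left_rows = [r for r in rows if r[node["var"]] <= node["thr"]]
    right_rows = [r for r in rows if r[node["var"]] > node["thr"]]
    node = dict(node,
                left=prune(node["left"], left_rows),
                right=prune(node["right"], right_rows))
    leaf = {"predict": node["majority"]}
    return leaf if errors(leaf, rows) <= errors(node, rows) else node

def split_vars(node):
    """Set of variables used as primary splitters in the tree."""
    if "var" not in node:
        return set()
    return {node["var"]} | split_vars(node["left"]) | split_vars(node["right"])

pruned = prune(maximal, test_rows)
print(sorted(split_vars(maximal)))
print(sorted(split_vars(pruned)))
```

Here `x2` splits in the maximal tree, but its split adds test errors and is removed, so `x2` is absent from the final tree. Had `x2` been dropped before growing, a different split could have taken its place in the maximal tree and might have survived testing.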



