What is variable importance?

CART automatically produces a predictor ranking (also known as variable importance) based on the contribution each predictor makes to the construction of the tree. Predictor rankings are strictly relative to a specific tree; change the tree and you might get very different rankings. A predictor earns importance by playing a role in the tree, either as a primary splitter or as a surrogate. CART users have the option of fine-tuning the variable importance algorithm.

Variable importance for a particular predictor is the sum, across all nodes in the tree, of the improvement scores that the predictor has when it acts as a primary or a surrogate (but not as a competitor) splitter. Specifically, for node i, if the predictor appears as the primary splitter, its contribution toward importance is:

importance_contribution_node_i = improvement

If the predictor instead appears as the nth surrogate rather than as the primary splitter, its contribution is:

importance_contribution_node_i = (p ^ n) * improvement

in which p is the “surrogate improvement weight”: a user-controlled parameter that is equal to 1.0 by default and can be set anywhere between 0 and 1. Thus, you are able to specify that surrogate splits contribute less towards a predictor's importance than do primary splits. Because the weight is raised to the power n, any p below 1 discounts lower-ranked surrogates more heavily than higher-ranked ones. This parameter is controlled with the BOPTIONS IMPORTANCE option.
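The two expressions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not CART's actual implementation: the node layout, the function name variable_importance, and the predictor names and improvement values are all hypothetical, and the sketch assumes each surrogate carries its own improvement score.

```python
from collections import defaultdict

def variable_importance(nodes, p=1.0):
    """Sum importance contributions over all nodes of a tree.

    Each node is a dict (hypothetical layout):
      "primary":    (predictor_name, improvement)
      "surrogates": [(predictor_name, improvement), ...] ordered by rank,
                    so the first entry is the 1st surrogate.
    p is the surrogate improvement weight, between 0 and 1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    importance = defaultdict(float)
    for node in nodes:
        name, improvement = node["primary"]
        importance[name] += improvement  # primary splitter: full improvement
        for n, (name, improvement) in enumerate(node.get("surrogates", []), start=1):
            importance[name] += (p ** n) * improvement  # nth surrogate: weighted by p^n
    return dict(importance)

# Made-up two-node tree, for illustration only.
nodes = [
    {"primary": ("AGE", 0.5), "surrogates": [("INCOME", 0.25), ("TENURE", 0.125)]},
    {"primary": ("INCOME", 0.25), "surrogates": [("AGE", 0.125)]},
]

default_scores = variable_importance(nodes)            # p = 1.0, surrogates count in full
discounted_scores = variable_importance(nodes, p=0.5)  # surrogates discounted

print(default_scores)
print(discounted_scores)
```

With p = 1.0, AGE outranks INCOME by only a small margin; lowering p to 0.5 widens the gap, because more of INCOME's score comes from its role as a surrogate.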

Linear combination splits do not contribute in any way to variable importance.

If, in the absence of linear combinations, the surrogate improvement weight is greater than 0 and a variable has importance = 0.0, then that variable does not appear anywhere in the tree as a primary or surrogate splitter, although it may appear as a competitor.
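The condition that the weight be greater than 0 matters: with a weight of exactly 0, a variable that appears only as a surrogate also scores 0.0. The small sketch below illustrates this on made-up data; the helper names, the tuple layout for nodes, and the predictor names are hypothetical and not part of CART.

```python
def importance(nodes, p):
    """Hypothetical helper. Each node is a tuple:
    (primary_name, primary_improvement, [(surrogate_name, improvement), ...])."""
    totals = {}
    for primary, gain, surrogates in nodes:
        totals[primary] = totals.get(primary, 0.0) + gain
        for n, (name, s_gain) in enumerate(surrogates, start=1):
            totals[name] = totals.get(name, 0.0) + (p ** n) * s_gain
    return totals

# One-node tree: AGE is the primary splitter, INCOME its first surrogate.
nodes = [("AGE", 0.5, [("INCOME", 0.25)])]
# REGION is never used as a primary or surrogate splitter.
predictors = ["AGE", "INCOME", "REGION"]

def zero_importance(p):
    """Return the predictors whose importance is exactly 0.0 for weight p."""
    scores = importance(nodes, p)
    return [v for v in predictors if scores.get(v, 0.0) == 0.0]

print(zero_importance(0.5))  # only the variable absent from the tree
print(zero_importance(0.0))  # surrogate-only variables now score zero as well
```

In this example, a zero score identifies REGION alone when p = 0.5, but both INCOME and REGION when p = 0.0, so a zero score rules out a surrogate role only when the weight is positive.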