Can We Obtain Dependency Plots for Single CART® Trees?

The short answer is YES such plots can be generated. Historically, we concluded that such graphs would normally not be that interesting as they would frequently be single step functions reflecting the fact that individual variables often appear only once or twice in a tree. Also, such graphs would not properly reflect the effect of a varible across most of its range of values. Thus, as of SPM® 7.0 CART® does not offer such plots. However, we can see what such plots would look like by using TreeNet® to grow a one-tree model. To do this, just set up a normal model, choose the TreeNet analysis method, and set the number of trees to be grown to 1 (see green arrow below).

What if I cannot apply a tree to new data?

You wish to apply your results to new data, but CASE will not accept the data.

What is CART®?

CART® is an acronym for Classification and Regression Trees, a decision-tree procedure introduced in 1984 by world-renowned UC Berkeley and Stanford statisticians, Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone. Their landmark work created the modern field of sophisticated, mathematically- and theoretically-founded decision trees. The CART methodology solves a number of performance, accuracy, and operational problems that still plague many other current decision-tree methods. CART's innovations include:

How are nominal (ordered) predictors and rank related?

One of the strengths of CART® is that, for ordered predictors, the only information CART uses are the rank orders of the data – not the actual value of the data. In other words, if you replace a predictor with its rank order, the CART tree will be unchanged.

What if there are too many levels in a categorical predictor?

CART® will only search over all possible subsets of a categorical predictor for a limited number of levels. Beyond a threshold set by computational feasibility, CART will simply reject the problem. You can control this limit with the BOPTION NCLASSES = m command, but be aware that for m larger than 15, computation times increase dramatically.

What makes Salford Systems' CART® the only "true" CART?

Salford Systems' CART® is the only decision tree based on the original code of Breiman, Friedman, Olshen, and Stone. Because the code is proprietary, CART is the only true implementation of this classification-and-regression-tree methodology. In addition, the procedure has been substantially enhanced with new features and capabilities in exclusive collaboration with CART's creators. While some other decision-tree products claim to implement selected features of this technology, they are unable to reproduce genuine CART trees and lack key performance and accuracy components. Further, CART's creators continue to collaborate with Salford Systems to refine CART and to develop the next generation of data-mining tools.

What is cross validation?

Cross-validation is a method for estimating what the error rate of a sub-tree (of the maximal tree) would be if you had test data. Regardless of what value you set for V-fold cross validation, CART grows the same maximal tree. The monograph provides evidence that using a V of 10-20 gives better results than using a smaller number, but each number could result in a slightly different error estimate. The optimal tree — which is derived from the maximal tree by pruning — could differ from one V to another because each cross-validation run will come up with slightly different estimates of the error rates of sub-trees and thus might differ in which tree was actually best.

What is variable importance?

CART® automatically produces a predictor ranking (also known as variable importance) based on the contribution predictors make to the construction of the tree. Predictor rankings are strictly relative to a specific tree; change the tree and you might get very different rankings. Importance is determined by playing a role in the tree, either as a main splitter or as a surrogate. CART users have the option of fine tuning the variable importance algorithm.

Get In Touch With Us

Request online support

Ph: 619-543-8880
9685 Via Excelencia, Suite 208, San Diego, CA 92126