
What is the SYSTAT dataset format?

CART and MARS continue to read data stored in the legacy SYSTAT format, a binary (i.e., not human-readable) format widely used by statisticians and researchers who work with the SYSTAT statistical programs. Relative to comma-separated text and some other binary formats, the legacy SYSTAT format is quite restrictive (limited variable name lengths, limited lengths of character data), and we do not recommend that you use it. However, for clients who do need to work with this format, we provide the following C and Fortran programs that illustrate how legacy SYSTAT datasets are structured. Legacy SYSTAT files were originally written and read with Fortran code, so the format carries the record segmentation and padding typical of Fortran I/O. The Fortran version gets this handling from the language's own I/O, whereas the C version must handle these issues explicitly.
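To give a feel for what "handling record segmentation explicitly" means, here is a minimal Python sketch of reading length-framed binary records, the general style produced by Fortran unformatted sequential I/O. It is an illustration under stated assumptions, not a description of the legacy SYSTAT layout: the 4-byte little-endian markers, the helper names `frame` and `read_framed_records`, and the sample payloads are all hypothetical, and marker width and byte order vary by compiler. The C and Fortran programs mentioned above remain the reference for the actual format.

```python
import io
import struct


def frame(payload, marker_fmt="<i"):
    """Wrap a payload in leading and trailing length markers (assumed layout)."""
    marker = struct.pack(marker_fmt, len(payload))
    return marker + payload + marker


def read_framed_records(stream, marker_fmt="<i"):
    """Read records of the form [length][payload][length] until end of stream.

    marker_fmt is an assumption: many Fortran compilers use 4-byte markers,
    but the width and byte order are not fixed by the language.
    """
    size = struct.calcsize(marker_fmt)
    records = []
    while True:
        head = stream.read(size)
        if not head:
            break  # clean end of file
        if len(head) < size:
            raise ValueError("truncated leading length marker")
        (length,) = struct.unpack(marker_fmt, head)
        if length < 0:
            raise ValueError("negative record length")
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record payload")
        tail = stream.read(size)
        if len(tail) < size or struct.unpack(marker_fmt, tail)[0] != length:
            raise ValueError("trailing length marker does not match")
        records.append(payload)
    return records


# Hypothetical two-record file: a variable name, then one double value.
data = frame(b"AGE") + frame(struct.pack("<d", 42.0))
records = read_framed_records(io.BytesIO(data))
print(records[0], struct.unpack("<d", records[1])[0])
```

A Fortran program never sees these markers because the runtime adds and strips them. Code in any other language has to do the bookkeeping itself, which is the kind of work the C version takes on.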


What if there are too many levels in a categorical predictor?

CART searches over all possible subsets of a categorical predictor's levels only when the number of levels is limited. Beyond a threshold set by computational feasibility, CART will simply reject the problem. You can control this limit with the BOPTION NCLASSES = m command, but be aware that for m larger than 15, computation times increase dramatically.
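The growth in computation follows from simple counting: a predictor with k levels can be divided into two non-empty groups in 2^(k-1) - 1 distinct ways. The Python sketch below counts and, for small k, enumerates these candidate splits. It illustrates only the combinatorics of an exhaustive subset search; the function names are hypothetical and nothing here is taken from CART's actual implementation.

```python
from itertools import combinations


def n_binary_splits(k):
    """Number of ways to divide k levels into two non-empty groups."""
    return 2 ** (k - 1) - 1 if k >= 2 else 0


def enumerate_splits(levels):
    """List every (left, right) split; the first level is pinned to the left
    group so that mirror-image splits are not counted twice."""
    levels = list(levels)
    if len(levels) < 2:
        return []
    first, rest = levels[0], levels[1:]
    splits = []
    # r ranges over 0..len(rest)-1, so the right group is never empty.
    for r in range(len(rest)):
        for combo in combinations(rest, r):
            left = {first, *combo}
            right = set(levels) - left
            splits.append((left, right))
    return splits


for k in (5, 10, 15, 20, 25):
    print(f"{k:>2} levels -> {n_binary_splits(k):,} candidate splits")
```

At 15 levels there are 16,383 candidate splits to evaluate at each node, and the count roughly doubles with every additional level.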


What makes Salford Systems' CART the only "true" CART?

Salford Systems' CART is the only decision tree based on the original code of Breiman, Friedman, Olshen, and Stone. Because the code is proprietary, CART is the only true implementation of this classification-and-regression-tree methodology. In addition, the procedure has been substantially enhanced with new features and capabilities in exclusive collaboration with CART's creators. While some other decision-tree products claim to implement selected features of this technology, they are unable to reproduce genuine CART trees and lack key performance and accuracy components. Further, CART's creators continue to collaborate with Salford Systems to refine CART and to develop the next generation of data-mining tools.


Can We Obtain Dependency Plots for Single CART Trees?

The short answer is yes: such plots can be generated. Historically, we concluded that such graphs would usually not be very interesting, because they would frequently be simple step functions, reflecting the fact that individual variables often appear only once or twice in a tree. Such graphs would also fail to reflect the effect of a variable across most of its range of values. Thus, as of SPM 7.0, CART does not offer such plots. However, we can see what they would look like by using TreeNet to grow a one-tree model. To do this, set up a normal model, choose the TreeNet analysis method, and set the number of trees to be grown to 1 (see green arrow below).


What is CART?

CART is an acronym for Classification and Regression Trees, a decision-tree procedure introduced in 1984 by world-renowned UC Berkeley and Stanford statisticians, Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone. Their landmark work created the modern field of sophisticated, mathematically- and theoretically-founded decision trees. The CART methodology solves a number of performance, accuracy, and operational problems that still plague many other current decision-tree methods. CART's innovations include:

Continue Reading

Get In Touch With Us

Contact Us

9685 Via Excelencia, Suite 208, San Diego, CA 92126
Ph: 619-543-8880
Fax: 619-543-8888
info (at) salford-systems (dot) com