By Phone or Online

Access the help you need to use our software from representatives who are knowledgeable in data mining and predictive analytics

What are "intelligent surrogates for missing values"?

CART handles missing values by using "surrogate splitters": back-up rules that closely mimic the action of the primary splitting rule. Suppose that, in a given model, CART splits the data on household income. If a record's income value is not available, CART might use education level as a good surrogate.
The surrogate splitter contains information that is typically similar to what would be found in the primary splitter. Other products treat all records with missing values as if they shared the same unknown value, so all such "missings" are assigned to the same bin. In CART, each record is instead processed using the data specific to that record. Records with different data patterns can therefore be handled differently, which results in a better characterization of the data.
By using surrogates to stand in for missing values, CART generates robust and reliable predictive models, even when applied to very large databases with hundreds of variables and many missing values. CART's identification of surrogate predictor variables also provides an effective way to discover low-cost predictive mechanisms: if the best splitting criterion in a tree involves an expensive or difficult-to-obtain measure, a less expensive surrogate can be considered instead.
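
The income and education example can be illustrated with a short Python sketch. This is not CART's actual implementation: the function names, the toy data, and the simple "fraction of records sent the same way" agreement score are all assumptions made for illustration. The sketch picks the education threshold that best mimics a primary split on income, then uses it to route records whose income is missing.

```python
from typing import Optional, Sequence, Tuple

Value = Optional[float]  # None stands for a missing value


def goes_left(value: float, threshold: float) -> bool:
    """A record goes left when its value is at or below the threshold."""
    return value <= threshold


def agreement(primary: Sequence[Value], surrogate: Sequence[Value],
              p_thr: float, s_thr: float) -> float:
    """Fraction of records (with both values present) that the surrogate
    rule sends the same way as the primary rule."""
    pairs = [(p, s) for p, s in zip(primary, surrogate)
             if p is not None and s is not None]
    if not pairs:
        return 0.0
    same = sum(goes_left(p, p_thr) == goes_left(s, s_thr) for p, s in pairs)
    return same / len(pairs)


def best_surrogate_threshold(primary: Sequence[Value],
                             surrogate: Sequence[Value],
                             p_thr: float) -> Tuple[float, float]:
    """Try each observed surrogate value as a threshold and keep the one
    that agrees most often with the primary split (illustrative only)."""
    candidates = sorted({s for s in surrogate if s is not None})
    if not candidates:
        raise ValueError("surrogate variable has no observed values")
    best = max(candidates,
               key=lambda t: agreement(primary, surrogate, p_thr, t))
    return best, agreement(primary, surrogate, p_thr, best)


def route(income: Value, education: Value,
          income_thr: float, edu_thr: float) -> str:
    """Use the primary rule when income is present, else the surrogate."""
    if income is not None:
        return "left" if goes_left(income, income_thr) else "right"
    if education is not None:
        return "left" if goes_left(education, edu_thr) else "right"
    return "left"  # assumed fallback when both values are missing


# Hypothetical toy data: income in thousands, education in years.
income = [20, 35, 50, 80, 120, None, None]
education = [10, 12, 14, 16, 18, 11, 17]
income_thr = 45  # assumed primary split: income <= 45 goes left

edu_thr, score = best_surrogate_threshold(income, education, income_thr)
routes = [route(i, e, income_thr, edu_thr) for i, e in zip(income, education)]
```

In this toy data the two records with missing income are sent in different directions because their education values differ, which is the contrast with placing all "missings" in a single bin.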

Get In Touch With Us

Contact Us

9685 Via Excelencia, Suite 208, San Diego, CA 92126
Ph: 619-543-8880
Fax: 619-543-8888
info (at) salford-systems (dot) com