Long Description: This guide provides a detailed overview of classification modeling in CART. It covers the full set of options available during model setup and walks through all available output reports and displays. A simple dataset from the biomedical field is used to illustrate the key concepts.
Introduction

Setting up a Classification Model in CART

Modeling Dataset

We start by walking through a simple classification problem taken from the biomedical literature: low birth weight in newborns. The task is to understand the primary factors that lead to a baby being born significantly underweight. Public health researchers consider the topic important because low birth weight babies can impose significant burdens and costs on the healthcare system. A cutoff of 2500 grams is typically used to define a low birth weight baby.

The following variables are available:

• LOW
- Birth weight less than 2500 grams (coded 1 if <2500, 0 otherwise).
• AGE
- Mother’s age.
• FTV
- Number of first trimester physician visits.
• HT
- History of hypertension (coded 1 if present, 0 otherwise).
• LWD
- Low mother’s weight at last menstrual period (coded 1 if
• PTD
- Occurrence of pre-term labor (coded 1 if present, 0 otherwise).
• RACE
- Mother’s ethnicity (coded 1, 2 or 3).
• SMOKE
- Smoking during pregnancy (coded 1 if smoked, 0 otherwise).
• UI
- Uterine irritability (coded 1 if present, 0 otherwise).
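
The coding of the target variable LOW can be sketched in a few lines of Python. This is only an illustration of the rule stated above (1 if the birth weight is below 2500 grams, 0 otherwise), not part of CART itself. The function name code_low, the constant name, and the sample weights are hypothetical and do not come from the dataset.

```python
# Cutoff stated in the text: a birth weight below 2500 grams counts as low.
LOW_BIRTH_WEIGHT_CUTOFF_G = 2500


def code_low(birth_weight_grams: float) -> int:
    """Return 1 if the birth weight is below the cutoff, 0 otherwise."""
    return 1 if birth_weight_grams < LOW_BIRTH_WEIGHT_CUTOFF_G else 0


# Hypothetical birth weights in grams, for illustration only.
example_weights = [2300, 2500, 3100]
low_flags = [code_low(w) for w in example_weights]
print(low_flags)  # [1, 0, 0]
```

Note that a weight of exactly 2500 grams is coded 0, because the rule uses a strict "less than" comparison.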

As you might guess, we are going to explore whether characteristics of the mother, including demographics, health status, and behavior, influence the probability of a low birth weight baby.

