Algorithm Description
CART and CHAID are both Decision Tree machine learning algorithms. Their objective is to find splits (segments) of the dataset that do a good job of differentiating records with respect to the target variable. These segments are created by iteratively splitting the dataset at key values of the most important predictor variables. Decision tree algorithms differ mainly in how they determine the most important predictors and the key values on which to split the dataset.
CART – Classification and Regression Trees
CART is a binary decision tree algorithm that can be used for classification or regression modeling problems. The tree is built by starting at a ‘root’ node containing all of the data, cycling through the input variables, and choosing a split point on each variable. Each ‘parent’ node is split into two child nodes, and this continues recursively until the tree is constructed. The ‘leaf’ nodes of the tree contain the value of the dependent variable (y), which defines the prediction; each data point traverses the tree until it reaches a leaf and a prediction is made. The split variables and split points are chosen by a greedy algorithm that minimizes a cost function, and tree construction ends when a predefined stopping criterion is met.
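To make the greedy search concrete, here is a minimal Python sketch of a single CART split on one numeric predictor, using Gini impurity as the cost function for a binary target. The function names and data are illustrative, not Lityx IQ internals.

```python
import numpy as np

def gini(y):
    """Gini impurity of a set of 0/1 labels."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 1.0 - p**2 - (1.0 - p)**2

def best_split(x, y):
    """Find the threshold on x that minimizes the weighted Gini
    impurity of the two child nodes (the CART classification cost)."""
    best_t, best_cost = None, np.inf
    for t in np.unique(x)[:-1]:          # candidate split points
        left, right = y[x <= t], y[x > t]
        cost = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([0, 0, 1, 1, 1, 1])
print(best_split(x, y))   # best threshold is 2 (perfect split, cost 0.0)
```

A full CART implementation simply applies this search to every input variable at every node and keeps the variable/threshold pair with the lowest cost.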
CHAID – Chi-squared Automatic Interaction Detection
CHAID is a decision tree algorithm that determines splits based on statistical tests. As with other decision trees, the algorithm cycles through the predictors to determine the appropriate category splits. It does this using either a chi-square test (categorical response) or an F-test (continuous response) to find the splits that best “explain” the response variable. Using a pre-specified significance level (p-value), if the test shows that the split variable and the response are independent, the algorithm stops growing the tree at that node. Otherwise, the split is created, and the search continues for the next best split.
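As a concrete illustration, the following Python sketch scores a candidate categorical split against a categorical response with scipy's chi-square independence test and applies the p-value stopping rule. The names and toy data are illustrative, not Lityx IQ internals.

```python
from scipy.stats import chi2_contingency

def split_p_value(split_labels, response):
    """p-value of a chi-square independence test between a candidate
    categorical split and a categorical response."""
    cats = sorted(set(split_labels))
    classes = sorted(set(response))
    table = [[sum(1 for s, r in zip(split_labels, response)
                  if s == c and r == k)
              for k in classes]
             for c in cats]
    chi2, p, dof, expected = chi2_contingency(table)
    return p

# Toy data: category A is mostly "yes", category B is mostly "no"
split = ["A"] * 10 + ["B"] * 10
resp = ["yes"] * 8 + ["no"] * 2 + ["no"] * 8 + ["yes"] * 2

p = split_p_value(split, resp)
print(round(p, 4))        # ~0.0253 with Yates' continuity correction
if p <= 0.05:             # pre-specified significance level
    print("make the split")
else:
    print("stop growing this branch")
```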
Additional Links
https://sefiks.com/2018/08/27/a-step-by-step-cart-decision-tree-example/
https://sefiks.com/2020/03/18/a-step-by-step-chaid-decision-tree-example/
https://www.listendata.com/2015/03/difference-between-chaid-and-cart.html
http://www.bzst.com/2006/10/classification-trees-cart-vs-chaid.html
https://machinelearningmastery.com/classification-and-regression-trees-for-machine-learning/
Lityx IQ Parameters
CART
Minimum Observations Needed to Split - The minimum number of observations allowed at a node for the node to be further split into sub-nodes.
Minimum Observations in a Child Node - The minimum number of observations allowed in a resulting child node of the potential split.
Maximum Tree Depth - The maximum number of levels allowed in the tree.
Splitting Criterion - The cost function used to determine variable split points. Gini is intended for classification targets; for regression, splits are typically chosen to minimize squared error.
Surrogate Splits - Controls how missing values are handled by the tree. Surrogates save information about secondary splits that are used when data is missing at a node.
Maximum No. of Model Terms - The maximum number of terms used during the variable selection process. Larger values may increase processing time, while smaller values may miss important variables.
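For readers familiar with open-source tooling, the first several parameters above have close analogues in scikit-learn's DecisionTreeClassifier. The mapping below is illustrative only: Lityx IQ's implementation and defaults may differ, and scikit-learn offers no surrogate-split or model-term-limit options.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(
    min_samples_split=20,   # ~ Minimum Observations Needed to Split
    min_samples_leaf=10,    # ~ Minimum Observations in a Child Node
    max_depth=4,            # ~ Maximum Tree Depth
    criterion="gini",       # ~ Splitting Criterion
)
tree.fit(X, y)
print(tree.get_depth(), tree.get_n_leaves())
```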
CHAID
Minimum Observations Needed to Split - Same as for CART
Minimum Observations in a Child Node - Same as for CART
Maximum Tree Depth - Same as for CART
Maximum P-Value Allowed to Make Split - The largest p-value allowed to make a split at a node. If no predictors have a p-value smaller than this setting, no split is made at the node. The larger you set the value, the larger the tree may get.
Maximum No. of Model Terms - Same as for CART
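Putting the CHAID stopping parameters together, the sketch below shows how they might gate a candidate split at a single node. It reuses the split, resp, and split_p_value names from the earlier CHAID sketch and is illustrative only, not the Lityx IQ implementation.

```python
from collections import Counter

def should_split(split_labels, response,
                 min_obs_split, min_obs_child, max_p_value):
    """True if this node's candidate split passes all three stopping rules."""
    if len(response) < min_obs_split:
        return False                      # node too small to split
    child_sizes = Counter(split_labels).values()
    if any(n < min_obs_child for n in child_sizes):
        return False                      # a child node would be too small
    # chi-square independence test from the earlier sketch
    return split_p_value(split_labels, response) <= max_p_value

print(should_split(split, resp, min_obs_split=10,
                   min_obs_child=5, max_p_value=0.05))   # True
```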