The Selection & Transformation tab appears when editing the settings for a modeling algorithm. The options available on the tab depend on the algorithm you are editing. Each option is explained below. Many of the options have related advanced settings, which can be found in the Advanced Settings tab.
- Transform Dependent Variable - Select the transformation to apply to the dependent variable. The default is no transformation. You can select multiple transformations, in which case each will be tried. This is only available for numeric prediction models.
- Bin Categorical Predictors - If this option is turned on, Predict will search for optimal ways to bin categorical predictors (i.e., combine their categories). Turn it on by checking Yes, or turn it off by checking No. You can check both Yes and No, in which case Predict will run it both ways as separate iterations so that you can compare the results.
- Bin Numeric Predictors - If this option is turned on, Predict will search for optimal ways to bin numeric predictors (i.e., group together ranges of values). Turn it on by checking Yes, or turn it off by checking No. You can check both Yes and No, in which case Predict will run it both ways as separate iterations so that you can compare the results.
- Normalize Numeric Predictors - If this option is turned on, Predict will search for optimal ways to normalize numeric predictors (i.e., rescale their values). The normalization may involve a standard normal transformation, a log transformation, or other techniques deemed optimal. Turn it on by checking Yes, or turn it off by checking No. You can check both Yes and No, in which case Predict will run it both ways as separate iterations so that you can compare the results.
- Search for Interaction Terms - If set to Yes, LityxIQ will perform extensive automatic searches for interaction terms to place into the model. This is available for linear and logistic regression-based models. Other algorithms naturally search for complex interactions between variables.
- Highest Order Polynomial Degree - Set the highest polynomial order for which to search. The default is 2. If any value higher than 1 is selected, LityxIQ will automatically search for significant polynomial terms for numeric variables. This is only available for linear or logistic regression-based models.
- Autocorrelation Search - If set to Yes, Predict will search for pairs of predictor variables that are strongly correlated with each other. When it finds such a pair, it will remove one of the two variables from further consideration in the modeling process. Setting this option to No turns off the autocorrelation search and allows all variables chosen as candidate predictors to proceed through the later modeling steps. Generally, it is a good idea to leave the autocorrelation search on, as it makes the overall process more efficient and sacrifices very little model performance.
- Variable Search Direction - For certain iterative algorithms, this setting determines how variable selection procedures will be applied. The Forward setting (default) will continue to add variables to the model until certain internal criteria are met. The Forward and Backward setting will do the same, but will also allow variables to be removed from the model later if removal is deemed beneficial. The Forward setting generally requires less processing time and leads to strong models. In some cases Forward and Backward may provide a better result, but it will often lead to longer model runs.
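
To make the Autocorrelation Search option more concrete, the following is a minimal sketch of the general idea of dropping one variable from each strongly correlated pair. It is not the procedure Predict uses internally: the function names, the use of Pearson correlation, the 0.9 threshold, and the rule of keeping whichever variable was seen first are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Return the names of the predictors that survive the filter.

    A predictor is kept only if its absolute correlation with every
    previously kept predictor is below `threshold` (an assumed cutoff).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical candidate predictors: the second is nearly a copy of the first.
columns = {
    "income": [30, 45, 60, 75, 90],
    "income_rounded": [31, 44, 61, 74, 91],
    "age": [52, 23, 41, 35, 29],
}
print(drop_correlated(columns))
```

In this toy data, `income_rounded` carries almost the same information as `income`, so only one of the two moves on to later modeling steps, while `age` is kept because it is only moderately correlated with `income`. A real implementation might instead keep the member of each pair that relates more strongly to the dependent variable.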