The Data & Variables tab is used to specify the dataset and variables that will form the basis for your predictive model. The steps to use this tab are described below.
Dataset - Select the dataset you wish to use to build the model. The drop-down box will show you a list of all datasets to which you have access, organized by dataset library. The icon allows you to browse the selected dataset (see Step 2 of https://support.lityxiq.com/352376-Browse-a-Dataset for how to use the dataset browsing window).
Select the target field for the model. This is the variable in the dataset for which you wish to make predictions. For example, for a response model, you would choose the variable that indicates whether each customer responded to a prior campaign. Note that the label for this entry ("Response Indicator Field" in the image above) will be different depending on the model type. Also note that this drop-down lists only the variables that make sense for the type of model you are building.
Churn Field/Churn Field Value - These selections are available only for certain model types, and their labels vary by model type. Select the variable to be used as the target for this model. For classification models, the "Value" option will also appear. In this case, use it to select the value of the target variable that is the main value of interest. For example, a response indicator variable in your dataset likely contains the value "1" or "Y" for responders and "0" or "N" for non-responders. To have the response model predict the likelihood of response (the "1's"), select the value "1". Generally, this drop-down will show all of the different values available for the selected target variable.
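To illustrate what choosing a "Value" means for a classification target, here is a minimal Python sketch. It is not LityxIQ code. The function name `to_binary_target` and the sample data are hypothetical, and the sketch assumes that the selected value is simply treated as the positive class while every other value is treated as the negative class.

```python
def to_binary_target(values, value_of_interest):
    """Return 1 where the target equals the value of interest, else 0.

    Hypothetical helper for illustration only; it assumes every value
    other than the selected one counts as the negative class.
    """
    return [1 if v == value_of_interest else 0 for v in values]


# Sample response indicator column using "Y"/"N" coding (made-up data).
responses = ["Y", "N", "N", "Y", "N"]

# Selecting "Y" as the value of interest makes responders the positive class.
target = to_binary_target(responses, "Y")

# Share of records in the positive class.
response_rate = sum(target) / len(target)
```

Selecting "N" instead would make the model predict the likelihood of non-response, which is why the choice of value matters.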
Predictor Variables - Select the variables to be used as candidate predictor variables for the model. The variables selected here may or may not become part of the final model that gets constructed. The list will show all variables in the dataset. Put a check next to the variables to be used. LityxIQ's machine learning algorithms will apply automated variable reduction techniques to identify only the most important of these variables for predicting the target variable.
Forced Predictor Variables - For certain models and algorithms, you can select variables to be "forced" into the model. This means that they will be a part of the final constructed model regardless of any other statistical or algorithmic considerations. This selection will not always be available, and it will only apply to certain algorithms.
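The difference between candidate and forced predictors can be sketched in Python as follows. This is an illustration only: the variable reduction technique LityxIQ actually uses is not described here, so the sketch stands in a simple "keep the top-scoring candidates" rule. The function `select_predictors`, the importance scores, and the variable names are all hypothetical.

```python
def select_predictors(scores, forced, keep_top):
    """Combine forced variables with the best-scoring candidates.

    scores   : dict mapping candidate variable name -> importance score
               (hypothetical scores; a stand-in for real variable reduction)
    forced   : list of variable names that must be in the final model
    keep_top : number of candidates retained by the stand-in reduction step
    """
    # Rank candidates from most to least important.
    ranked = sorted(scores, key=scores.get, reverse=True)
    kept = ranked[:keep_top]
    # Forced variables are included regardless of their score.
    return list(forced) + [name for name in kept if name not in forced]


# Made-up importance scores for four candidate predictors.
candidate_scores = {"age": 0.42, "income": 0.31, "region": 0.05, "tenure": 0.27}

# "region" scores lowest but is forced, so it still appears in the result.
final_vars = select_predictors(candidate_scores, forced=["region"], keep_top=2)
```

In this sketch "tenure" is a candidate that does not survive reduction, while "region" is retained only because it was forced.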
If you are building a forecasting model, you will see additional settings:
- Multivariate TS Predictors - If you are using the VAR (vector autoregression) algorithm, use this drop-down to select additional correlated forecast variables.
- Data Ordering Variable - Select the variable by which the dataset is ordered, so that the correct sequence of the forecast variable is clear.
- Period of First Observation - Set the period which is represented by the first observation in the time series. Generally, this will be a year value, but it can be any integer value.
- Subperiod of First Observation - Set the subperiod from which the first observation was taken. For example, if you have monthly data and the first observation was collected in March 1979, the subperiod would be 3.
- Series Frequency - Set the total number of subperiods per period. For example, for monthly data use 12, and for weekly data use 52.
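The three time-series settings above together determine which period and subperiod each observation belongs to. The following Python sketch shows that arithmetic. It is an illustration, not LityxIQ's implementation: the function name `label_observations` is hypothetical, and it assumes the rows are already sorted by the data ordering variable with no gaps in the series.

```python
def label_observations(n_obs, first_period, first_subperiod, frequency):
    """Return a (period, subperiod) pair for each observation in order.

    Assumes the data is already sorted and has no missing subperiods.
    """
    if not 1 <= first_subperiod <= frequency:
        raise ValueError("first_subperiod must be between 1 and frequency")

    # Zero-based position of the first observation within its period.
    offset = first_subperiod - 1
    labels = []
    for i in range(n_obs):
        position = offset + i
        period = first_period + position // frequency   # rolls over each full period
        subperiod = position % frequency + 1             # back to 1-based subperiod
        labels.append((period, subperiod))
    return labels


# Monthly data (frequency 12) whose first observation is March 1979.
labels = label_observations(n_obs=12, first_period=1979,
                            first_subperiod=3, frequency=12)
```

With these settings the first observation is labeled (1979, 3), the tenth is (1979, 12), and the eleventh rolls over to (1980, 1).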