Concepts

Algorithm Overview - Naive Bayes

Algorithm Description Naïve Bayes is a probabilistic classification algorithm based on Bayes' Theorem. Bayes' Theorem lets us find the probability of an event given that another event has occurred. With this algorithm, we must assume that each feature makes an independent and equal contribution to the analysis. In other words, no pair of features is dependent, and each feature contributes equally to the classification. Using Bayes' Theorem, th...
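The independence assumption described above can be illustrated with a minimal sketch. This is not LityxIQ's implementation; the function names, the `n_values` parameter (number of distinct values per feature), the Laplace smoothing term `alpha`, and the toy weather data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> count
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
    return class_counts, value_counts

def predict_proba(row, class_counts, value_counts, n_values, alpha=1.0):
    """Posterior P(class | row), treating features as independent given the class."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            # Laplace-smoothed likelihood P(value | class)
            seen = value_counts[label][i][value]
            score *= (seen + alpha) / (count + alpha * n_values[i])
        scores[label] = score
    norm = sum(scores.values())  # the evidence term in Bayes' Theorem
    return {label: s / norm for label, s in scores.items()}

# Made-up toy data: (outlook, temperature) -> did the customer respond?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "cool")]
labels = ["no", "no", "yes", "yes", "yes"]
class_counts, value_counts = train_naive_bayes(rows, labels)
posterior = predict_proba(("rainy", "cool"), class_counts, value_counts,
                          n_values=[2, 3])
```

Each feature contributes one multiplicative likelihood term, which is exactly where the "independent contribution" assumption enters the calculation.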

Algorithm Overview - Neural Networks

Algorithm Description The terms “Neural Net”, “Deep Learning”, “Deep Neural Net”, and other similar terms describe a classical neural network algorithm with at least one hidden layer. The name “Neural Network” came about because the algorithm is loosely patterned after the human brain, which is an interconnected collection (network) of neurons. In the brain, a neuron receives an electrical signal and ‘decides’ how much of the signal to pass on to its neighboring neurons. There is a chemic...

Algorithm Overview - CART/CHAID

Algorithm Description CART and CHAID are both Decision Tree machine learning algorithms. Their objective is to find quantitative splits (segments) of the dataset that do a good job of differentiating the dataset with respect to the target variable. These segments are created by iteratively splitting the dataset based on key values of the most important predictor variables. Most decision tree algorithms differ with respect to how they determine the most important predictors, and the key v...
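As a rough illustration of how a single split value might be chosen, the sketch below searches one numeric predictor for the threshold that minimizes weighted Gini impurity, the criterion commonly associated with CART for classification. CHAID relies on chi-square tests instead, which this sketch does not cover. The function names and the toy data are assumptions, not part of LityxIQ.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best 'value <= threshold' split."""
    best_threshold, best_score = None, float("inf")
    n = len(values)
    # Skip the largest value so the right-hand segment is never empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Made-up example: customer age versus a binary response flag.
ages = [22, 25, 31, 38, 45, 52]
responded = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(ages, responded)
```

A full tree would repeat this search across every predictor, keep the best one, and then recurse into each resulting segment.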

Algorithm Overview - XGBoost

Algorithm Description Extreme Gradient Boosting (XGBoost) is a decision-tree-based machine learning algorithm used for classification and regression problems. The Gradient Boosting algorithm builds decision trees sequentially (instead of in parallel and independently, like Random Forest) such that each subsequent tree aims to reduce the errors of the previous tree. Each tree learns from its predecessor and updates the residual errors. Hence, the tree that grows next in the sequence is learni...
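The sequential, residual-fitting idea can be sketched with plain gradient boosting on squared error, using one-split trees (stumps) on a single predictor. This is a simplified stand-in, not XGBoost itself, which adds regularization and other refinements. The function names, the `learning_rate` and `n_trees` values, and the data are assumptions chosen for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree to the residuals by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_trees=20, learning_rate=0.3):
    """Build trees one after another, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean of the target
    trees = []
    predictions = [base] * len(xs)
    history = []  # training mean squared error after each tree
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x)
                       for p, x in zip(predictions, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))

    def model(x):
        return base + learning_rate * sum(t(x) for t in trees)

    return model, history

# Made-up data with three rough plateaus.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]
model, history = gradient_boost(xs, ys)
```

The `history` list shows the training error shrinking as trees are added, because each new tree targets whatever the previous trees left unexplained.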

Algorithm Overview - Linear and Logistic Regression

Algorithm Description Linear and Logistic Regression are supervised machine learning techniques that investigate the relationship between a dependent variable (target) and independent variable(s) (predictors). A Linear Regression model focuses on predicting a continuous target, while a Logistic Regression model aims to predict a binary target (e.g. 1/0, True/False, Yes/No). Both techniques can have continuous or discrete predictors. Linear Regression Linear Regression establishes ...
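The difference between the two targets can be shown in a short sketch: both models start from the same weighted sum of predictors, but logistic regression passes that sum through the logistic (sigmoid) function so the output can be read as a probability of the positive class. The function names and coefficient values below are made up for illustration; real coefficients are estimated from data.

```python
import math

def linear_prediction(features, coefficients, intercept):
    """Continuous output: intercept plus a weighted sum of the predictors."""
    return intercept + sum(c * x for c, x in zip(coefficients, features))

def logistic_probability(features, coefficients, intercept):
    """Probability of the positive class: the linear score squashed into (0, 1)."""
    z = linear_prediction(features, coefficients, intercept)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, not fitted to any real dataset.
coefficients = [0.8, -0.5]
intercept = 0.2
customer = [1.5, 2.0]

score = linear_prediction(customer, coefficients, intercept)
probability = logistic_probability(customer, coefficients, intercept)
predicted_class = 1 if probability >= 0.5 else 0
```

The 0.5 cutoff used for `predicted_class` is only one possible choice; in practice the threshold depends on the business objective.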

Algorithm Overview - Random Forest

Algorithm Description The Random Forest algorithm is a supervised machine learning technique that uses many individual decision trees to form a “forest” or ensemble. Random Forests are trained using a method called ‘bagging’, which uses randomly sampled subsets of the data to train each decision tree. This method helps reduce variance in the model. The ‘bagging’ method is also applied to the feature space, where only a random subset of features is considered at each split in each decision tre...
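The two sources of randomness described above, sampling rows for each tree and sampling features at each split, can be sketched as follows. The tree-fitting step is omitted, and the helper names, seed, and toy rows are assumptions made for illustration rather than LityxIQ internals.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement to train one tree."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at one split."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """Combine the individual trees' class predictions into one answer."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = [(5.1, 3.5, "a"), (4.9, 3.0, "a"), (6.2, 2.9, "b"), (5.9, 3.0, "b")]

sample = bootstrap_sample(rows, rng)          # data for one tree
features = random_feature_subset(10, 3, rng)  # candidates for one split
vote = majority_vote(["a", "b", "a"])         # ensemble prediction
```

Because every tree sees a different sample and different candidate features, the trees make partly independent errors, and combining their votes is what lowers the variance.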

Introduction to Predictive Modeling

Intro to Predictive Modeling pdf

LityxIQ Machine Learning Modeling Approach

LityxIQ makes modeling easier and more accessible to a wider audience while also making the modeling process faster and the models more predictive. It utilizes best-in-class algorithms and approaches, and tightly integrates model building with model implementation for ease of deployment. Predictive models in LityxIQ are built to be successful for business applications such as acquisition, retention, value, risk modeling, supply chain, forecasting, and many others. The focus is on key business me...

What Can You Do in Predict?

The PREDICT solution, within LityxIQ, contains four main links which are available on the left side of your browser window. These are described below in more detail.
- Models - Use this link to build, manage, and execute predictive models. Here you can create, define, analyze, schedule, compare, approve, and implement many types of models. You can also manage model versioning here.
- Scoring Jobs - Use this link to create, schedule, and execute scoring jobs.
- Scoring Catalogs - Use this ...

Building Models for Novice Users

For building predictive models, Predict is a unique platform for novice or business users. Powerful and accurate models can be built, maintained, and implemented with little effort. Technical aspects of model building can be left to the platform. As the novice user practices and gains more knowledge about the modeling process, they then have the option of working with some of the more advanced options and parameters. However, even the novice user needs to have a firm grasp of some key concept...

Types of Models

In Predict, predictive models are created with a business objective in mind. The technical aspects of the model, such as statistical algorithms and options, are optional and will be discussed in a separate article. When creating a new model (see https://support.lityxiq.com/338307-Creating-a-New-Predictive-Model for more information), there are several types to choose from. The options available may differ from those shown below. - Affinity - used to predict an individual's affinity (...

Machine Learning Algorithms

LityxIQ supports a number of machine learning algorithms. Different algorithms tend to perform well for different situations, often depending on the dataset itself. Below is a list of supported algorithms, as well as the types of models supported for each, and a link to documentation for the algorithm that includes an overview of the settings available. Algorithm Continuous-value target Binary target Time-series Documentation Linear Regression x ...

Algorithm Overview - Forecasting Algorithms

Forecasting is used to predict time series data using trends and seasonality factors. Examples include sales data, web traffic data, or social media activity data. The target variable is numeric and stored in time sequence. More information can be found at http://en.wikipedia.org/wiki/Forecasting, and another good resource is the site https://otexts.com/fpp2/. LityxIQ Forecasting Algorithms Holt-Winters - a type of double exponential smoothing where exponential smoothing assigns exponential...
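The basic building block behind these smoothing methods can be sketched with simple (single) exponential smoothing, in which each smoothed value blends the newest observation with the previous smoothed value, so older observations receive exponentially smaller weights. This is not Holt-Winters, which additionally tracks trend and seasonality. The function name, the `alpha` value, and the sales figures are assumptions for illustration.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; alpha in (0, 1] controls how fast old data fades."""
    smoothed = [series[0]]  # initialize with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Made-up monthly sales figures.
sales = [10.0, 12.0, 11.0, 15.0]
smoothed = exponential_smoothing(sales, alpha=0.5)

# With no trend or seasonal terms, the one-step-ahead forecast is the last smoothed value.
next_forecast = smoothed[-1]
```

A larger `alpha` reacts faster to recent changes, while a smaller one produces a steadier but slower-moving forecast.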

Analyzing Model Scores

A question often asked is how to analyze your data with your newly developed model. This is actually very easy and involves three steps.
Step #1 Score the full dataset you just built your model on: https://support.lityxiq.com/868916-Model-Scoring
Step #2 Join the dataset you just built your model on with your model scores. Within Data Manager create New Dataset | View. Incoming data is your development dataset. Joins is your scoring catalog using the primary key selected in step #1 as yo...

C-Statistic and ROC Curves

The C-Statistic, also called the Concordance Statistic or C-Stat, is a common metric used to analyze the performance of binary classification models and to compare multiple models to one another. Specifically, the C-statistic is computed as the area under the ROC curve. The minimum value of c is 0.0 and the maximum is 1.0; a value of 0.5 corresponds to random guessing. C-values of 0.7 to 0.8 are generally taken to show acceptable discrimination, values of 0.8 to 0.9 to indicate excellent discrimination, and values of ≥0.9 to show outstanding discrimination. An RO...
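One way to see what the C-statistic measures is to compute it directly as a concordance rate: across every pairing of one actual positive with one actual negative, the fraction in which the positive received the higher score (ties count as half). This pairwise count equals the area under the ROC curve. The sketch below is a plain illustration with made-up labels and scores; the function name is an assumption and this is not how LityxIQ necessarily computes the metric.

```python
def c_statistic(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))

# Made-up actual outcomes and model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
c = c_statistic(labels, scores)
```

The nested loop is quadratic in the number of records, so it suits small illustrations; rank-based formulas give the same value far more efficiently on large datasets.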