Performance Analysis: Variable Importance

What is Variable Importance?

It is the relative importance of each variable in your model’s prediction. The most impactful variable gets a value of 100, and every other variable is scored relative to it. If the second variable has a value of 88, then it is 88% as important as the top variable, or equivalently 12% less important.
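The 0 to 100 scaling can be sketched in a few lines of Python. This is only an illustration: the function name `scale_importance`, the variable names, and the raw scores are made up for the example, and the way raw scores are produced depends on the algorithm.

```python
def scale_importance(raw_scores):
    """Rescale raw importance scores so the top variable gets 100.

    raw_scores: dict mapping variable name -> non-negative raw score
    (hypothetical input; the raw scores themselves come from the model).
    """
    top = max(raw_scores.values())
    if top <= 0:
        raise ValueError("at least one variable needs a positive score")
    scaled = {name: 100.0 * score / top for name, score in raw_scores.items()}
    # Sort from most to least important for display.
    return dict(sorted(scaled.items(), key=lambda item: item[1], reverse=True))


# Hypothetical raw scores for three variables.
raw = {"income": 44.0, "age": 50.0, "tenure": 10.0}
scaled = scale_importance(raw)

for name, value in scaled.items():
    print(f"{name}: {value:.0f}")
```

Here "age" is the top variable and gets 100, while "income" comes out at 88, meaning it is 88% as important as "age".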

How is Variable Importance computed?

It depends on the algorithm, and generally follows from how that algorithm evaluates its variables. For example, random forest looks roughly at the percentage of trees a variable appears in, along with how important the variable is within each tree. CART looks at Gini or similar score values. Linear models rely mostly on univariate p-values, which don’t always tell the full story as well as the other methods because they don’t account for the importance of interaction terms.
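To give a feel for the kind of Gini score a tree-based method such as CART can use, here is a minimal Python sketch of the Gini impurity decrease produced by a single split. It is an assumption about one common approach, not a description of any particular product's implementation; the function names and the toy labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def gini_decrease(labels, goes_left):
    """Impurity decrease from splitting `labels` into two child nodes.

    goes_left: list of booleans, True where the row falls in the left child.
    """
    left = [y for y, is_left in zip(labels, goes_left) if is_left]
    right = [y for y, is_left in zip(labels, goes_left) if not is_left]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


# Toy data: six rows, two classes.
labels = [0, 0, 0, 1, 1, 1]

# A split on a useful variable separates the classes perfectly.
useful_split = gini_decrease(labels, [True, True, True, False, False, False])

# A split on a weak variable barely separates them.
weak_split = gini_decrease(labels, [True, False, True, False, True, False])

print(useful_split, weak_split)
```

A variable whose splits produce larger decreases would end up with a higher raw score, which is then rescaled so that the top variable reads 100.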