Note: The data used in today’s post is the MLP parameter weights extracted in yesterday’s post.
The Jupyter Notebook for this little project can be found here.
Inspecting the ANN model parameters
After loading the model parameters, we first look at (1) how many parameter arrays there are and (2) what the structure of these parameters is.
Number of params: 4
-------------------
(784, 100) # Weights for input to hidden layer
(100,) # Bias weights to hidden layer
(100, 10) # Weights for hidden to output layer
(10,) # Bias weights to output layer
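Below is a minimal sketch of this inspection step, assuming yesterday’s weights were pickled as a list of NumPy arrays; the file name and the ordering of the arrays are assumptions, not the exact code from the original notebook.

```python
import pickle

# Hypothetical file: assumes the extracted weights were saved as a list of
# NumPy arrays [W_input_hidden, b_hidden, W_hidden_output, b_output].
with open("mlp_weights.pkl", "rb") as f:
    params = pickle.load(f)

print(f"Number of params: {len(params)}")
print("-------------------")
for p in params:
    print(p.shape)
```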
Visualizing parameter weights between layers
We next plotted the parameter weights of each layer. Unsurprisingly, the weights are widely scattered, with no obvious pattern in either layer.
TOP: weights of the input-to-hidden layer; BOTTOM: weights of the hidden-to-output layer.
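One way to reproduce a figure like the one above, assuming `params` from the earlier cell holds the arrays in the order shown (the original plot may have been drawn differently):

```python
import matplotlib.pyplot as plt

W_hidden, W_output = params[0], params[2]   # (784, 100) and (100, 10)

# Histogram of the raw weight values for each layer.
fig, axes = plt.subplots(2, 1, figsize=(8, 6))
axes[0].hist(W_hidden.ravel(), bins=100)
axes[0].set_title("Input-to-hidden weights")
axes[1].hist(W_output.ravel(), bins=100)
axes[1].set_title("Hidden-to-output weights")
plt.tight_layout()
plt.show()
```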
Visualizing the parameter weights across input features
When we visualize the input-to-hidden parameter weights grouped by input feature, we find that some input features have noticeably more variable weights than others.
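A sketch of one way to look at this per-feature variability, again assuming `params` from the earlier cell; the spread statistic and plot style are my choices, not necessarily the original’s:

```python
import matplotlib.pyplot as plt

W_hidden = params[0]                      # (784, 100)

# Each row holds the 100 outgoing weights of one input feature;
# the standard deviation of a row measures how variable that feature's weights are.
per_feature_std = W_hidden.std(axis=1)    # shape (784,)

plt.figure(figsize=(8, 3))
plt.plot(per_feature_std)
plt.xlabel("input feature index")
plt.ylabel("std of outgoing weights")
plt.show()
```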
Garson’s algorithm to calculate variable importance
Using Garson’s algorithm to calculate the feature importance for each of the different output classes, we get the following figure…
… which shows that the feature importance is identical across the output classes. When we consider the math, this property makes sense:
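In the per-hidden-node formulation of Garson’s algorithm that is commonly used (the notation below is mine), the contribution of input $i$ to output class $k$ through hidden node $j$ is $|w_{ij}|\,|v_{jk}|$, which is then normalized across inputs within that hidden node, so the output-layer weight cancels:

$$
r_{ijk} = \frac{|w_{ij}|\,|v_{jk}|}{\sum_{i'} |w_{i'j}|\,|v_{jk}|} = \frac{|w_{ij}|}{\sum_{i'} |w_{i'j}|}.
$$

Because $r_{ijk}$ no longer depends on $k$, summing over hidden nodes and normalizing across inputs yields the same relative importance for every output class.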
Getting the important features
Finally, we sort the features by their relative importance; the ten most important features are listed below, and a short sketch of the computation follows the table.
feature | relative_importance
---|---
69 | 0.001629
79 | 0.001550
502 | 0.001537
135 | 0.001526
530 | 0.001520
248 | 0.001511
417 | 0.001510
678 | 0.001506
401 | 0.001504
734 | 0.001495
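Here is a minimal sketch of how relative importances like those above could be computed and sorted, using the per-hidden-node formulation shown earlier; the variable names are mine, and the original notebook may implement Garson’s algorithm differently.

```python
import numpy as np
import pandas as pd

W_hidden = params[0]   # (784, 100) input-to-hidden weights from the earlier cell

# Normalize absolute weights within each hidden node; the output-layer weights
# cancel in this step, so the importance is identical for every output class.
contrib = np.abs(W_hidden) / np.abs(W_hidden).sum(axis=0, keepdims=True)  # (784, 100)
importance = contrib.sum(axis=1)
relative_importance = importance / importance.sum()

top_features = (pd.DataFrame({"relative_importance": relative_importance})
                  .sort_values("relative_importance", ascending=False)
                  .head(10))
print(top_features)
```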