Day88: Ridge Regression

Posted by csiu on May 23, 2017 | with: 100daysofcode, Machine Learning, Kaggle

I ended up thinking about ridge regression today and realized how foggy my recollection of it is. Therefore, in today’s post I’ll take a closer look at ridge regression.

Workflow

  1. Load data
  2. Train simple ridge model
  3. Evaluate model

The Jupyter Notebook for this little project is found here.

Data

The data used is the Digit Recognizer MNIST data set available on Kaggle. The labeled data (from the train file) was split 80/20 into training and validation sets.

Features: 784
Outputs: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Number of Train samples:     33600
Number of Validation samples: 8400
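
As a rough sketch of the loading step (assuming the Kaggle train.csv has been downloaded to the working directory; the file name and random seed here are my own choices, not from the original notebook):

import pandas as pd
from sklearn.model_selection import train_test_split

# Load the Kaggle Digit Recognizer training file:
# the "label" column holds the digit, the remaining 784 columns hold pixel values
data = pd.read_csv("train.csv")
X = data.drop(columns="label").values
y = data["label"].values

# Hold out 20% of the labeled data for validation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

print(X_train.shape)  # (33600, 784)
print(X_test.shape)   # (8400, 784)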

Ridge regression

In layman’s terms, ridge regression is “L2 regularized linear regression” (Jacob Jensen, 2012), meaning the weights are penalized for growing too large.

From what I understand, ridge regression uses an L2 penalty, which shrinks feature weights toward zero but does not set them exactly to zero (whereas an L1 penalty, as in lasso, can zero some weights out entirely).
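
Concretely, ridge regression minimizes the usual least-squares loss plus a squared-L2 penalty on the weights (written here in scikit-learn’s convention, where alpha controls the penalty strength):

\min_{w} \; \lVert Xw - y \rVert_2^2 + \alpha \lVert w \rVert_2^2

Lasso swaps the squared-L2 penalty for \alpha \lVert w \rVert_1, which is what can force some weights exactly to zero.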

from sklearn.linear_model import RidgeClassifier

# Fit a ridge classifier; alpha is the L2 regularization strength
clf = RidgeClassifier(alpha=0.1)
clf.fit(X_train, y_train)

# Predict labels for the held-out samples
y_pred = clf.predict(X_test)

# Evaluate by computing the proportion of correct predictions
accuracy = clf.score(X_test, y_test)
print(accuracy)

Training a simple ridge classifier on the MNIST data, we got a prediction accuracy of 0.8550 (i.e. 85.50%).
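
Since alpha is a hyperparameter, a natural follow-up would be to pick it by cross-validation rather than fixing it at 0.1. A minimal sketch using scikit-learn’s RidgeClassifierCV (the alpha grid is my own choice):

import numpy as np
from sklearn.linear_model import RidgeClassifierCV

# Search a small grid of alphas; the default cv=None uses an
# efficient leave-one-out style cross-validation
clf_cv = RidgeClassifierCV(alphas=np.logspace(-2, 2, 5))
clf_cv.fit(X_train, y_train)

print(clf_cv.alpha_)                  # best alpha found
print(clf_cv.score(X_test, y_test))  # validation accuracy at that alpha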