Day97: Looking deep

Posted by csiu on June 1, 2017 | with: 100daysofcode, Kaggle, Machine Learning

I saw this tweet today and wanted to see how deep learning compares with decision trees at predicting Kaggle House Prices.

In “Don’t use deep learning your data isn’t that big”, SimplyStats (2017) says:

“When your dataset isn’t that big, doing something simpler is often both more interpretable and it works just as well due to potential overfitting.”

“For low training set sample sizes it looks like the simpler method (just picking the top 10 and using a linear model) slightly outperforms the more complicated methods. As the sample size increases, we see the more complicated method catch up and have comparable test set accuracy.”

The Jupyter Notebook for this little project is found here.

Non-deep learning

In my previous posts, I predicted house prices using various decision trees with comparable accuracy.

Approach                                     Day     Accuracy when “Margin of Error” is $5000
Regression Tree                              Day93   ~0.15
Bagging                                      Day94   ~0.15
Random Forest & Extremely Randomized Tree    Day95   ~0.25
Boosting                                     Day96   ~0.11

Note: The models have not been optimized.
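To be explicit about that metric: a minimal sketch, assuming “accuracy when Margin of Error is $5000” means the fraction of predictions that land within $5,000 of the true sale price (the helper name accuracy_within_margin is illustrative):

import numpy as np

def accuracy_within_margin(y_true, y_pred, margin=5000):
    # Fraction of predictions no more than `margin` dollars off.
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)) <= margin)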

Deep learning

Using deep learning, implemented in scikit-learn as MLPRegressor (for regression) or MLPClassifier (for classification), we find:

from sklearn.neural_network import MLPRegressor

# Multi-layer perceptron with an L2 penalty (alpha), ReLU activation,
# and the L-BFGS solver; random_state fixes the weight initialization.
mlp = MLPRegressor(alpha=0.001, max_iter=1000,
                   activation="relu", solver="lbfgs",
                   random_state=1)
mlp = mlp.fit(X_train, y_train)
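
To score the fitted model against the table above, something like the following can be used; a short sketch, assuming X_test and y_test are a held-out split alongside X_train and y_train:

import numpy as np

# Fraction of test predictions within $5,000 of the true sale price
# (X_test/y_test assumed held out, like X_train/y_train above).
y_pred = mlp.predict(X_test)
print("Accuracy within $5000:", np.mean(np.abs(y_test - y_pred) <= 5000))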

There are many parameters to tune; here we tuned four: alpha, max_iter, activation, and solver
(see the Jupyter Notebook for run settings, and the sketch after the list below).

  • Indeed, predictions from the deep and non-deep methods are comparable
  • Changing the alpha value does not change the accuracy much
  • 1000+ epochs (max_iter) seem to give steady accuracies
  • The “relu” and “identity” activations do comparably better than “tanh” and “logistic”
  • The “sgd” solver does comparably worse
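
For reference, the kind of sweep behind these observations might look like the sketch below. This is illustrative only: the parameter grids are guesses, not the notebook’s exact run settings.

from itertools import product

import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical sweep over the four tuned parameters; the grids are
# illustrative guesses, not the notebook's exact run settings.
param_grid = product([0.0001, 0.001, 0.01],                    # alpha
                     [200, 1000, 5000],                        # max_iter
                     ["relu", "identity", "tanh", "logistic"], # activation
                     ["lbfgs", "sgd", "adam"])                 # solver

for alpha, max_iter, activation, solver in param_grid:
    mlp = MLPRegressor(alpha=alpha, max_iter=max_iter,
                       activation=activation, solver=solver,
                       random_state=1).fit(X_train, y_train)
    accuracy = np.mean(np.abs(y_test - mlp.predict(X_test)) <= 5000)
    print(alpha, max_iter, activation, solver, accuracy)

Note that with solver="sgd" the result also depends on the learning-rate settings, which may partly explain its weaker showing here.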