Today is a research day.
As a segue from my MSc thesis work, I recently started a new mini-project that uses deep learning for classification. There are a number of ways to build neural networks, and because I am familiar with Python and have access to a GPU machine, I intend to use Theano/Lasagne for building networks.
A couple of libraries for machine learning
Here is a short list of machine learning libraries I’ve compiled from Popular Deep Learning Libraries (Jason Brownlee, 2016):
- Caffe: Deep learning framework made with expression, speed, and modularity in mind – convolutional neural networks & image problems
- CNTK: A unified deep-learning toolkit (by Microsoft Research)
- Deeplearning4j: Open-Source, Distributed, Deep Learning Library for the JVM (for Java)
- Keras: Deep Learning library for Theano and TensorFlow
- Lasagne: Lightweight library to build and train neural networks in Theano
- TensorFlow: An open-source software library for Machine Intelligence (by Google)
- Theano: Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently
- Torch: Framework with wide support for machine learning algorithms that puts GPUs first (written in Lua) – reinforcement learning
Theano - basics
Code taken from craffel@github’s Theano Tutorial.
- Symbolic variables
import numpy as np
import theano
import theano.tensor as T
foo = T.scalar('foo')
- Symbolic functions
# bar is a symbolic expression built from foo
bar = foo ** 2
# compile code for computing values of bar given values of foo
f = theano.function([foo], bar)
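A quick sanity check of the compiled function (the input value 3 is an arbitrary choice of mine, not from the tutorial):
print(f(3))  # -> array(9.0), i.e. bar = foo**2 evaluated at foo = 3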
- theano.tensor types include vectors, matrices, and tensors
x = T.vector('x')  # 1-D symbolic array
subtract = T.matrix('subtract')  # 2-D symbolic array
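These symbolic types compose much like NumPy arrays. A minimal sketch of a compiled matrix-vector product (the names W, out, and dot_fn are mine, not the tutorial's):
W = T.matrix('W')
out = T.dot(W, x)
dot_fn = theano.function([W, x], out)
dot_fn(np.array([[1., 2.], [3., 4.]], dtype=theano.config.floatX),
       np.array([1., 1.], dtype=theano.config.floatX))  # -> array([3., 7.])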
- Shared variables (their values persist across functions and can be read and set)
shared_var = theano.shared(np.array([[1, 2], [3, 4]], dtype=theano.config.floatX))
shared_var.set_value(np.array([[3, 4], [2, 1]], dtype=theano.config.floatX))
shared_var.get_value()
- Updates (changes applied to shared variables after each call of a compiled function)
# subtract the function's argument from shared_var on every call
function_2 = theano.function([subtract], shared_var, updates={shared_var: shared_var - subtract})
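Each call returns shared_var's previous value and then applies the update; a quick check using the values set above:
function_2(np.array([[1, 1], [1, 1]], dtype=theano.config.floatX))  # -> array([[3., 4.], [2., 1.]])
shared_var.get_value()  # -> array([[2., 3.], [1., 0.]])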
- Gradients
# symbolic derivative of bar (= foo**2) with respect to foo
bar_grad = T.grad(bar, foo)
bar_grad.eval({foo: 10})  # -> array(20.0)
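Gradients plus shared-variable updates are all that gradient descent needs; a minimal sketch of one descent step on a toy cost (w, cost, step, and the 0.1 learning rate are my own choices, not the tutorial's):
# one gradient-descent step on cost = w**2
w = theano.shared(np.array(5.0, dtype=theano.config.floatX))
cost = w ** 2
step = theano.function([], cost, updates={w: w - 0.1 * T.grad(cost, w)})
step()          # -> array(25.0), then w is updated
w.get_value()   # -> array(4.0), since w - 0.1 * 2w = 0.8 * w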
- Debugging - error messages are normally cryptic and carry no line information; test values make them point at the offending line
theano.config.compute_test_value = 'warn'
A = T.matrix('A')
B = T.matrix('B')
# Setting the tag.test_value attribute gives the variable its test value
A.tag.test_value = np.random.random((3, 4)).astype(theano.config.floatX)
B.tag.test_value = np.random.random((5, 6)).astype(theano.config.floatX)
# Now, we get an error when we compute C which points us to the correct line!
# (the (3, 4) and (5, 6) test shapes are deliberately incompatible for a dot product)
C = T.dot(A, B)
theano.config.compute_test_value = 'off'
# To catch nan errors as they occur (at the cost of speed)
num = T.scalar('num')
den = T.scalar('den')
divide = theano.function([num, den], num / den, mode='DebugMode')
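With DebugMode on, an invalid intermediate value raises an error as soon as it is produced rather than propagating silently:
divide(10., 5.)  # -> array(2.0)
divide(0., 0.)   # 0/0 yields nan, which DebugMode reports as an error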
Lasagne - basics
Code taken from craffel@github’s Lasagne Tutorial.
- Layers
import lasagne
# X is the (n_examples, n_features) training data matrix; N_CLASSES is the number of classes
# Define input layer
l_in = lasagne.layers.InputLayer(shape=X.shape)
# Define hidden layer (args: incoming layer, layer's output dim, nonlinearity)
l_hidden = lasagne.layers.DenseLayer(
    l_in, num_units=10, nonlinearity=lasagne.nonlinearities.tanh)
# Define output layer
l_output = lasagne.layers.DenseLayer(
    l_hidden, num_units=N_CLASSES, nonlinearity=lasagne.nonlinearities.softmax)
# Compute the network's symbolic output given the input
net_output = lasagne.layers.get_output(l_output)
- Objectives - a loss function quantifying how far the network's predictions are from the true labels
true_output = T.ivector('true_output')
loss = T.mean(lasagne.objectives.categorical_crossentropy(net_output, true_output))
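Categorical cross-entropy is the mean negative log-probability the network assigns to each true class; a plain NumPy check of the same quantity (the probabilities p and labels t are made up for illustration):
p = np.array([[0.9, 0.1], [0.2, 0.8]])  # predicted class probabilities
t = np.array([0, 1])                    # true class labels
-np.mean(np.log(p[np.arange(len(t)), t]))  # -> ~0.164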
- Updates
# Gather all trainable parameters in the network
all_params = lasagne.layers.get_all_params(l_output)
# SGD update rules computed from the loss and the parameters
updates = lasagne.updates.sgd(loss, all_params, learning_rate=1)
# Compile functions to train the network and to compute its output
train = theano.function([l_in.input_var, true_output], loss, updates=updates)
get_output = theano.function([l_in.input_var], net_output)
y_predicted = np.argmax(get_output(X), axis=1)
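Fitting the model then amounts to calling train repeatedly; a minimal sketch (y as an int32 vector of true labels for X, and the epoch count of 10, are my own assumptions):
for epoch in range(10):
    current_loss = train(X, y)
# recompute predictions after training
y_predicted = np.argmax(get_output(X), axis=1)
accuracy = np.mean(y_predicted == y)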