Day26: Theano & Lasagne

Posted by csiu on March 22, 2017 | with: 100daysofcode, Machine Learning

Today is a research day.

As a segue from my MSc thesis work, I recently started a new mini-project that uses deep learning for classification. There are a number of ways to build neural networks, and because I am familiar with Python and have access to a GPU machine, I intend to build mine with Theano/Lasagne.

A few libraries for machine learning

Here is a short list of machine learning libraries, compiled from Popular Deep Learning Libraries (Jason Brownlee, 2016):

  • Caffe: Deep learning framework made with expression, speed, and modularity in mind – convolutional neural networks & image problems
  • CNTK: A unified deep-learning toolkit (by Microsoft Research)
  • Deeplearning4j: Open-Source, Distributed, Deep Learning Library for the JVM (for Java)
  • Keras: Deep Learning library for Theano and TensorFlow
  • Lasagne: Lightweight library to build and train neural networks in Theano
  • TensorFlow: An open-source software library for Machine Intelligence (by Google)
  • Theano: Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently
  • Torch: Framework with wide support for machine learning algorithms that puts GPUs first (written in Lua) – reinforcement learning

Theano - basics

Code taken from craffel@github’s Theano Tutorial.
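
The snippets below assume the standard imports, which the excerpts leave out:

import numpy as np
import theano
import theano.tensor as T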

  • Symbolic variables
foo = T.scalar('foo')
# symbolic expressions are built from symbolic variables (bar is reused below)
bar = foo**2
  • Symbolic functions
# compile code for computing values of bar given values of foo
f = theano.function([foo], bar)
f(3)  # returns array(9.0)
  • theano.tensor types include vectors, matrices, and tensors
x = T.vector('x')
subtract = T.matrix('subtract')
  • Shared variables (their values persist across functions and can be read and set)
shared_var = theano.shared(np.array([[1, 2], [3, 4]], dtype=theano.config.floatX))

shared_var.set_value(np.array([[3, 4], [2, 1]], dtype=theano.config.floatX))
shared_var.get_value()
  • Updates
# each call returns shared_var, then subtracts `subtract` from it as a side effect
function_2 = theano.function([subtract], shared_var, updates={shared_var: shared_var - subtract})
  • Gradients (combined with shared variables and updates in the sketch after this list)
# symbolic derivative of bar = foo**2 with respect to foo
bar_grad = T.grad(bar, foo)

bar_grad.eval({foo: 10})  # 2*10 = array(20.0)
  • Debugging - Theano's error messages are normally cryptic and lack line information; test values help
theano.config.compute_test_value = 'warn'
A = T.matrix('A')
B = T.matrix('B')
# Setting the tag.test_value attribute gives the variable its test value
A.tag.test_value = np.random.random((3, 4)).astype(theano.config.floatX)
B.tag.test_value = np.random.random((5, 6)).astype(theano.config.floatX)
# Now, we get an error when we compute C which points us to the correct line!
C = T.dot(A, B)

theano.config.compute_test_value = 'off'
# To surface nan errors (at the cost of speed)
num = T.scalar('num')
den = T.scalar('den')
divide = theano.function([num, den], num/den, mode='DebugMode')
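
Putting these pieces together - shared variables, updates, and gradients - here is a minimal sketch (my own toy example, not from the tutorial) of gradient descent on a quadratic:

import numpy as np
import theano
import theano.tensor as T

# toy parameter stored as a shared variable; loss = w**2 has its minimum at w = 0
w = theano.shared(np.array(5.0, dtype=theano.config.floatX))
loss = w ** 2

# each call returns the current loss, then takes one gradient-descent step via updates
step = theano.function([], loss, updates={w: w - 0.1 * T.grad(loss, w)})

for _ in range(50):
    step()

print(w.get_value())  # ~0: each step multiplies w by 0.8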

Lasagne - basics

Code taken from craffel@github’s Lasagne Tutorial.
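
The Lasagne snippets additionally assume these imports, plus data X, labels y, and N_CLASSES. The tutorial uses its own dataset; a toy stand-in (my assumption, just to make the snippets runnable) could be:

import numpy as np
import theano
import theano.tensor as T
import lasagne

# toy stand-in dataset: 100 examples, 2 features, 2 classes
N_CLASSES = 2
X = np.random.randn(100, 2).astype(theano.config.floatX)
y = np.random.randint(N_CLASSES, size=100).astype('int32')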

  • Layers
# Define input layer
l_in = lasagne.layers.InputLayer(shape=X.shape)

# Define hidden layer (arg: input layer, layer's output dim, nonlinearity)
l_hidden = lasagne.layers.DenseLayer(
    l_in, num_units=10, nonlinearity=lasagne.nonlinearities.tanh)
# Define output layer
l_output = lasagne.layers.DenseLayer(
    l_hidden, num_units=N_CLASSES, nonlinearity=lasagne.nonlinearities.softmax)

# Compute output given input
net_output = lasagne.layers.get_output(l_output)
  • Objectives - a loss function quantifies how far the network's output is from the true labels (and so tells us when the network is ready to use)
true_output = T.ivector('true_output')
loss = T.mean(lasagne.objectives.categorical_crossentropy(net_output, true_output))
  • Updates
# Get all params
all_params = lasagne.layers.get_all_params(l_output)

# Update
updates = lasagne.updates.sgd(loss, all_params, learning_rate=1)

# Train and get predictions (a full training loop is sketched below)
train = theano.function([l_in.input_var, true_output], loss, updates=updates)
get_output = theano.function([l_in.input_var], net_output)
y_predicted = np.argmax(get_output(X), axis=1)
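
Since train performs a single parameter update per call, training means calling it in a loop. A minimal sketch, assuming the toy X and y defined above:

# full-batch gradient descent: one update per call to train()
for epoch in range(100):
    current_loss = train(X, y)

# training-set accuracy after training
y_predicted = np.argmax(get_output(X), axis=1)
print('accuracy: {}'.format((y_predicted == y).mean()))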