Day26: Theano & Lasagne

Posted by csiu on March 22, 2017 | with: 100daysofcode, Machine Learning

Today is a research day.

As a segue from my MSc thesis work, I recently started a new mini-project that uses deep learning for classification. There are a number of ways to build neural networks, and because I am familiar with Python and have access to a GPU machine, I intend to build mine with Theano/Lasagne.

A few libraries for machine learning

Here is a short list of machine learning libraries, compiled from Popular Deep Learning Libraries (Jason Brownlee, 2016):

  • Caffe: Deep learning framework made with expression, speed, and modularity in mind – convolutional neural networks & image problems
  • CNTK: A unified deep-learning toolkit (by Microsoft Research)
  • Deeplearning4j: Open-Source, Distributed, Deep Learning Library for the JVM (for Java)
  • Keras: Deep Learning library for Theano and TensorFlow
  • Lasagne: Lightweight library to build and train neural networks in Theano
  • TensorFlow: An open-source software library for Machine Intelligence (by Google)
  • Theano: Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently
  • Torch: Framework with wide support for machine learning algorithms that puts GPUs first (written in Lua) – reinforcement learning

Theano - basics

Code taken from craffel@github’s Theano Tutorial.
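
The snippets below assume the standard imports, which the excerpts leave out:

import numpy as np
import theano
import theano.tensor as T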

  • Symbolic variables
foo = T.scalar('foo')
# symbolic expressions are built from symbolic variables (bar is reused below)
bar = foo**2
  • Symbolic functions
# compile code for computing values of bar given values of foo
f = theano.function([foo], bar)
f(3)  # returns array(9.0)
  • theano.tensor types include vectors, matrices, and tensors
x = T.vector('x')
subtract = T.matrix('subtract')
  • Shared variables (their values persist across functions and can be read and set)
shared_var = theano.shared(np.array([[1, 2], [3, 4]], dtype=theano.config.floatX))

shared_var.set_value(np.array([[3, 4], [2, 1]], dtype=theano.config.floatX))
shared_var.get_value()
  • Updates
# each call returns shared_var, then subtracts `subtract` from it as a side effect
function_2 = theano.function([subtract], shared_var, updates={shared_var: shared_var - subtract})
  • Gradients (combined with shared variables and updates in the sketch after this list)
# symbolic derivative of bar = foo**2 with respect to foo
bar_grad = T.grad(bar, foo)

bar_grad.eval({foo: 10})  # 2*10 = array(20.0)
  • Debugging - Theano's error messages are normally cryptic and lack line information; test values help
theano.config.compute_test_value = 'warn'
A = T.matrix('A')
B = T.matrix('B')
# Setting the tag.test_value attribute gives the variable its test value
A.tag.test_value = np.random.random((3, 4)).astype(theano.config.floatX)
B.tag.test_value = np.random.random((5, 6)).astype(theano.config.floatX)
# Now, we get an error when we compute C which points us to the correct line!
C = T.dot(A, B)

theano.config.compute_test_value = 'off'
# To surface nan errors (at the cost of speed)
num = T.scalar('num')
den = T.scalar('den')
divide = theano.function([num, den], num/den, mode='DebugMode')
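
Putting these pieces together - shared variables, updates, and gradients - here is a minimal sketch (my own toy example, not from the tutorial) of gradient descent on a quadratic:

import numpy as np
import theano
import theano.tensor as T

# toy parameter stored as a shared variable; loss = w**2 has its minimum at w = 0
w = theano.shared(np.array(5.0, dtype=theano.config.floatX))
loss = w ** 2

# each call returns the current loss, then takes one gradient-descent step via updates
step = theano.function([], loss, updates={w: w - 0.1 * T.grad(loss, w)})

for _ in range(50):
    step()

print(w.get_value())  # ~0: each step multiplies w by 0.8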

Lasagne - basics

Code taken from craffel@github’s Lasagne Tutorial.
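
The Lasagne snippets additionally assume these imports, plus data X, labels y, and N_CLASSES. The tutorial uses its own dataset; a toy stand-in (my assumption, just to make the snippets runnable) could be:

import numpy as np
import theano
import theano.tensor as T
import lasagne

# toy stand-in dataset: 100 examples, 2 features, 2 classes
N_CLASSES = 2
X = np.random.randn(100, 2).astype(theano.config.floatX)
y = np.random.randint(N_CLASSES, size=100).astype('int32')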

  • Layers
# Define input layer
l_in = lasagne.layers.InputLayer(shape=X.shape)

# Define hidden layer (arg: input layer, layer's output dim, nonlinearity)
l_hidden = lasagne.layers.DenseLayer(
    l_in, num_units=10, nonlinearity=lasagne.nonlinearities.tanh)
# Define output layer
l_output = lasagne.layers.DenseLayer(
    l_hidden, num_units=N_CLASSES, nonlinearity=lasagne.nonlinearities.softmax)

# Compute output given input
net_output = lasagne.layers.get_output(l_output)
  • Objectives - a loss function quantifies how far the network's output is from the true labels (and so tells us when the network is ready to use)
true_output = T.ivector('true_output')
loss = T.mean(lasagne.objectives.categorical_crossentropy(net_output, true_output))
  • Updates
# Get all params
all_params = lasagne.layers.get_all_params(l_output)

# Update
updates = lasagne.updates.sgd(loss, all_params, learning_rate=1)

# Train and get predictions (a full training loop is sketched below)
train = theano.function([l_in.input_var, true_output], loss, updates=updates)
get_output = theano.function([l_in.input_var], net_output)
y_predicted = np.argmax(get_output(X), axis=1)
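
Since train performs a single parameter update per call, training means calling it in a loop. A minimal sketch, assuming the toy X and y defined above:

# full-batch gradient descent: one update per call to train()
for epoch in range(100):
    current_loss = train(X, y)

# training-set accuracy after training
y_predicted = np.argmax(get_output(X), axis=1)
print('accuracy: {}'.format((y_predicted == y).mean()))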