Learning Deep Learning Frameworks
Torch/Lua
Learning Torch can be split into two tasks: learning Lua, and then understanding the Torch framework, specifically the nn package. Most people will find that learning Lua will take the majority of the time, as nn is nicely organized and easy to use.
If you are already comfortable with programming languages, then this 15-minute tutorial is good. Alternatively, this other 15-minute tutorial is terser but more comprehensive. Either will cover the basics. Beyond that, you need to understand how to work with data, which is less well covered. The simplecsv module can simplify I/O.
The data format that the optimizer needs is a table with an attached size() method. Each element of this table is itself a two-element table: an input and its corresponding output. It can thus be thought of as a row-major matrix representation of the data, one example per row. To use the provided StochasticGradient optimizer, the data must be constructed this way, as shown in ex_fun_approx.lua. It is up to you to reserve some data for testing.
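As a minimal sketch of this layout: the dataset below is an ordinary Lua table with a size() method, where each entry is an {input, output} pair. The toy target function (y = 2x), the tensor sizes, and the one-layer network are my own illustrative assumptions, not from ex_fun_approx.lua.

```lua
require 'nn'

-- Dataset: an indexable table with a size() method, where
-- dataset[i] = {input, output}, as nn.StochasticGradient expects.
local dataset = {}
function dataset:size() return 100 end
for i = 1, dataset:size() do
  local input = torch.randn(1)                -- one random scalar input
  local output = torch.Tensor{2 * input[1]}   -- toy target: y = 2x
  dataset[i] = {input, output}
end

-- A one-layer network trained with the provided optimizer.
local mlp = nn.Sequential()
mlp:add(nn.Linear(1, 1))
local trainer = nn.StochasticGradient(mlp, nn.MSECriterion())
trainer.maxIteration = 5
trainer:train(dataset)
```

Nothing about the table needs to be special beyond indexability and size(); splitting off a held-out test set is simply a matter of building a second table the same way.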
From a practical perspective, you don’t need to know much about Torch itself. It’s probably more efficient to familiarize yourself with the nn package first; I spend most of my time in this documentation. At some later point, it might be worthwhile to learn how Torch itself works, in which case their GitHub repo is flush with documentation and examples. I haven’t needed to look elsewhere.
Keras/Theano
If Theano is like Torch, then Keras is like the nn package. Unless you need to descend into the bits, it’s probably best to stay high-level. Unlike nn, there are alternatives to Keras for Theano, which I won’t cover. Like Torch, Keras comes pre-installed in the Docker image provided in the deep_learning_ex repository. The best way to get started is to read the Keras documentation, which includes a working example of a simple neural network.
As with nn, the trick is understanding the framework’s interface, particularly its expectations for the data. Keras essentially expects four arrays: training inputs, training outputs, test inputs, and test outputs. Its built-in datasets all return data organized this way — as two (input, output) pairs, one for training and one for testing.
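A minimal sketch of that organization, in plain Python: the loader below mimics the shape returned by the keras.datasets loaders, i.e. two (input, output) pairs. The function name make_dataset and the toy target y = 2x are my own illustrative assumptions.

```python
def make_dataset(n_train, n_test):
    """Return ((x_train, y_train), (x_test, y_test)),
    mirroring the shape of keras.datasets load_data() results."""
    xs = [float(i) for i in range(n_train + n_test)]
    ys = [2.0 * x for x in xs]  # toy target: y = 2x
    # One (input, output) pair for training, one for testing.
    x_train, y_train = xs[:n_train], ys[:n_train]
    x_test, y_test = xs[n_train:], ys[n_train:]
    return (x_train, y_train), (x_test, y_test)

(x_train, y_train), (x_test, y_test) = make_dataset(80, 20)
print(len(x_train), len(x_test))  # → 80 20
```

Once your own data is arranged this way, it can be passed to model.fit and model.evaluate just like the built-in datasets.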
Conclusion
Deep learning doesn’t need to be hard to learn. By following the prescribed workflow, using the provided Docker image, and streamlining your learning of deep learning frameworks to the essentials, you can get up to speed quickly.