ND4J is an Apache 2.0-licensed, open-source scientific computing library for the JVM. It is meant to be used in production environments rather than as a research tool, which means its routines are designed to run fast with minimal RAM requirements.
Please search for the latest version on search.maven.org.
Since there are many cases where ND4J is convenient on its own, let's briefly see how to use ND4J before moving on to the explanation of DL4J. If you would like to use ND4J alone, create a new Maven project and then add the following to pom.xml:
<properties>
    <nd4j.version>0.4-rc3.6</nd4j.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-jblas</artifactId>
        <version>${nd4j.version}</version>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-perf</artifactId>
        <version>${nd4j.version}</version>
    </dependency>
</dependencies>
Here, <nd4j.version> specifies the version of ND4J; please check for the latest version when you actually implement the code. Also, switching from CPU to GPU is easy while working with ND4J. If you have CUDA version 7.0 installed, then all you do is define the artifactId as follows:
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-jcublas-7.0</artifactId>
    <version>${nd4j.version}</version>
</dependency>
You can replace the version number in <artifactId> depending on your CUDA configuration.
Let's look at a simple example of the calculations that are possible with ND4J. The central type in ND4J is INDArray, an n-dimensional array type. We begin by importing the following classes:
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
Then, we define an INDArray as follows:
INDArray x = Nd4j.create(new double[]{1, 2, 3, 4, 5, 6}, new int[]{3, 2});
System.out.println(x);
Nd4j.create takes two arguments: the former defines the actual values within the INDArray, and the latter defines the shape of the vector or matrix. By running this code, you get the following result:
[[1.00,2.00]
[3.00,4.00]
[5.00,6.00]]
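To see how the flat value array maps onto the 3 x 2 shape, you can mirror the row-major ('c') ordering, ND4J's default, in plain Java. The RowMajorDemo class below is a hypothetical sketch introduced only for illustration, not part of ND4J:

```java
public class RowMajorDemo {
    // index into a flat array as if it were a (rows x cols) matrix in row-major order
    static double get(double[] values, int cols, int row, int col) {
        return values[row * cols + col];
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 5, 6};
        int rows = 3, cols = 2;
        // reproduces the 3 x 2 layout printed by ND4J above
        for (int r = 0; r < rows; r++) {
            StringBuilder sb = new StringBuilder();
            for (int c = 0; c < cols; c++) {
                sb.append(get(values, cols, r, c));
                if (c < cols - 1) sb.append(",");
            }
            System.out.println(sb);  // prints 1.0,2.0 then 3.0,4.0 then 5.0,6.0
        }
    }
}
```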
Since INDArray can output its values with System.out.println, it's easy to debug. Calculation with a scalar can also be done with ease. Note that add does not modify x itself; it returns a new INDArray holding the result. Add 1 to x and print the result as shown here:
System.out.println(x.add(1));
Then, you will get the following output:
[[2.00,3.00]
[4.00,5.00]
[6.00,7.00]]
Also, calculations between INDArrays can be done easily, as shown in the following example:
INDArray y = Nd4j.create(new double[]{6, 5, 4, 3, 2, 1}, new int[]{3, 2});
Then, the basic arithmetic operations can be represented as follows. Each method returns a new INDArray, so we print the results directly:
System.out.println(x.add(y));
System.out.println(x.sub(y));
System.out.println(x.mul(y));
System.out.println(x.div(y));
These will print the following results:
[[7.00,7.00]
[7.00,7.00]
[7.00,7.00]]
[[-5.00,-3.00]
[-1.00,1.00]
[3.00,5.00]]
[[6.00,10.00]
[12.00,12.00]
[10.00,6.00]]
[[0.17,0.40]
[0.75,1.33]
[2.50,6.00]]
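The four operations above are all elementwise, which you can mirror in plain Java to see exactly what each INDArray method computes. The ElementwiseDemo class below is a hypothetical sketch working on flat arrays, not an ND4J API:

```java
public class ElementwiseDemo {
    // plain-Java equivalents of INDArray's elementwise add/sub/mul/div
    static double[] apply(double[] a, double[] b, char op) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            switch (op) {
                case '+': out[i] = a[i] + b[i]; break;
                case '-': out[i] = a[i] - b[i]; break;
                case '*': out[i] = a[i] * b[i]; break;
                default:  out[i] = a[i] / b[i]; break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 6};
        double[] y = {6, 5, 4, 3, 2, 1};
        // matches the x.add(y) and x.mul(y) results shown above
        System.out.println(java.util.Arrays.toString(apply(x, y, '+')));  // all 7.0
        System.out.println(java.util.Arrays.toString(apply(x, y, '*')));  // 6.0, 10.0, 12.0, 12.0, 10.0, 6.0
    }
}
```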
Also, ND4J has destructive (in-place) arithmetic operators. When you write x.addi(y), x changes its own values, so System.out.println(x); will return the following output:
[[7.00,7.00]
[7.00,7.00]
[7.00,7.00]]
Likewise, subi, muli, and divi are also destructive operators. There are also many other methods that can conveniently perform calculations between vectors or matrices. For more information, you can refer to http://nd4j.org/documentation.html, http://nd4j.org/doc/ and http://nd4j.org/apidocs/.
Let's look at one more example to see how machine learning algorithms can be written with ND4J. We'll implement the simplest example, the perceptron, based on the source code written in Chapter 2, Algorithms for Machine Learning – Preparing for Deep Learning. We set the package name to DLWJ.examples.ND4J and the file (class) name to Perceptrons.java.
First, let's add these two lines to import from ND4J:
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
The model has two parameters: nIn, the number of units in the input layer, and w, the weight. The former doesn't change from the previous code; the latter, however, is now an INDArray instead of an array:
public int nIn; // dimensions of input data
public INDArray w;
You can see from the constructor that, since the weight of the perceptron is represented as a vector, the number of rows is set to the number of units in the input layer and the number of columns to 1. This definition is written here:
public Perceptrons(int nIn) {
    this.nIn = nIn;
    w = Nd4j.create(new double[nIn], new int[]{nIn, 1});
}
Then, because we define the model parameter as an INDArray, we also define the demo data (the training data and the test data) as INDArray. You can see these definitions at the beginning of the main method:
INDArray train_X = Nd4j.create(new double[train_N * nIn], new int[]{train_N, nIn}); // input data for training
INDArray train_T = Nd4j.create(new double[train_N], new int[]{train_N, 1}); // output data (label) for training
INDArray test_X = Nd4j.create(new double[test_N * nIn], new int[]{test_N, nIn}); // input data for test
INDArray test_T = Nd4j.create(new double[test_N], new int[]{test_N, 1}); // label of inputs
INDArray predicted_T = Nd4j.create(new double[test_N], new int[]{test_N, 1}); // output data predicted by the model
When we substitute a value into an INDArray, we use put. Be careful: the only values we can set with put are scalar-type values created with Nd4j.scalar:
train_X.put(i, 0, Nd4j.scalar(g1.random()));
train_X.put(i, 1, Nd4j.scalar(g2.random()));
train_T.put(i, Nd4j.scalar(1));
The flow of model building and training is the same as in the previous code:
    if (classified_ == train_N) break;  // when all the data are classified correctly

    epoch++;
    if (epoch > epochs) break;
}
Each piece of training data is passed to the train method with getRow(). First, let's see the entire content of the train method:
public int train(INDArray x, INDArray t, double learningRate) {

    int classified = 0;

    // check whether the data is classified correctly
    double c = x.mmul(w).getDouble(0) * t.getDouble(0);

    // apply the steepest descent method if the data is wrongly classified
    if (c > 0) {
        classified = 1;
    } else {
        w.addi(x.transpose().mul(t).mul(learningRate));
    }

    return classified;
}
We first focus our attention on the following code:
// check if the data is classified correctly
double c = x.mmul(w).getDouble(0) * t.getDouble(0);
This is the part that checks whether the data is classified correctly by the perceptron, as expressed in the following condition:

w^T x_n t_n > 0
You can see from the code that .mmul() is for the multiplication between vectors or matrices. We wrote this part of the calculation in Chapter 2, Algorithms for Machine Learning – Preparing for Deep Learning, as follows:
double c = 0.;

// check whether the data is classified correctly
for (int i = 0; i < nIn; i++) {
    c += w[i] * x[i] * t;
}
By comparing both codes, you can see that multiplication between vectors or matrices can be written easily with INDArray, and so you can implement the algorithm intuitively just by following the equations.
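To make the equivalence concrete, here is a minimal plain-Java sketch (no ND4J) of the quantity both versions compute: the dot product w^T x scaled by the label t. The DotCheck class and its dot helper are hypothetical, introduced only for illustration:

```java
public class DotCheck {
    // plain-Java dot product: the same quantity x.mmul(w) computes
    // for a (1 x n) row vector times an (n x 1) column vector
    static double dot(double[] w, double[] x) {
        double sum = 0.;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * x[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] w = {0.5, -0.25};
        double[] x = {2.0, 4.0};
        int t = 1;  // label, +1 or -1
        // c = (0.5 * 2.0 + (-0.25) * 4.0) * 1 = 0.0, so this sample is not yet classified correctly
        double c = dot(w, x) * t;
        System.out.println(c > 0 ? "correctly classified" : "misclassified");
    }
}
```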
The equation to update the model parameters is as follows:
w.addi(x.transpose().mul(t).mul(learningRate));
Here, again, you can implement the code just as you would write a math equation. The update rule is represented as follows:

w := w + η t_n x_n

where η is the learning rate.
The last time we implemented this part, we wrote it with a for loop:
for (int i = 0; i < nIn; i++) {
    w[i] += learningRate * x[i] * t;
}
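The loop form of the update can also be exercised on its own. The UpdateStep class below is a hypothetical plain-Java sketch of applying the rule w := w + η t x once:

```java
public class UpdateStep {
    // one steepest-descent update for a misclassified sample: w <- w + eta * t * x
    static void update(double[] w, double[] x, int t, double learningRate) {
        for (int i = 0; i < w.length; i++) {
            w[i] += learningRate * x[i] * t;
        }
    }

    public static void main(String[] args) {
        double[] w = {0.0, 0.0};
        // with learning rate 1.0 and label +1, the weight moves by exactly x
        update(w, new double[]{2.0, -1.0}, 1, 1.0);
        System.out.println(w[0] + ", " + w[1]);  // prints 2.0, -1.0
    }
}
```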
Furthermore, the prediction after the training is also the standard forward activation, shown in the following equation:

y = f(w^T x)

Here, f is the step function:

f(a) = 1 if a >= 0, and -1 otherwise
We can simply define the predict method with just a single line inside, as follows:
public int predict(INDArray x) {
    return step(x.mmul(w).getDouble(0));
}
When you run the program, you can see that its precision, accuracy, and recall are the same as those we got with the previous code.
Thus, it greatly helps to implement the algorithms in a form analogous to the mathematical equations. We only implemented the perceptron here, but please try other algorithms by yourself.
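As a recap, the whole algorithm fits in a short, self-contained plain-Java sketch. The TinyPerceptron class and its toy data are hypothetical, chosen so that the two classes are linearly separable:

```java
public class TinyPerceptron {
    double[] w;

    TinyPerceptron(int nIn) { w = new double[nIn]; }

    // step function: f(a) = 1 if a >= 0, -1 otherwise
    static int step(double a) { return a >= 0 ? 1 : -1; }

    double dot(double[] x) {
        double s = 0.;
        for (int i = 0; i < w.length; i++) s += w[i] * x[i];
        return s;
    }

    // returns 1 if x was already classified correctly, 0 otherwise
    int train(double[] x, int t, double learningRate) {
        if (dot(x) * t > 0) return 1;            // w^T x t > 0: correct
        for (int i = 0; i < w.length; i++)       // otherwise update: w <- w + eta * t * x
            w[i] += learningRate * x[i] * t;
        return 0;
    }

    int predict(double[] x) { return step(dot(x)); }

    public static void main(String[] args) {
        double[][] X = {{2.0, 2.0}, {3.0, 1.0}, {-2.0, -1.0}, {-1.0, -3.0}};
        int[] T = {1, 1, -1, -1};
        TinyPerceptron p = new TinyPerceptron(2);
        for (int epoch = 0; epoch < 100; epoch++) {
            int classified = 0;
            for (int i = 0; i < X.length; i++) classified += p.train(X[i], T[i], 1.0);
            if (classified == X.length) break;   // when all data are classified correctly
        }
        System.out.println(p.predict(new double[]{1.5, 2.5}));    // prints 1
        System.out.println(p.predict(new double[]{-2.5, -0.5}));  // prints -1
    }
}
```

The training loop mirrors the structure shown earlier: it stops as soon as every sample is classified correctly or the epoch limit is reached.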