In Aboleth we use function composition to compose machine learning models. These models are callable Python classes that, when called, return a TensorFlow computational graph (really a tf.Tensor). We can best demonstrate this with a few examples.
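Concretely, layers compose with the >> operator, and the composed network is itself callable. A minimal sketch of this idea (the layer names follow the Aboleth documentation, but treat the exact signatures as indicative, since they vary between releases):

import tensorflow as tf
import aboleth as ab

# >> chains callable layers into a new callable model.
net = (
    ab.InputLayer(name="X", n_samples=5) >>
    ab.DenseVariational(output_dim=1)
)

X_ = tf.placeholder(tf.float32, [None, 10])
f, kl = net(X=X_)  # f is a tf.Tensor; kl is the model's regulariser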
This is a simple demo that draws a random, non-linear function from a Gaussian process with a specified kernel and length scale. We then use Aboleth (in Gaussian process approximation mode) to try to learn this function given only a few noisy observations of it. This script also demonstrates how we can divide the data into mini-batches using utilities in the tf.train module, and how we can use tf.train.MonitoredTrainingSession to log the learning progress.
This demo can be used to generate figures of the learned predictive distribution (plotted with Bokeh).
#! /usr/bin/env python3
"""This demo uses Aboleth for approximate Gaussian process regression."""
import logging
import numpy as np
import bokeh.plotting as bk
import bokeh.palettes as bp
import tensorflow as tf
# from sklearn.gaussian_process.kernels import Matern as kern
from sklearn.gaussian_process.kernels import RBF as kern
import aboleth as ab
from aboleth.likelihoods import Normal
from aboleth.datasets import gp_draws
# Set up a Python logger so we can see the output of MonitoredTrainingSession
logger = logging.getLogger()
logger.setLevel(logging.INFO)
# Set up a consistent random seed in Aboleth so we get repeatable (but still
# random) results
RSEED = 666
ab.set_hyperseed(RSEED)
# Data settings
N = 1000 # Number of training points to generate
Ns = 400 # Number of testing points to generate
kernel = kern(length_scale=0.5) # Kernel to use for making a random GP draw
true_noise = 0.1 # Add noise to the GP draws, to make things a little harder
# Model settings
n_samples = 5 # Number of random samples to get from an Aboleth net
n_pred_samples = 10 # This will give n_samples by n_pred_samples predictions
n_epochs = 200 # How many times to see the data during training
batch_size = 10 # mini-batch size for stochastic gradients
config = tf.ConfigProto(device_count={'GPU': 0}) # Use the GPU? 0 = no
# Model initialisation
variance = tf.Variable(1.) # Likelihood variance initialisation; this is learned
reg = 1. # Initial weight prior variance; this is optimised later
# Random Fourier Features
# lenscale = tf.Variable(1.) # learn the length scale
# kern = ab.RBF(lenscale=ab.pos(lenscale)) # keep the length scale positive
# Variational Fourier Features -- the length-scale setting here is the "prior";
# we can choose to optimise this or not
lenscale = 1.
kern = ab.RBFVariational(lenscale=lenscale) # the VAR-FIXED kernel from
# Cutajar et al. 2017
# This is how we make the "latent function" of a Gaussian process, here
# n_features controls how many random basis functions we use in the
# approximation. The more of these, the more accurate, but more costly
# computationally. "full" indicates we want a full-covariance matrix Gaussian
# posterior of the model weights. This is optional, but it does greatly improve
# the expressiveness of the learned posterior.
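The full script continues from this point. The following is a rough sketch of the remaining steps (building the network, the ELBO loss, queue-based mini-batching with tf.train utilities, and the monitored training session); the gp_draws return values, the ab.elbo signature, and n_features=100 are assumptions that may differ from the actual demo and between Aboleth releases:

# Draw the noisy training and testing data (assumed return signature).
Xr, Yr, Xs, Ys = gp_draws(N, Ns, kern=kernel, noise=true_noise)

# Shuffled mini-batches built from the queue utilities in tf.train.
X_, Y_ = tf.train.shuffle_batch(
    tf.train.slice_input_producer([Xr.astype(np.float32),
                                   Yr.astype(np.float32)],
                                  num_epochs=n_epochs, seed=RSEED),
    batch_size=batch_size, capacity=100, min_after_dequeue=50)

# The approximate GP "latent function": random Fourier features feeding a
# variational dense layer with a full-covariance Gaussian posterior.
net = (
    ab.InputLayer(name="X", n_samples=n_samples) >>
    ab.RandomFourier(n_features=100, kernel=kern) >>
    ab.DenseVariational(output_dim=1, full=True)
)

f, kl = net(X=X_)
lkhood = Normal(variance=ab.pos(variance))  # keep the variance positive
loss = ab.elbo(f, Y_, N, kl, lkhood)        # signature varies by release

train = tf.train.AdamOptimizer().minimize(loss)
log = tf.train.LoggingTensorHook({'loss': loss}, every_n_iter=1000)

with tf.train.MonitoredTrainingSession(config=config, hooks=[log]) as sess:
    while not sess.should_stop():  # ends after n_epochs of mini-batches
        sess.run(train)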
Here we demonstrate a slightly different take on Bayesian deep learning. In his thesis and associated publications, Yarin Gal shows that regular neural networks with dropout can be viewed as a form of variational inference with particular prior and posterior distributions on the weights.
In this demo we implement this elegant idea using maximum a posteriori (MAP) weight layers and dropout layers in a classifier (see ab.layers). We leave these layers stochastic in the prediction step and draw samples from the network's predictive distribution, as we would with variational networks.
We test the classifier against a random forest classifier on the breast cancer dataset with 5-fold cross validation, and get quite good and robust performance.
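A hedged sketch of what such a classifier might look like using the MAP and dropout layers in ab.layers (LSAMPLES is defined in the script below; the layer widths, regularisers, and keep probability here are illustrative, not the demo's exact values):

# Dropout stays active at prediction time, so every forward pass is a
# draw from the approximate posterior over networks.
net = (
    ab.InputLayer(name="X", n_samples=LSAMPLES) >>
    ab.DenseMAP(output_dim=64, l2_reg=0.01, l1_reg=0.) >>
    ab.Activation(tf.nn.relu) >>
    ab.DropOut(keep_prob=0.9) >>
    ab.DenseMAP(output_dim=1, l2_reg=0.01, l1_reg=0.)
)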
#! /usr/bin/env python3
"""This script demonstrates an alternative way of making a Bayesian Neural Net.
This is based on Yarin Gal's work on interpreting dropout networks as a special
case of Bayesian neural nets, see http://mlg.eng.cam.ac.uk/yarin/blog_2248.html
"""
import tensorflow as tf
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import KFold
from sklearn.metrics import accuracy_score, log_loss
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
import aboleth as ab
FOLDS = 5
RSEED = 100
ab.set_hyperseed(RSEED)
# Optimization
NITER = 20000 # Training iterations per fold
BSIZE = 10 # mini-batch size
CONFIG = tf.ConfigProto(device_count={'GPU': 0}) # Use the GPU? 0 = no
LSAMPLES = 1 # Use only 1 dropout "sample" during learning, so training is
# more like a MAP network
PSAMPLES = 50 # This will give LSAMPLES * PSAMPLES predictions
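Prediction then keeps the dropout layers stochastic: every evaluation of the network re-samples the dropout masks, so stacking repeated runs yields LSAMPLES * PSAMPLES posterior draws to average. A hedged sketch, assuming a class-probability tensor prob of shape (LSAMPLES, n, 1), a placeholder X_, test data X_test, and an open tf.Session sess:

# Hypothetical prediction step: each sess.run re-samples the dropout
# masks, giving LSAMPLES network draws per run; PSAMPLES runs are
# stacked and both sample axes averaged out.
samples = np.stack([sess.run(prob, feed_dict={X_: X_test})
                    for _ in range(PSAMPLES)])
p_y = samples.mean(axis=(0, 1))  # predictive class probabilities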