7 Theoretical Perspectives 151
7.1 The Equivalent Kernel 151
7.1.1 Some Specific Examples of Equivalent Kernels 153
7.2 Asymptotic Analysis 155
7.2.1 Consistency 155
7.2.2 Equivalence and Orthogonality 157
7.3 Average-case Learning Curves 159
7.4 PAC-Bayesian Analysis 161
7.4.1 The PAC Framework 162
7.4.2 PAC-Bayesian Analysis 163
7.4.3 PAC-Bayesian Analysis of GP Classification 164
7.5 Comparison with Other Supervised Learning Methods 165
7.6 Appendix: Learning Curve for the Ornstein-Uhlenbeck Process 168
7.7 Exercises 169
8 Approximation Methods for Large Datasets 171
8.1 Reduced-rank Approximations of the Gram Matrix 171
8.2 Greedy Approximation 174
8.3 Approximations for GPR with Fixed Hyperparameters 175
8.3.1 Subset of Regressors 175
8.3.2 The Nyström Method 177
8.3.3 Subset of Datapoints 177
8.3.4 Projected Process Approximation 178
8.3.5 Bayesian Committee Machine 180
8.3.6 Iterative Solution of Linear Systems 181
8.3.7 Comparison of Approximate GPR Methods 182
8.4 Approximations for GPC with Fixed Hyperparameters 185
8.5 Approximating the Marginal Likelihood and its Derivatives 185
8.6 Appendix: Equivalence of SR and GPR Using the Nyström Approximate Kernel 187
8.7 Exercises 187
9 Further Issues and Conclusions 189
9.1 Multiple Outputs 190
9.2 Noise Models with Dependencies 190
9.3 Non-Gaussian Likelihoods 191
9.4 Derivative Observations 191
9.5 Prediction with Uncertain Inputs 192
9.6 Mixtures of Gaussian Processes 192
9.7 Global Optimization 193
9.8 Evaluation of Integrals 193
9.9 Student’s t Process 194
9.10 Invariances 194
9.11 Latent Variable Models 196
9.12 Conclusions and Future Directions 196
Appendix A Mathematical Background 199
A.1 Joint, Marginal and Conditional Probability 199
A.2 Gaussian Identities 200
A.3 Matrix Identities 201
A.3.1 Matrix Derivatives 202
A.3.2 Matrix Norms 202
A.4 Cholesky Decomposition 202
A.5 Entropy and Kullback-Leibler Divergence 203
A.6 Limits 204
A.7 Measure and Integration 204
A.7.1 Lp Spaces 205
A.8 Fourier Transforms 205
A.9 Convexity 206
Appendix B Gaussian Markov Processes 207
B.1 Fourier Analysis 208
B.1.1 Sampling and Periodization 209
B.2 Continuous-time Gaussian Markov Processes 211
B.2.1 Continuous-time GMPs on R 211
B.2.2 The Solution of the Corresponding SDE on the Circle 213
B.3 Discrete-time Gaussian Markov Processes 214
B.3.1 Discrete-time GMPs on Z 214
B.3.2 The Solution of the Corresponding Difference Equation on PN 215
B.4 The Relationship Between Discrete-time and Sampled Continuous-time GMPs 217
B.5 Markov Processes in Higher Dimensions 218
Appendix C Datasets and Code 221
Bibliography 223
Author Index 239
Subject Index 245