Author: Varmuza, Kurt / Filzmoser, Peter
Publisher: CRC Press
Contents:
Chapter 1 Introduction
1.1 Chemoinformatics–Chemometrics–Statistics
1.2 This Book
1.3 Historical Remarks about Chemometrics
1.4 Bibliography
1.5 Starting Examples
1.5.1 Univariate versus Bivariate Classification
1.5.2 Nitrogen Content of Cereals Computed from NIR Data
1.5.3 Elemental Composition of Archaeological Glasses
1.6 Univariate Statistics—A Reminder
1.6.1 Empirical Distributions
1.6.2 Theoretical Distributions
1.6.3 Central Value
1.6.4 Spread
1.6.5 Statistical Tests
References
Chapter 2 Multivariate Data
2.1 Definitions
2.2 Basic Preprocessing
2.2.1 Data Transformation
2.2.2 Centering and Scaling
2.2.3 Normalization
2.2.4 Transformations for Compositional Data
2.3 Covariance and Correlation
2.3.1 Overview
2.3.2 Estimating Covariance and Correlation
2.4 Distances and Similarities
2.5 Multivariate Outlier Identification
2.6 Linear Latent Variables
2.6.1 Overview
2.6.2 Projection and Mapping
2.6.3 Example
2.7 Summary
References
Chapter 3 Principal Component Analysis
3.1 Concepts
3.2 Number of PCA Components
3.3 Centering and Scaling
3.4 Outliers and Data Distribution
3.5 Robust PCA
3.6 Algorithms for PCA
3.6.1 Mathematics of PCA
3.6.2 Jacobi Rotation
3.6.3 Singular Value Decomposition
3.6.4 NIPALS
3.7 Evaluation and Diagnostics
3.7.1 Cross Validation for Determination of the Number of Principal Components
3.7.2 Explained Variance for Each Variable
3.7.3 Diagnostic Plots
3.8 Complementary Methods for Exploratory Data Analysis
3.8.1 Factor Analysis
3.8.2 Cluster Analysis and Dendrogram
3.8.3 Kohonen Mapping
3.8.4 Sammon’s Nonlinear Mapping
3.8.5 Multiway PCA
3.9 Examples
3.9.1 Tissue Samples from Human Mummies and Fatty Acid Concentrations
3.9.2 Polycyclic Aromatic Hydrocarbons in Aerosol
3.10 Summary
References
Chapter 4 Calibration
4.1 Concepts
4.2 Performance of Regression Models
4.2.1 Overview
4.2.2 Overfitting and Underfitting
4.2.3 Performance Criteria
4.2.4 Criteria for Models with Different Numbers of Variables
4.2.5 Cross Validation
4.2.6 Bootstrap
4.3 Ordinary Least-Squares Regression
4.3.1 Simple OLS
4.3.2 Multiple OLS
4.3.2.1 Confidence Intervals and Statistical Tests in OLS
4.3.2.2 Hat Matrix and Full Cross Validation in OLS
4.3.3 Multivariate OLS
4.4 Robust Regression
4.4.1 Overview
4.4.2 Regression Diagnostics
4.4.3 Practical Hints
4.5 Variable Selection
4.5.1 Overview
4.5.2 Univariate and Bivariate Selection Methods
4.5.3 Stepwise Selection Methods
4.5.4 Best-Subset Regression
4.5.5 Variable Selection Based on PCA or PLS Models
4.5.6 Genetic Algorithms
4.5.7 Cluster Analysis of Variables
4.5.8 Example
4.6 Principal Component Regression
4.6.1 Overview
4.6.2 Number of PCA Components
4.7 Partial Least-Squares Regression
4.7.1 Overview
4.7.2 Mathematical Aspects
4.7.3 Kernel Algorithm for PLS
4.7.4 NIPALS Algorithm for PLS
4.7.5 SIMPLS Algorithm for PLS
4.7.6 Other Algorithms for PLS
4.7.7 Robust PLS
4.8 Related Methods
4.8.1 Canonical Correlation Analysis
4.8.2 Ridge and Lasso Regression
4.8.3 Nonlinear Regression
4.8.3.1 Basis Expansions
4.8.3.2 Kernel Methods
4.8.3.3 Regression Trees
4.8.3.4 Artificial Neural Networks
4.9 Examples
4.9.1 GC Retention Indices of Polycyclic Aromatic Compounds
4.9.1.1 Principal Component Regression
4.9.1.2 Partial Least-Squares Regression
4.9.1.3 Robust PLS
4.9.1.4 Ridge Regression
4.9.1.5 Lasso Regression
4.9.1.6 Stepwise Regression
4.9.1.7 Summary
4.9.2 Cereal Data
4.10 Summary
References
Chapter 5 Classification
5.1 Concepts
5.2 Linear Classification Methods
5.2.1 Linear Discriminant Analysis
5.2.1.1 Bayes Discriminant Analysis
5.2.1.2 Fisher Discriminant Analysis
5.2.1.3 Example
5.2.2 Linear Regression for Discriminant Analysis
5.2.2.1 Binary Classification
5.2.2.2 Multicategory Classification with OLS
5.2.2.3 Multicategory Classification with PLS
5.2.3 Logistic Regression
5.3 Kernel and Prototype Methods
5.3.1 SIMCA
5.3.2 Gaussian Mixture Models
5.3.3 k-NN Classification
5.4 Classification Trees
5.5 Artificial Neural Networks
5.6 Support Vector Machine
5.7 Evaluation
5.7.1 Principles and Misclassification Error
5.7.2 Predictive Ability
5.7.3 Confidence in Classification Answers
5.8 Examples
5.8.1 Origin of Glass Samples
5.8.1.1 Linear Discriminant Analysis
5.8.1.2 Logistic Regression
5.8.1.3 Gaussian Mixture Models
5.8.1.4 k-NN Methods
5.8.1.5 Classification Trees
5.8.1.6 Artificial Neural Networks
5.8.1.7 Support Vector Machines
5.8.1.8 Overall Comparison
5.8.2 Recognition of Chemical Substructures from Mass Spectra
5.9 Summary
References
Chapter 6 Cluster Analysis
6.1 Concepts
6.2 Distance and Similarity Measures
6.3 Partitioning Methods
6.4 Hierarchical Clustering Methods
6.5 Fuzzy Clustering
6.6 Model-Based Clustering
6.7 Cluster Validity and Clustering Tendency Measures
6.8 Examples
6.8.1 Chemotaxonomy of Plants
6.8.2 Glass Samples
6.9 Summary
References
Chapter 7 Preprocessing
7.1 Concepts
7.2 Smoothing and Differentiation
7.3 Multiplicative Signal Correction
7.4 Mass Spectral Features
7.4.1 Logarithmic Intensity Ratios
7.4.2 Averaged Intensities of Mass Intervals
7.4.3 Intensities Normalized to Local Intensity Sum
7.4.4 Modulo-14 Summation
7.4.5 Autocorrelation
7.4.6 Spectra Type
7.4.7 Example
References
Appendix 1 Symbols and Abbreviations
Appendix 2 Matrix Algebra
A.2.1 Definitions
A.2.2 Addition and Subtraction of Matrices
A.2.3 Multiplication of Vectors
A.2.4 Multiplication of Matrices
A.2.5 Matrix Inversion
A.2.6 Eigenvectors
A.2.7 Singular Value Decomposition
References
Appendix 3 Introduction to R
A.3.1 General Information on R
A.3.2 Installing R
A.3.3 Starting R
A.3.4 Working Directory
A.3.5 Loading and Saving Data
A.3.6 Important R Functions
A.3.7 Operators and Basic Functions
Mathematical and Logical Operators, Comparison
Special Elements
Mathematical Functions
Matrix Manipulation
Statistical Functions
A.3.8 Data Types
Missing Values
A.3.9 Data Structures
A.3.10 Selection and Extraction from Data Objects
Examples for Creating Vectors
Examples for Selecting Elements from a Vector or Factor
Examples for Selecting Elements from a Matrix, Array, or Data Frame
Examples for Selecting Elements from a List
A.3.11 Generating and Saving Graphics
Functions Relevant for Graphics
Relevant Plot Parameters
Statistical Graphics
Saving Graphic Output
References