An Introduction to Statistics with Python by Thomas Haslwanter
This book is designed to give you all (or at least most of) the tools that you
will need for statistical data analysis. I attempt to provide the background you need
to understand what you are doing. I do not prove any theorems and do not apply
mathematics unless necessary. For all tests, a working Python program is provided.
In principle, you just have to define your problem, select the corresponding program,
and adapt it to your needs. This should allow you to get going quickly, even if you
have little Python experience. This is also why I have not provided the
software as one single Python package. I expect that you will have to tailor each
program to your specific setup (data format, plot labels, return values, etc.).
This book is organized into three parts:
Part I gives an introduction to Python: how to set it up, simple programs to get
started, and tips on how to avoid some common mistakes. It also shows how to read
data from different sources into Python and how to visualize statistical data.
Part II provides an introduction to statistical analysis: how to design a study,
how best to analyze data, probability distributions, and an overview of the
most important hypothesis tests. Even though modern statistics is firmly based
in statistical modeling, hypothesis tests still seem to dominate the life sciences.
For each test a Python program is provided that shows how the test can be
implemented.
Part III provides an introduction to statistical modeling and a look at advanced
statistical analysis procedures. I have also included tests on discrete data, such
as logistic regression, in this part, as they utilize “generalized linear models,”
which I regard as advanced. The book ends with a presentation of the basic ideas
of Bayesian statistics.