Data analysis is at least as much art as it is science. This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks. It is based in part on the author's blog posts, lecture materials, and tutorials such as:
- 10 things statistics taught us about big data analysis
- The Leek Group Guide to R packages
- How to share data with a statistician
The author is one of the co-developers of the Johns Hopkins Specialization in Data Science, the largest data science program in the world, which has enrolled more than 1.76 million people. The book is useful as a companion to introductory courses in data science or data analysis. It is also a useful reference for people tasked with reading and critiquing data analyses.
Table of Contents
- 1. Introduction
- 2. The data analytic question
- 3. Tidying the data
- 4. Checking the data
- 5. Exploratory analysis
- 6. Statistical modeling and inference
- 7. Prediction and machine learning
- 8. Causality
- 9. Written analyses
- 10. Creating figures
- 11. Presenting data
- 12. Reproducibility
- 13. A few matters of form
- 14. The data analysis checklist
- 15. Additional resources