I'll add an outline after all:
Spark01
1. Introduction to Data Analysis with Spark
2. Downloading Spark and Getting Started
3. Programming with RDDs
4. Working with Key/Value Pairs
5. Loading and Saving Your Data
6. Advanced Spark Programming
7. Running on a Cluster
8. Tuning and Debugging Spark
9. Spark SQL
10. Spark Streaming
11. Machine Learning with MLlib
Spark02
1. Analyzing Big Data
2. Introduction to Data Analysis with Scala and Spark
3. Recommending Music and the Audioscrobbler Data Set
4. Predicting Forest Cover with Decision Trees
5. Anomaly Detection in Network Traffic with K-means Clustering
6. Understanding Wikipedia with Latent Semantic Analysis
7. Analyzing Co-occurrence Networks with GraphX
8. Geospatial and Temporal Data Analysis on the New York City Taxicab Data
9. Financial Risk through Monte Carlo Simulation
10. Analyzing Genomics Data and the BDG Project
11. Analyzing Neuroimaging Data with PySpark and Thunder
12. Appendix: Deeper Into Spark
13. Appendix: Upcoming MLlib Pipelines API
Spark03
Chapter 1: Getting Up and Running with Spark
Chapter 2: Designing a Machine Learning System
Chapter 3: Obtaining, Processing, and Preparing Data with Spark
Chapter 4: Building a Recommendation Engine with Spark
Chapter 5: Building a Classification Model with Spark
Chapter 6: Building a Regression Model with Spark
Chapter 7: Building a Clustering Model with Spark
Chapter 8: Dimensionality Reduction with Spark
Chapter 9: Advanced Text Processing with Spark
Chapter 10: Real-time Machine Learning with Spark Streaming