Follow me for more good resources.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A really good book on Big Data.
Book Details
- Title: Data Just Right: Introduction to Large-Scale Data & Analytics
- Author: Michael Manoochehri
- Length: 256 pages
- Edition: 1
- Language: English
- Publisher: Addison-Wesley Professional
- Publication Date: 2013-12-29
- ISBN-10: 0321898656
- ISBN-13: 9780321898654
Editorial Reviews
Making Big Data Work: Real-World Use Cases and Examples, Practical Code, Detailed Solutions
Large-scale data analysis is now vitally important to virtually every business. Mobile and social technologies are generating massive datasets; distributed cloud computing offers the resources to store and analyze them; and professionals have radically new technologies at their command, including NoSQL databases. Until now, however, most books on “Big Data” have been little more than business polemics or product catalogs. Data Just Right is different: It’s a completely practical and indispensable guide for every Big Data decision-maker, implementer, and strategist.
Michael Manoochehri, a former Google engineer and data hacker, writes for professionals who need practical solutions that can be implemented with limited resources and time. Drawing on his extensive experience, he helps you focus on building applications, rather than infrastructure, because that’s where you can derive the most value.
Manoochehri shows how to address each of today’s key Big Data use cases in a cost-effective way by combining technologies in hybrid solutions. You’ll find expert approaches to managing massive datasets, visualizing data, building data pipelines and dashboards, choosing tools for statistical analysis, and more. Throughout, the author demonstrates techniques using many of today’s leading data analysis tools, including Hadoop, Hive, Shark, R, Apache Pig, Mahout, and Google BigQuery.
Coverage includes:
- Mastering the four guiding principles of Big Data success—and avoiding common pitfalls
- Emphasizing collaboration and avoiding problems with siloed data
- Hosting and sharing multi-terabyte datasets efficiently and economically
- “Building for infinity” to support rapid growth
- Developing a NoSQL Web app with Redis to collect crowd-sourced data
- Running distributed queries over massive datasets with Hadoop, Hive, and Shark
- Building a data dashboard with Google BigQuery
- Exploring large datasets with advanced visualization
- Implementing efficient pipelines for transforming immense amounts of data
- Automating complex processing with Apache Pig and the Cascading Java library
- Applying machine learning to classify, recommend, and predict incoming information
- Using R to perform statistical analysis on massive datasets
- Building highly efficient analytics workflows with Python and Pandas
- Establishing sensible purchasing strategies: when to build, buy, or outsource
- Previewing emerging trends and convergences in scalable data technologies and the evolving role of the Data Scientist
Table of Contents
- Part I: Directives in the Big Data Era
- Chapter 1. Four Rules for Data Success
- Part II: Collecting and Sharing a Lot of Data
- Chapter 2. Hosting and Sharing Terabytes of Raw Data
- Chapter 3. Building a NoSQL-Based Web App to Collect Crowd-Sourced Data
- Chapter 4. Strategies for Dealing with Data Silos
- Part III: Asking Questions about Your Data
- Chapter 5. Using Hadoop, Hive, and Shark to Ask Questions about Large Datasets
- Chapter 6. Building a Data Dashboard with Google BigQuery
- Chapter 7. Visualization Strategies for Exploring Large Datasets
- Part IV: Building Data Pipelines
- Chapter 8. Putting It Together: MapReduce Data Pipelines
- Chapter 9. Building Data Transformation Workflows with Pig and Cascading
- Part V: Machine Learning for Large Datasets
- Chapter 10. Building a Data Classification System with Mahout
- Part VI: Statistical Analysis for Massive Datasets
- Chapter 11. Using R with Large Datasets
- Chapter 12. Building Analytics Workflows Using Python and Pandas
- Part VII: Looking Ahead
- Chapter 13. When to Build, When to Buy, When to Outsource
- Chapter 14. The Future: Trends in Data Technology
Hidden content of this post
- Data Just Right Introduction to Large-Scale Data Analytics.pdf