楼主: Lisrelchen
1182 1

Optimizing Hadoop for MapReduce [推广有奖]

  • 0关注
  • 62粉丝

VIP

已卖:4194份资源

院士

67%

还不是VIP/贵宾

-

TA的文库  其他...

Bayesian NewOccidental

Spatial Data Analysis

东西方数据挖掘

威望
0
论坛币
50288 个
通用积分
83.6906
学术水平
253 点
热心指数
300 点
信用等级
208 点
经验
41518 点
帖子
3256
精华
14
在线时间
766 小时
注册时间
2006-5-4
最后登录
2022-11-6

楼主
Lisrelchen 发表于 2015-4-11 10:40:59 |AI写论文

+2 论坛币
k人 参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群
赵安豆老师微信:zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

感谢您参与论坛问题回答

经管之家送您两个论坛币!

+2 论坛币

Optimizing Hadoop for MapReduce


Khaled Tannir


February 2014


This book is the perfect introduction to sophisticated concepts in MapReduce and will ensure you have the knowledge to optimize job performance. This is not an academic treatise; it’s an example-driven tutorial for the real world.

What You Will Learn
  • Learn about the factors that affect MapReduce performance
  • Utilize the Hadoop MapReduce performance counters to identify resource bottlenecks
  • Size your Hadoop cluster’s nodes
  • Set the number of mappers and reducers correctly
  • Optimize mapper and reducer task throughput and code size using compression and Combiners
  • Understand the various tuning properties and best practices to optimize clusters


In Detail

MapReduce is the distribution system that the Hadoop MapReduce engine uses to distribute work around a cluster by working parallel on smaller data sets. It is useful in a wide range of applications, including distributed pattern-based searching, distributed sorting, web link-graph reversal, term-vector per host, web access log stats, inverted index construction, document clustering, machine learning, and statistical machine translation.

This book introduces you to advanced MapReduce concepts and teaches you everything from identifying the factors that affect MapReduce job performance to tuning the MapReduce configuration. Based on real-world experience, this book will help you to fully utilize your cluster’s node resources to run MapReduce jobs optimally.

This book details the Hadoop MapReduce job performance optimization process. Through a number of clear and practical steps, it will help you to fully utilize your cluster’s node resources.

Starting with how MapReduce works and the factors that affect MapReduce performance, you will be given an overview of Hadoop metrics and several performance monitoring tools. Further on, you will explore performance counters that help you identify resource bottlenecks, check cluster health, and size your Hadoop cluster. You will also learn about optimizing map and reduce tasks by using Combiners and compression.

The book ends with best practices and recommendations on how to use your Hadoop cluster optimally.

本帖隐藏的内容

Optimizing Hadoop for MapReduce_2014.2.pdf (1.8 MB, 需要: 20 个论坛币)






二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

关键词:Optimizing MapReduce reduce Hadoop Optim knowledge concepts identify perfect factors

本帖被以下文库推荐

沙发
igs816(未真实交易用户) 在职认证  发表于 2015-4-11 16:44:51
Hadoop 数据源是关键

您需要登录后才可以回帖 登录 | 我要注册

本版微信群
加好友,备注jltj
拉您入交流群
GMT+8, 2026-1-24 12:19