
[Hadoop] Hadoop Blueprints


cwh2008 posted on 2017-10-28 12:30:54
Preface
This book covers the application of Hadoop and its ecosystem of tools to solve business
problems. Hadoop has fast emerged as the leading big data platform and finds applications
in many industries where massive datasets or big data have to be stored and analyzed.
Hadoop lowers the cost of investment in storage. It supports the generation of new
business insights that were not possible earlier because of the massive data volumes and
computing capacity required to process such information. This book covers several
business cases and builds a solution to each. Each solution covered in this book
has been built using Hadoop, HDFS, and a set of tools from the Hadoop ecosystem.

What this book covers
Chapter 1, Hadoop and Big Data, goes over how Hadoop has played a pivotal role in
making several Internet businesses successful with big data from its beginnings in the
previous decade. This chapter covers a brief history and the story of the evolution of
Hadoop. It covers the Hadoop architecture and the MapReduce data processing
framework. It introduces basic Hadoop programming in Java and provides a detailed
overview of the business cases covered in the following chapters of this book. This
chapter builds the foundation for understanding the rest of the book.
Chapter 2, A 360-Degree View of the Customer, covers building a 360-degree view of
the customer. A good 360-degree view requires the integration of data from various
sources. The data sources are database management systems storing master data and
transactional data. Other data sources might include data captured from social media
feeds. In this chapter, we will be integrating data from CRM systems, web logs, and
Twitter feeds to build the 360-degree view and present it using a simple web interface. We
will learn about Apache Sqoop and Apache Hive in the process of building our solution.
Chapter 3, Building a Fraud Detection System, covers the building of a real-time fraud
detection system. This system predicts whether a financial transaction could be fraudulent
by applying a clustering algorithm to a stream of transactions. We will learn about the
architecture of the system and the coding steps involved in building the system. We will
learn about Apache Spark in the process of building our solution.
Chapter 4, Marketing Campaign Planning, shows how to build a system that can improve
the effectiveness of marketing campaigns. This system is a batch analytics system that
uses historical campaign-response data to predict who is going to respond to a marketing
folder. We will see how we can build a predictive model and use it to predict who is going
to respond to which folder in our marketing campaign. We will learn about BigML in the
process of building our solution.
Chapter 5, Churn Detection, explains how to use Hadoop to predict which customers are
likely to move over to another company. We will cover the business case of a mobile
telecom provider that would like to detect the customers who are likely to churn. These
customers are given special incentives so that they stay with the same provider. We
will apply Bayes’ Theorem to calculate the likelihood of churn. The model for churn
detection will be built using Hadoop. We will learn about writing MapReduce programs in
Java in the process of building our solution.
Chapter 6, Analyze Sensor Data Using Hadoop, is about how to build a system to analyze
sensor data. Nowadays, sensors are considered an important source of big data. We will
learn how Hadoop and big-data technologies can be helpful in the Internet of Things (IoT)
domain. IoT is a network of connected devices that generate data through sensors. We will
build a system to monitor the quality of the environment, such as humidity and
temperature, in a factory. We will introduce Apache Kafka, Grafana, and OpenTSDB tools
in the process of building the solution.
Chapter 7, Building a Data Lake, takes you through building a data lake using Hadoop and
several other tools to import data into the data lake and provide secure access to the data. Data
lakes are a popular business case for Hadoop. In a data lake, we store data from multiple
sources to build a single source of data for the enterprise and build a security layer around
it. We will learn about Apache Ranger, Apache Flume, and Apache Zeppelin in the
process of building our solution.
Chapter 8, Future Directions, covers four separate topics that are relevant to
Hadoop-based projects. These topics are building a Hadoop solutions team, Hadoop on the cloud,
NoSQL databases, and in-memory databases. This chapter does not include any coding
examples, unlike the other chapters. These four topics are covered in essay
form so that you can explore them further.
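
As a rough illustration of the Bayes' Theorem step mentioned for Chapter 5, the sketch below computes the probability that a customer churns given one observed signal. It is not taken from the book: the function name `churn_posterior`, the signal (a recent complaint call), and all the probabilities are made-up assumptions, and it uses plain Python rather than the Java MapReduce code the chapter describes.

```python
def churn_posterior(p_churn, p_signal_given_churn, p_signal_given_stay):
    """Return P(churn | signal) using Bayes' Theorem.

    All three inputs are hypothetical probabilities chosen for illustration.
    """
    for p in (p_churn, p_signal_given_churn, p_signal_given_stay):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Total probability of seeing the signal (law of total probability).
    p_signal = (p_signal_given_churn * p_churn
                + p_signal_given_stay * (1.0 - p_churn))
    if p_signal == 0.0:
        raise ValueError("signal has zero probability under the given inputs")
    # Bayes' Theorem: P(churn | signal) = P(signal | churn) * P(churn) / P(signal)
    return p_signal_given_churn * p_churn / p_signal


if __name__ == "__main__":
    # Assumed numbers: 10% of customers churn; 60% of churners and
    # 20% of non-churners made a complaint call.
    print(round(churn_posterior(0.10, 0.60, 0.20), 2))
```

With these assumed inputs, a customer who made a complaint call has a churn probability of 0.25, up from the 0.10 base rate. In a Hadoop setting the conditional probabilities would presumably be estimated by counting records across the dataset, which is the kind of aggregation MapReduce is suited to.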


Attachment: Hadoop Blueprints.pdf (14.3 MB)


军旗飞扬 posted on 2017-10-28 21:55:07
Thanks for sharing


bearfighting posted on 2017-10-28 23:14:16
Thank you very much


franky_sas posted on 2017-11-5 12:58:06
