Original poster: 飞天玄舞6

[Exclusive release] Real-time Analytics with Storm and Cassandra

Real-time Analytics with Storm and Cassandra
Preface
Chapter 1: Let's Understand Storm
Distributed computing problems
Real-time business solution for credit or debit card fraud detection
Aircraft Communications Addressing and Reporting system
Healthcare
Other applications
Solutions for complex distributed use cases
The Hadoop solution
A custom solution
Licensed proprietary solutions
Other real-time processing tools
A high-level view of various components of Storm
Delving into the internals of Storm
Quiz time
Summary
Chapter 2: Getting Started with Your First Topology
Prerequisites for setting up Storm
Components of a Storm topology
Spouts
Bolts
Streams
Tuples – the data model in Storm
Executing a sample Storm topology – local mode
WordCount topology from the Storm-starter project
Executing the topology in the distributed mode
Set up Zookeeper (V 3.3.5) for Storm
Setting up Storm in the distributed mode
Launching Storm daemons
Executing the topology from Command Prompt
Tweaking the WordCount topology to customize it
Quiz time
Summary
Chapter 3: Understanding Storm Internals by Examples
Customizing Storm spouts
Creating FileSpout
Tweaking WordCount topology to use FileSpout
The SocketSpout class
Anchoring and acking
The unreliable topology
Stream groupings
Local or shuffle grouping
Fields grouping
All grouping
Global grouping
Custom grouping
Direct grouping
Quiz time
Summary
Chapter 4: Storm in a Clustered Mode
The Storm cluster setup
Zookeeper configurations
Cleaning up Zookeeper
Storm configurations
Storm logging configurations
The Storm UI
Section 1
Section 2
Section 3
Section 4
The visualization section
Storm monitoring tools
Quiz time
Summary
Chapter 5: Storm High Availability and Failover
An overview of RabbitMQ
Installing the RabbitMQ cluster
Prerequisites for the setup of RabbitMQ
Setting up a RabbitMQ server
Testing the RabbitMQ server
Creating a RabbitMQ cluster
Enabling the RabbitMQ UI
Creating mirror queues for high availability
Integrating Storm with RabbitMQ
Creating a RabbitMQ feeder component
Wiring the topology for the AMQP spout
Building high availability of components
High availability of the Storm cluster
Guaranteed processing of the Storm cluster
The Storm isolation scheduler
Quiz time
Summary
Chapter 6: Adding NoSQL Persistence to Storm
The advantages of Cassandra
Columnar database fundamentals
Types of column families
Types of columns
Setting up the Cassandra cluster
Installing Cassandra
Multiple data centers
Prerequisites for setting up multiple data centers
Installing Cassandra data centers
Introduction to CQLSH
Introduction to CLI
Using different client APIs to access Cassandra
Storm topology wired to the Cassandra store
The best practices for Storm/Cassandra applications
Quiz time
Summary
Chapter 7: Cassandra Partitioning, High Availability, and Consistency
Consistent hashing
One or more node goes down
One or more node comes back up
Replication in Cassandra and strategies
Cassandra consistency
Write consistency
Read consistency
Consistency maintenance features
Quiz time
Summary
Chapter 8: Cassandra Management and Maintenance
Cassandra – gossip protocol
Bootstrapping
Failure scenario handling – detection and recovery
Cassandra cluster scaling – adding a new node
Cassandra cluster – replacing a dead node
The replication factor
The nodetool commands
Cassandra fault tolerance
Cassandra monitoring systems
JMX monitoring
Datastax OpsCenter
Quiz time
Summary
Chapter 9: Storm Management and Maintenance
Scaling the Storm cluster – adding new supervisor nodes
Scaling the Storm cluster and rebalancing the topology
Rebalancing using the GUI
Rebalancing using the CLI
Setting up workers and parallelism to enhance processing
Scenario 1
Scenario 2
Scenario 3
Storm troubleshooting
The Storm UI
Storm logs
Quiz time
Summary
Chapter 10: Advance Concepts in Storm
Building a Trident topology
Understanding the Trident API
Local partition manipulation operation
Functions
Filters
partitionAggregate
Operations related to stream repartitioning
Data aggregations over the streams
Grouping over a field in a stream
Merge and join
Examples and illustrations
Quiz time
Summary
Chapter 11: Distributed Cache and CEP with Storm
The need for distributed caching in Storm
Introduction to memcached
Setting up memcache
Building a topology with a cache
Introduction to the complex event processing engine
Esper
Getting started with Esper
Integrating Esper with Storm
Quiz time
Summary
Appendix: Quiz Answers
Index


Keywords: Real-time, Analytics, Cassandra, Storm, computing, business, problems, solution, complex

Real-time Analytics with Storm and Cassandra.pdf

11.66 MB

Cost: 5 forum coins


strive for the best, prepare for the worst.