Original poster: tyu1999

[Data Mining Books] Advanced Analytics with Spark

Posted by tyu1999 (original poster) on 2019-8-8 17:23:51

Download link: https://u20150046.ctfile.com/fs/20150046-392010184

Size: 5.21M

Format: PDF

Patterns for Learning from Data at Scale

In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example.
You’ll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques—classification, collaborative filtering, and anomaly detection among others—to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you’ll find these patterns useful for working on your own data applications.
Patterns include:
  • Recommending music and the Audioscrobbler data set
  • Predicting forest cover with decision trees
  • Anomaly detection in network traffic with K-means clustering
  • Understanding Wikipedia with Latent Semantic Analysis
  • Analyzing co-occurrence networks with GraphX
  • Geospatial and temporal data analysis on the New York City Taxi Trips data
  • Estimating financial risk through Monte Carlo simulation
  • Analyzing genomics data and the BDG project
  • Analyzing neuroimaging data with PySpark and Thunder
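
This e-book is shared mainly for preview reading. If you like it, please support the official edition.

Reading changes lives and lights the way. Let us encourage one another.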

Keywords: Advanced Analytics with Spark
