Original poster: neuroexplorer

[Exclusive Release] Spark for Python Developers

Spark for Python Developers.pdf (4.4 MB, requires 5 forum coins)



About This Book
  • Set up real-time streaming and batch data-intensive infrastructure using Spark and Python
  • Deliver insightful visualizations in a web app using Spark (PySpark)
  • Inject live data using Spark Streaming with real-time events


Who This Book Is For

This book is for data scientists and software developers with a focus on Python who want to work with the Spark engine, and it will also benefit enterprise architects. All you need is a good background in Python and an inclination to work with Spark.







Table of Contents

1: SETTING UP A SPARK VIRTUAL ENVIRONMENT

2: BUILDING BATCH AND STREAMING APPS WITH SPARK

3: JUGGLING DATA WITH SPARK

4: LEARNING FROM DATA USING SPARK

5: STREAMING LIVE DATA WITH SPARK

6: VISUALIZING INSIGHTS AND TRENDS




What You Will Learn
  • Create a Python development environment powered by Spark (PySpark), Blaze, and Bokeh
  • Build a real-time, data-intensive trend tracker app
  • Visualize the trends and insights gained from data using Bokeh
  • Generate insights from data using machine learning through Spark MLlib
  • Juggle data using Blaze
  • Create training datasets and train the machine learning models
  • Test the machine learning models on test datasets
  • Deploy the machine learning algorithms and models and scale them for real-time events
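
The training and test dataset items in the list above can be illustrated with a small sketch. This is plain standard-library Python, not the book's code and not Spark: `train_test_split` and the toy `labeled` rows are hypothetical names chosen for illustration. PySpark offers its own `randomSplit` on RDDs and DataFrames for the same purpose; the version below only shows the idea of a reproducible, non-overlapping split.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and return (train, test) lists.

    Hypothetical helper for illustration; not part of Spark or the book.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                      # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)      # seeded, so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy labeled rows: (feature, label). Purely made-up data.
labeled = [(i, i % 2) for i in range(10)]
train, test = train_test_split(labeled, test_fraction=0.2, seed=7)
```

Fixing the seed keeps the split stable between runs, so a model evaluated on `test` is always scored against rows it did not see in `train`.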


In Detail

Looking for a cluster computing system that provides high-level APIs? Apache Spark is your answer: an open source, fast, general-purpose cluster computing system. Spark's multi-stage in-memory primitives can deliver performance up to 100 times faster than Hadoop MapReduce for certain applications, which also makes it well suited to machine learning algorithms.

Are you a Python developer inclined to work with the Spark engine? If so, this book will be your companion as you create a data-intensive app using Spark as a processing engine, Python visualization libraries, and web frameworks such as Flask.

To begin with, you will learn the most effective way to install the Python development environment powered by Spark, Blaze, and Bokeh. You will then find out how to connect with data stores such as MySQL, MongoDB, Cassandra, and Hadoop.

You'll expand your skills throughout, becoming familiar with the various data sources (GitHub, Twitter, Meetup, and blogs), their data structures, and solutions to effectively tackle complexities. You'll explore datasets using IPython Notebook and discover how to optimize the data models and pipeline. Finally, you'll learn how to create training datasets and train the machine learning models.

By the end of the book, you will have created a real-time and insightful trend tracker data-intensive app with Spark.
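
To give a feel for what a trend tracker does, here is a minimal sketch in plain standard-library Python. It is not taken from the book and does not use Spark Streaming; the `TrendTracker` class, its methods, and the sample events are all hypothetical. It assumes events arrive with non-decreasing timestamps and simply counts terms seen within a sliding time window.

```python
from collections import Counter, deque


class TrendTracker:
    """Count terms seen within the last `window_seconds` (illustrative only)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()      # (timestamp, term), oldest first
        self.counts = Counter()    # term -> occurrences inside the window

    def add(self, timestamp, term):
        """Record one event, then drop everything that fell out of the window."""
        self.events.append((timestamp, term))
        self.counts[term] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Assumes timestamps never decrease, so the oldest event is at the left.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_term = self.events.popleft()
            self.counts[old_term] -= 1
            if self.counts[old_term] == 0:
                del self.counts[old_term]

    def top(self, n):
        """Return the n most frequent terms currently inside the window."""
        return self.counts.most_common(n)


# Made-up events: (seconds since start, term)
tracker = TrendTracker(window_seconds=60)
for ts, term in [(0, "spark"), (10, "python"), (30, "spark"), (70, "spark")]:
    tracker.add(ts, term)

trending = tracker.top(1)
```

In a real streaming engine the windowing and distribution across machines are handled by the framework; the sketch only shows the bookkeeping behind "what is trending right now".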



Authors

Amit Nandi

Amit Nandi studied physics at the Free University of Brussels in Belgium, where he did his research on computer-generated holograms. Computer-generated holograms are the key components of an optical computer, which is powered by photons running at the speed of light. He then worked with the university's Cray supercomputer, sending batch jobs of programs written in Fortran. This gave him a taste for computing, which kept growing. He has worked extensively on large business reengineering initiatives, using SAP as the main enabler. For the last 15 years he has focused on start-ups in the data space, pioneering new areas of the information technology landscape. He is currently focusing on large-scale data-intensive applications as an enterprise architect, data engineer, and software developer. He understands and speaks seven human languages. Although Python is his computer language of choice, he aims to be able to write fluently in seven computer languages too.









thanks for sharing


Excellent Book: Spark for Python Developers


sacromento (verified student), posted 2017-6-9 04:27:02
Thanks for sharing!


xingyuchen (verified student), posted 2018-2-12 00:43:10
Thanks for sharing! I've always wanted to learn PySpark!
