Original poster: isyegatech

[Other] [Complete Book] Pandas for Everyone: Python Data Analysis


The Hands-On, Example-Rich Introduction to Pandas Data Analysis in Python


Today, analysts must manage data characterized by extraordinary variety, velocity, and volume. Using the open source Pandas library, you can use Python to rapidly automate and perform virtually any data analysis task, no matter how large or complex. Pandas can help you ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets.


Pandas for Everyone brings together practical knowledge and insight for solving real problems with Pandas, even if you’re new to Python data analysis. Daniel Y. Chen introduces key concepts through simple but practical examples, incrementally building on them to solve more difficult, real-world problems.


Chen gives you a jumpstart on using Pandas with a realistic dataset and covers combining datasets, handling missing data, and structuring datasets for easier analysis and visualization. He demonstrates powerful data cleaning techniques, from basic string manipulation to applying functions simultaneously across dataframes.
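As a rough illustration of the three tasks named above (combining datasets, handling missing data, and restructuring for analysis), here is a minimal sketch. The tables, column names, and the choice of mean imputation are all hypothetical and are not taken from the book.

```python
import pandas as pd

# Hypothetical toy tables, invented for illustration only
people = pd.DataFrame({"id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
scores = pd.DataFrame({"id": [1, 2], "math": [90.0, None], "art": [85.0, 70.0]})

# Combine: a left join keeps every person, even those without scores
merged = people.merge(scores, on="id", how="left")

# Missing data: count the gaps, then fill each column with its own mean
# (mean imputation is just one possible strategy, assumed here)
n_missing = int(merged[["math", "art"]].isna().sum().sum())
filled = merged.fillna({"math": merged["math"].mean(), "art": merged["art"].mean()})

# Restructure: melt the wide score columns into one row per (person, subject)
tidy = filled.melt(id_vars=["id", "name"], var_name="subject", value_name="score")
```

The long, one-observation-per-row layout in `tidy` is usually the easier shape to feed into plotting and modeling functions.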


Once your data is ready, Chen guides you through fitting models for prediction, clustering, inference, and exploration. He provides tips on performance and scalability, and introduces you to the wider Python data analysis ecosystem.

  • Work with DataFrames and Series, and import or export data
  • Create plots with matplotlib, seaborn, and pandas
  • Combine datasets and handle missing data
  • Reshape, tidy, and clean datasets so they’re easier to work with
  • Convert data types and manipulate text strings
  • Apply functions to scale data manipulations
  • Aggregate, transform, and filter large datasets with groupby
  • Leverage Pandas’ advanced date and time capabilities
  • Fit linear models using statsmodels and scikit-learn libraries
  • Use generalized linear modeling to fit models with different response variables
  • Compare multiple models to select the “best”
  • Regularize to overcome overfitting and improve performance
  • Use clustering in unsupervised machine learning
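To make the groupby item in the list above more concrete, the sketch below shows how aggregate, transform, and filter differ in what they return. The data and column names are hypothetical and do not come from the book.

```python
import pandas as pd

# Hypothetical data, invented for illustration only
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c"],
    "value": [1.0, 3.0, 10.0, 20.0, 5.0],
})
grouped = df.groupby("group")["value"]

# Aggregate: one summary value per group
means = grouped.mean()

# Transform: result has the same length as the input;
# here each value is centered on its own group's mean
centered = df["value"] - grouped.transform("mean")

# Filter: keep only the rows of groups with at least two observations
big_groups = df.groupby("group").filter(lambda g: len(g) >= 2)
```

Aggregation shrinks the data to one row per group, transformation preserves the original shape, and filtering drops whole groups.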

Editorial Reviews

About the Author

Daniel Chen is a graduate student in the interdisciplinary PhD program in Genetics, Bioinformatics & Computational Biology (GBCB) at Virginia Tech. He is involved with Software Carpentry as an instructor and lesson maintainer. He completed his master’s degree in public health, in epidemiology, at Columbia University Mailman School of Public Health. He currently works at the Social and Decision Analytics Laboratory under the Biocomplexity Institute of Virginia Tech, where he works with data to inform policy decision-making. He is the author of Pandas for Everyone and Pandas Data Analysis with Python Fundamentals LiveLessons.


pandas-for-everyone.epub (36.01 MB, requires 65535 forum coins)






2
军旗飞扬 posted on 2018-1-26 10:38:28

3
jjxm20060807 posted on 2018-1-26 20:30:15
Thanks for sharing.

4
phipe posted on 2018-1-27 08:41:47
Thanks for sharing.

5
hifinecon posted on 2018-7-15 11:06:54
very nice book for R

6
e0g411k014z (verified student) posted on 2018-7-21 22:08:20
Thanks, OP.

7
jydcb003 (verified student) posted on 2020-1-27 14:27:25
A very good book, thank you.

8
saiwaipiaoling posted on 2020-11-18 15:17:08
A very good book, thank you.

9
qgjtso111 posted on 2020-12-25 21:20:34
Thanks for sharing.

10
bzm100 posted on 2021-1-15 15:02:25
