Original poster: oliyiyi

Python vs R – Who Is Really Ahead in Data Science, Machine Learning?

By Gregory Piatetsky, KDnuggets.

My recent analysis of KDnuggets Poll results (Python overtakes R, becomes the leader in Data Science, Machine Learning platforms) has attracted a lot of attention and generated a large number of comments, much discussion, and the inevitable critique from proponents of both languages.

Some have complained that the poll is not scientific and that voters represent a self-selected sample. That is obviously true. But KDnuggets has conducted polls since 2001 and reaches a large audience of several hundred thousand visitors each month. In our experience, KDnuggets polls have been a good indicator of trends and developments in Data Mining and Data Science. We have tracked the R vs Python debate for several years, so, unlike other sites, we can compare the latest poll results with those of several previous years.

Let's examine other measures of Python vs R popularity among Data Scientists.

First, we analyze Google Trends (this was also done by DSC after the publication of our poll results).

Python is a much more popular language overall, and it is IEEE Spectrum's No. 1 language of 2017 (thanks to Martin Skarzynski @marskar for the link), so it would be unfair to compare Python and R searches directly. We can, however, compare Google Trends for the search terms "Python data science" and "R data science".

Here is the chart starting from Jan 1, 2012. Note that if you select a range that includes full months and starts in 2012, you get smoothed monthly trends rather than the more chaotic weekly trends.

Fig. 1: Google Trends, Jan 2012 - Aug 2017, "Python data science" vs "R data science".

We note that R was slightly ahead in 2014 and 2015, as Data Science was gaining popularity, but "Python data science" searches moved ahead of "R data science" in late 2016 and have been clearly ahead since January 2017.

Note: the statistics are the same regardless of how Data Science is capitalized: "Data Science" or "data science", but Google autocomplete suggests "data science" for both Python and R.

However, Machine Learning has recently become very popular (see my post "Machine Learning overtaking Big Data?", May 2017), so let's examine Python vs R for "Machine Learning" in Google Trends.


Fig. 2: Google Trends, Jan 2012 - Aug 2017, "Python Machine Learning", "R Machine Learning", "Python data science", and "R data science".

We see that "Python Machine Learning" is way ahead of "Python data science", and both are significantly ahead of "R data science" and "R Machine Learning".

Relative search volume for Aug 2017 is:
  • Python Machine Learning: 100
  • Python data science: 49
  • R data science: 33
  • R Machine Learning: 32
(Note: while Google autocomplete suggests search term "Python data science", with lower-case "data science", it suggests Capitalized search term "Python Machine Learning". There is probably some deep meaning here ... )
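
To make the gaps between these figures easier to read, here is a minimal Python sketch that turns the four index values quoted above into ratios and per-language totals. The values are taken directly from the list; the helper names `ratio` and `language_total` are my own, purely illustrative. Keep in mind that Google Trends reports a relative index scaled so that the peak is 100, not absolute search counts, so the sums are only a rough comparison within this one chart.

```python
# Relative search volumes for Aug 2017, copied from the list above.
# These are Google Trends index values (peak = 100), not absolute counts.
volumes = {
    "Python Machine Learning": 100,
    "Python data science": 49,
    "R data science": 33,
    "R Machine Learning": 32,
}


def ratio(a: str, b: str) -> float:
    """Return how many times larger the index for term a is than for term b."""
    return volumes[a] / volumes[b]


def language_total(prefix: str) -> int:
    """Sum the index values of all terms that start with the given language name.

    Hypothetical helper: adding index values is only a rough comparison,
    valid because all four terms share the same scale in this one chart.
    """
    return sum(v for term, v in volumes.items() if term.startswith(prefix + " "))


python_total = language_total("Python")
r_total = language_total("R")

print(f"ML searches, Python vs R: {ratio('Python Machine Learning', 'R Machine Learning'):.2f}x")
print(f"Data science searches, Python vs R: {ratio('Python data science', 'R data science'):.2f}x")
print(f"Combined index: Python {python_total}, R {r_total}")
```

On these numbers, the gap is much wider for the "Machine Learning" terms than for the "data science" terms.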


Fig. 3: Snapshot of indeed.com Data Scientist job ads in USA that also include Python and/or R, Sep 2017
Next, let's look at job ads on indeed.com. All numbers are for jobs in the USA as of Sep 11, 2017. We represent this relationship in the Venn diagram in Fig. 3.

Indeed job trends below also show that demand for Data Scientists who know Python and for those who know R was very close until very recently, and that together these represent a significant portion of all Data Scientist jobs.

Fig. 4: Indeed "Data Scientist", "Data Scientist" Python, and "Data Scientist" R Job Trends, 2014-2017

These job ad counts suggest that current employers see most Data Scientists as able to use both Python and R as needed, but Python has a small advantage at the moment.

Google Trends results suggest that Python's advantage will grow and that Python-related Data Science and Machine Learning jobs will grow faster than those related to R.

Note: with indeed.com you need to specify the search string carefully. A search for [Data Scientist Python] will include many jobs that mention either Data or Scientist but not necessarily both.
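
The usual way around this is to put the job title in double quotes so it is matched as a phrase, as in the search terms shown in the Fig. 4 caption. Below is a minimal Python sketch of building such a query string. The base URL and the `q` parameter name are assumptions about how the site's search page is addressed, and `build_query` and `search_url` are hypothetical helpers, not part of any Indeed API.

```python
from urllib.parse import urlencode

# Assumption: the public search page accepts the query in a "q" parameter.
BASE_URL = "https://www.indeed.com/jobs"


def build_query(phrase: str, *keywords: str) -> str:
    """Quote the job title so it is matched as a phrase, then append keywords."""
    quoted = f'"{phrase}"'
    return " ".join([quoted, *keywords])


def search_url(phrase: str, *keywords: str) -> str:
    """Return a URL-encoded search link for the quoted phrase plus keywords."""
    return f"{BASE_URL}?{urlencode({'q': build_query(phrase, *keywords)})}"


print(build_query("Data Scientist", "Python"))
print(search_url("Data Scientist", "R"))
```

The only point of the sketch is the quoting: `"Data Scientist" Python` asks for the exact phrase plus the keyword, while the unquoted form can match ads containing the individual words.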

Finally, among the many comments on my original post, Python overtakes R in Data Science, I want to highlight two observations:
  • Stanislav Seltser notes that among the top 15 languages on GitHub (https://octoverse.github.com), Python is no. 3, while R is not on the list.
  • Stanislav also pointed to the Kaggle 2016 Year Summary, which says:
    In past years, R was the language of choice on Kaggle, but 2016 has seen Python emerge as a clear winner when it came to the number of kernels written.




2
hjtoh, posted 2017-9-20 08:14:32 (from mobile)
Both are very powerful.

3
felixzhao123, posted 2017-9-20 08:16:21
Let's see what the R supporters think.

4
西门高, posted 2017-9-20 08:16:39
Thanks for sharing.

5
kavakava, posted 2017-9-20 08:22:36
Thanks

6
mxnmxx, posted 2017-9-20 08:23:56
Taking a look, thanks!

7
ekscheng, posted 2017-9-20 08:26:04

8
钱学森64, posted 2017-9-20 08:26:06
Thanks for sharing.

9
rearey, posted 2017-9-20 08:27:17
Thanks very much.

Personally, I prefer R.