Original poster: oliyiyi

The Most Popular Language For Machine Learning and Data Science Is …


oliyiyi posted on 2017-2-4 14:34:47


By Jean-Francois Puget, IBM.

What programming language should one learn to get a machine learning or data science job?  That’s the silver bullet question.  It is debated in many forums.  I could provide my own answer here and explain why, but I’d rather look at some data first.  After all, this is what machine learners and data scientists should do: look at data, not opinions.

So, let’s look at some data.  I will use the trend search available on indeed.com.  It looks for occurrences over time of selected terms in job offers.  It gives an indication of what skills employers are seeking.  Note, however, that it is not a poll on which skills are actually in use.  It is rather a leading indicator of how skill popularity evolves (more formally, it is probably close to the first-order derivative of popularity, since the change in popularity is the skills hired plus the skills gained through retraining, minus the skills lost to retirement and departures).
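The derivative remark can be made concrete with a minimal sketch.  It assumes popularity is a simple head count of people using a skill, and every number below is made up purely for illustration; none of it comes from indeed.com.  The function names are hypothetical.

```python
from itertools import accumulate


def net_flow(hired, retrained, departed):
    """Per-period change in the number of people using a skill."""
    return [h + r - d for h, r, d in zip(hired, retrained, departed)]


def stock_over_time(initial_stock, flows):
    """Running total of people using the skill, starting from initial_stock."""
    return list(accumulate(flows, initial=initial_stock))


# Hypothetical, made-up numbers for illustration only.
hired = [5, 8, 12]      # people hired for the skill in each period
retrained = [1, 2, 3]   # people who picked up the skill through retraining
departed = [2, 3, 4]    # people who retired or left

flows = net_flow(hired, retrained, departed)
stock = stock_over_time(100, flows)

print(flows)  # change per period (the "derivative")
print(stock)  # popularity itself (the accumulated total)
```

Job offers only capture the hiring term of that flow, which is why the trend data is at best close to the derivative rather than equal to it.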

Enough talk, let’s get data.  I searched for skills used in conjunction with “machine learning” and “data science”, where the skills are the prominent programming languages Java, C, C++, and JavaScript.  I also included Python and R, which we know are popular for machine learning and data science, as well as Scala, given its link to Spark, and Julia, which some think is the next big thing here.  Running this query, we get the data we are looking for:

When we focus on machine learning, we get similar data:

What can we derive from this data?

First of all, we see that one size does not fit all.  A number of languages are fairly popular in this context.

Second, there is a sharp increase in popularity for all of them, reflecting the increased interest in machine learning and data science over the last few years.

Third, Python is the clear leader, followed by Java, then R, then C++.  Python’s lead over Java is increasing, while the lead of Java over R is decreasing.  I must admit I was surprised to see Java in second place; I was expecting R instead.

Fourth, Scala’s growth is impressive.  It was almost nonexistent 3 years ago, and is now in the same ballpark as more established languages.  This is easier to spot when we switch to the relative view of the data on indeed.com:
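To see why a relative view makes a newcomer’s growth easier to spot, here is a minimal sketch.  It assumes that "relative" means percentage change from the first data point, which is one plausible reading and not a documented description of how indeed.com computes its chart.  The two series are invented for illustration.

```python
def relative_growth(series):
    """Percentage change of each point relative to the first point."""
    base = series[0]
    if base <= 0:
        raise ValueError("the first value must be positive")
    return [(value - base) * 100 / base for value in series]


# Hypothetical, made-up posting counts for illustration only.
newcomer = [10, 20, 50]             # starts from a very small base
established = [1000, 1200, 1500]    # starts from a large base

print(relative_growth(newcomer))
print(relative_growth(established))
```

In absolute terms the established series gains far more postings, yet the newcomer shows the larger percentage growth because it starts from a small base.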

Fifth, Julia’s popularity is nowhere near that of the others, but there is definitely an uptick in recent months.  Will Julia turn into one of the popular languages for machine learning and data science?  The future will tell.

If we ignore Scala and Julia in order to zoom in on the growth of the other languages, then we confirm that Python and R grow faster than general-purpose languages.

It may be that R’s popularity will pass that of Java soon, given the difference in growth rates.
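How soon a faster-growing language overtakes a slower one depends on both the gap and the growth rates.  The sketch below assumes each series keeps a constant compound growth rate per period, which is a strong simplification; the function name and all inputs are hypothetical and not taken from the charts.

```python
import math


def periods_until_overtake(trailing, leading, trailing_growth, leading_growth):
    """Periods until the trailing series catches the leading one.

    Assumes constant compound growth per period (e.g. 0.5 means +50%).
    Returns 0.0 if it is already level or ahead, and None if it never
    catches up under these assumptions.
    """
    if trailing <= 0 or leading <= 0:
        raise ValueError("levels must be positive")
    if trailing_growth <= -1 or leading_growth <= -1:
        raise ValueError("growth rates must be greater than -1")
    if trailing >= leading:
        return 0.0
    if trailing_growth <= leading_growth:
        return None
    # Solve trailing * (1 + g_t)**t == leading * (1 + g_l)**t for t.
    ratio = (1 + trailing_growth) / (1 + leading_growth)
    return math.log(leading / trailing) / math.log(ratio)


# Hypothetical example: the trailing series doubles each period while the
# leading one stays flat, starting from half the leading level.
print(periods_until_overtake(1.0, 2.0, 1.0, 0.0))
```

With real trend data the growth rates themselves change over time, so an estimate like this is only a rough guide.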

When we focus on deep learning with this query, the data is quite different:

There, Python is still the leader, but C++ is now second, then Java, with C in fourth place.  R is only fifth.  There is clearly an emphasis on high performance computing languages here.  Java is growing fast though.  It could reach second place soon, as it has for machine learning in general.  R isn’t going to be near the top anytime soon.  What surprises me is the absence of Lua, although it is used in one of the major deep learning frameworks (Torch).  Julia isn’t present either.

The answer to the original question should now be clear.  Python, Java, and R are the most popular skills when it comes to machine learning and data science jobs.  If you want to focus on deep learning rather than machine learning in general, then C++, and to a lesser extent C, are also worth considering.  Remember, however, that this is only one way of looking at the problem.  You may get a different answer if you are looking for a job in academia, or if you just want to have fun learning about machine learning and data science in your spare time.

What about my personal answer?  I answered earlier this year in this blog.  Besides having support from many top machine learning frameworks, Python is a good fit for me because I have a computer science background.  I would also feel comfortable with C++ for developing new algorithms, given I’ve programmed in that language for most of my professional life.  But this is me, and people with different backgrounds may feel better with another language.  A statistician with limited programming skills will certainly prefer R.  A strong Java developer can stay with his favorite language, as there are significant open source projects with a Java API.  And a case can certainly be made for any of the languages on these charts.

Therefore, my advice would be to read other blogs discussing the same question before investing significant time in learning a language.

Update on December 23, 2016.  This post is discussed on HackerNews.

Original post. Reposted with permission.

