Top Algorithms Used by Data Scientists


janyiyi posted on 2016-9-14 07:54:43


Here are the results, based on 844 voters.

The top 10 algorithms and their share of voters are:

Fig. 1: Top 10 algorithms used by Data Scientists.
See full table of all algorithms at the end of the post.

The average respondent used 8.1 algorithms, a big increase over a similar poll in 2011.

Comparing with the 2011 poll, "Algorithms for data analysis / data mining", we note that the top methods are still Regression, Clustering, Decision Trees/Rules, and Visualization. The biggest relative increases, measured by (pct2016 / pct2011 - 1), are for

Boosting, up 40% to 32.8% share in 2016 from 23.5% share in 2011
Text Mining, up 30% to 35.9% from 27.7%
Visualization, up 27% to 48.7% from 38.3%
Time series/Sequence analysis, up 25% to 37.0% from 29.6%
Anomaly/Deviation detection, up 19% to 19.5% from 16.4%
Ensemble methods, up 19% to 33.6% from 28.3%
SVM, up 18% to 33.6% from 28.6%
Regression, up 16% to 67.1% from 57.9%
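
As a minimal sketch of the relative-increase measure (pct2016 / pct2011 - 1), the snippet below recomputes a few of the figures from the shares listed above. The function and variable names are illustrative assumptions, not part of the original poll analysis. Because the published shares are rounded to one decimal, recomputed values may differ by about a point for some entries.

```python
def relative_change(pct_new: float, pct_old: float) -> float:
    """Relative change in share: pct_new / pct_old - 1."""
    return pct_new / pct_old - 1.0


# (2016 share, 2011 share) in percent, copied from the list above.
shares = {
    "Boosting": (32.8, 23.5),
    "Text Mining": (35.9, 27.7),
    "Visualization": (48.7, 38.3),
    "Time series/Sequence analysis": (37.0, 29.6),
    "Regression": (67.1, 57.9),
}

# Relative change per algorithm, as a fraction (0.40 means +40%).
changes = {
    name: relative_change(new, old) for name, (new, old) in shares.items()
}

# Print from the largest relative increase to the smallest.
for name, change in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {change:+.1%}")
```

The same function gives negative values for declining methods, since the ratio pct2016 / pct2011 is then below 1.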

Most popular among new options added in 2016 are

K-nearest neighbors, 46% share
PCA, 43%
Random Forests, 38%
Optimization, 24%
Neural networks - Deep Learning, 19%
Singular Value Decomposition, 16%

The biggest declines are for

Association rules, down 47% to 15.3% from 28.6%
Uplift modeling, down 36% to 3.1% from 4.8% (that is a surprise, given strong results published)
Factor Analysis, down 24% to 14.2% from 18.6%
Survival Analysis, down 15% to 7.9% from 9.3%

The following table shows usage of different algorithm types (Supervised, Unsupervised, Meta, and Other) by employment type. We excluded the NA (4.5%) and Other (3%) employment types.

Table 1: Algorithm usage by Employment Type

Employment Type	% Voters	Avg Num Algorithms Used	% Used Supervised	% Used Unsupervised	% Used Meta	% Used Other Methods
Industry	59%	8.4	94%	81%	55%	83%
Government/Non-profit	4.1%	9.5	91%	89%	49%	89%
Student	16%	8.1	94%	76%	47%	77%
Academia	12%	7.2	95%	81%	44%	77%
All		8.3	94%	82%	48%	81%

We note that almost everyone uses supervised learning algorithms.
Government and Industry Data Scientists used a wider variety of algorithms than students or academic researchers,
and Industry Data Scientists were more likely to use Meta-algorithms.

Next, we analyzed the usage of top 10 algorithms + Deep Learning by employment type.

Table 2: Top 10 Algorithms + Deep Learning usage by Employment Type

Algorithm	Industry	Government/Non-profit	Academia	Student	All
Regression	71%	63%	51%	64%	67%
Clustering	58%	63%	51%	58%	57%
Decision Trees/Rules	59%	63%	38%	57%	55%
Visualization	55%	71%	28%	47%	49%
K-NN	46%	54%	48%	47%	46%
PCA	43%	57%	48%	40%	43%
Statistics	47%	49%	37%	36%	43%
Random Forests	40%	40%	29%	36%	38%
Time series	42%	54%	26%	24%	37%
Text Mining	36%	40%	33%	38%	36%
Deep Learning	18%	9%	24%	19%	19%

To make the differences easier to see, we compute the algorithm bias for a particular employment type relative to average algorithm usage as Bias(Alg, Type) = Usage(Alg, Type) / Usage(Alg, All) - 1.

Fig. 2: Algorithm usage bias by Employment.
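
The bias formula can be sketched in a few lines, using two rows of Table 2 as input. This is only an illustration: the names `usage_bias`, `table2`, and `bias` are assumptions, and the inputs are the rounded percentages from the table, so the results will not exactly match values computed from the raw poll data.

```python
def usage_bias(usage_type: float, usage_all: float) -> float:
    """Bias(Alg, Type) = Usage(Alg, Type) / Usage(Alg, All) - 1."""
    return usage_type / usage_all - 1.0


# Usage in percent, copied from two rows of Table 2.
table2 = {
    "Visualization": {
        "Industry": 55, "Government/Non-profit": 71,
        "Academia": 28, "Student": 47, "All": 49,
    },
    "Deep Learning": {
        "Industry": 18, "Government/Non-profit": 9,
        "Academia": 24, "Student": 19, "All": 19,
    },
}

# Bias of each employment type relative to the "All" column.
bias = {
    alg: {
        emp_type: usage_bias(usage, row["All"])
        for emp_type, usage in row.items()
        if emp_type != "All"
    }
    for alg, row in table2.items()
}

for alg, by_type in bias.items():
    for emp_type, value in by_type.items():
        print(f"{alg:15s} {emp_type:22s} {value:+.0%}")
```

A positive bias means the employment type uses the algorithm more than the average respondent, a negative bias means less, and zero means usage equals the overall rate.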
