Original poster: oliyiyi

Data Science Primer: Basic Concepts for Beginners

Original post
oliyiyi, posted on 2017-8-12 08:07:49

By Matthew Mayo, KDnuggets.

What exactly is data science?

Data science is a multifaceted discipline that encompasses machine learning and other analytic processes, statistics and related branches of mathematics, and increasingly borrows from high-performance scientific computing, all in order to extract insight from data and use this newfound information to tell stories.

New to this multifaceted discipline? Not sure where to begin? This is a collection of short, not-too-technical overviews of particular topics of interest to data science newcomers, from basics like supervised vs. unsupervised learning to the importance of power law distributions and cognitive biases.

Data Science Basics: 3 Insights for Beginners

For data science beginners, three elementary issues are given an overview treatment: supervised vs. unsupervised learning, decision tree pruning, and training vs. testing datasets.
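To make the training vs. testing idea concrete, here is a minimal sketch in plain Python. It is not taken from the linked article: the toy data, the `train_test_split` helper, and the one-threshold "classifier" are all hypothetical, chosen only to show that a model is fitted on one part of the labeled data and scored on a part it has never seen.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled data (supervised learning): (feature, label) pairs.
data = [(x, "high" if x >= 50 else "low") for x in range(100)]
train, test = train_test_split(data)

# "Train" a trivial classifier using the training rows only: put the
# threshold halfway between the largest "low" and the smallest "high" seen.
largest_low = max(x for x, label in train if label == "low")
smallest_high = min(x for x, label in train if label == "high")
threshold = (largest_low + smallest_high) / 2

def predict(x):
    """Classify a feature value with the learned threshold."""
    return "high" if x >= threshold else "low"

# Evaluate on the held-out test rows, which played no part in training.
accuracy = sum(predict(x) == label for x, label in test) / len(test)
print(len(train), len(test), accuracy)
```

Scoring on the held-out rows rather than on the training rows is what gives an honest estimate of how the model might behave on new data.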

Data Science Basics: Data Mining vs. Statistics

When I was first exposed to data mining and machine learning, I'll admit it: I thought it was magic. Make significant predictions with accuracy? Sorcery! Curiosity, however, quickly leads you to discover that everything is above board, and sound scientific and statistical methods bear the responsibility.

But this ends up leading to more questions in the short term. Machine learning. Data mining. Statistics. Data science. The concepts and terminology are overlapping and seemingly repetitive at times. While there are numerous attempts at clarifying much of this (permanently unsettled) uncertainty, this post will tackle the relationship between data mining and statistics.

Data Science Basics: What Types of Patterns Can Be Mined From Data?

Data mining functionality can be broken down into 4 main "problems," namely: classification and regression (together: predictive analysis); cluster analysis; frequent pattern mining; and outlier analysis. There are all sorts of other ways you could break down data mining functionality as well, I suppose, e.g. focusing on algorithms, starting with supervised versus unsupervised learning, etc. However, this is a reasonable and accepted approach to identifying what data mining is able to accomplish, and as such these problems are each covered below, with a focus on what can be solved with each "problem."
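As a rough illustration of two of the four problems named above, the sketch below counts frequently co-occurring item pairs (frequent pattern mining) and flags values far from the mean (outlier analysis). The shopping baskets, the sensor-style readings, the function names, and the cutoff values are all made up for this example; real projects would normally rely on dedicated libraries and more robust methods.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev

def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

def z_score_outliers(values, cutoff=2.0):
    """Return values lying more than `cutoff` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > cutoff]

# Hypothetical shopping baskets for frequent pattern mining.
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk"],
]
pairs = frequent_pairs(baskets, min_support=3)

# Hypothetical readings for outlier analysis.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = z_score_outliers(readings)

print(pairs)
print(outliers)
```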

Data Science Basics: An Introduction to Ensemble Learners

This post will provide an overview of bagging, boosting, and stacking, arguably the most used and well-known of the basic ensemble methods. They are not, however, the only options. Random Forests is another example of an ensemble learner, which uses numerous decision trees in a single predictive model, and which is often overlooked and treated as a "regular" algorithm. There are other approaches to selecting effective algorithms as well, treated below.
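Bagging (bootstrap aggregating) can be sketched in a few lines of plain Python. The example below is an assumption-laden toy, not the article's code: the base learner is a hypothetical one-feature decision stump, the data set is invented, and the ensemble simply takes a majority vote over stumps trained on bootstrap samples.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a 1-D decision stump: choose the threshold with the fewest errors.

    The stump predicts 1 when x >= threshold and 0 otherwise.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in rows:
        errors = sum((x >= threshold) != bool(y) for x, y in rows)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagging_fit(rows, n_models=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return models

def bagging_predict(models, x):
    """Aggregate the stumps' predictions by majority vote."""
    votes = Counter(int(x >= threshold) for threshold in models)
    return votes.most_common(1)[0][0]

# Hypothetical data: label is 1 when x >= 5, with one deliberately noisy row.
rows = [(x, int(x >= 5)) for x in range(10)]
rows[3] = (3, 1)

models = bagging_fit(rows)
print(bagging_predict(models, 0), bagging_predict(models, 9))
```

Because each stump sees a slightly different resample, a single noisy row influences only some of the models, and the vote tends to smooth out its effect. Boosting and stacking combine models differently and are not shown here.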

Data Science Basics: Power Laws and Distributions

Also known as scaling laws, power laws essentially imply that a small number of occurrences of some phenomenon are frequent, or very common, while a large number of occurrences of the same phenomenon are infrequent, or very rare; the exact relationship between these relative frequencies differs between power law distributions. Some of the wide array of naturally occurring and man-made phenomena that power laws are able to describe include income disparities, word frequencies of a given language, city sizes, website sizes, magnitudes of earthquakes, book sales rankings, and surname popularity.
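A small numeric sketch can show what "a few items are very common, most are rare" looks like. It assumes a Zipf-style power law in which the item of rank k has weight proportional to k raised to the power of minus alpha; the function name, the 1,000-item population, and alpha = 1 are illustrative choices, not values from the article.

```python
def zipf_weights(n_items, alpha=1.0):
    """Return normalized weights proportional to rank ** -alpha for ranks 1..n_items."""
    raw = [rank ** -alpha for rank in range(1, n_items + 1)]
    total = sum(raw)
    return [w / total for w in raw]

weights = zipf_weights(1000, alpha=1.0)

# Share of the total held by the top 10% of ranks (the first 100 items).
top_share = sum(weights[:100])

# With alpha = 1, the rank-1 item is ten times as common as the rank-10 item.
ratio_1_to_10 = weights[0] / weights[9]

print(round(top_share, 3), round(ratio_1_to_10, 3))
```

Under these assumptions the top tenth of the ranks accounts for well over half of the total weight. Changing `alpha` changes how steep the drop-off is, which is one way the relationship differs between power law distributions.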

4 Cognitive Bias Key Points Data Scientists Need to Know

A few specific examples of how cognitive biases can (and do) interfere in the real world include:

  • Voters and politicians who don't understand science, but think they do, doubt climate change because it still snows in the winter (Dunning–Kruger effect)
  • Confirmation bias very recently prevented pollsters from believing any data showing that Donald Trump could win the US Presidential election




Keywords: Data Science, beginners, beginner, Concepts, concept


2
nkunku, posted on 2017-8-12 08:15:29
Data Science Primer_Basic Concepts for Beginners_Matthew Mayo

3
军旗飞扬, posted on 2017-8-12 08:24:46
Thanks to the original poster for sharing!

4
hjtoh, posted on 2017-8-12 08:47:30 (from mobile)
Quoting oliyiyi, posted on 2017-8-12 08:07:
By Matthew Mayo, KDnuggets.
What exactly is data science? Data science is a multifaceted discipline, ...
An encyclopedia-style explanation.

5
西门高, posted on 2017-8-12 08:55:32
Thanks for sharing.

6
MouJack007, posted on 2017-8-12 11:30:22
Thanks to the original poster for sharing!

8
minixi, posted on 2017-8-12 21:50:19
Thanks for sharing.

9
w-long, posted on 2017-8-16 23:33:55
Data Science Primer_Basic Concepts for Beginners

10
ftlsmn, posted on 2018-6-5 20:56:55
Great book!!!
