Original poster: weapon3000

[Question] Isn't data mining a kind of modeling bias?

I came across this in a CFA book, which treats data mining as an improper practice.

Could anyone help clear up my confusion?
Reply 1
justplay16, posted 2013-6-30 21:04:58, from mobile
I just got to this part too. Has the original poster figured it out?


Reply 2
leileiwanqyc, posted 2017-4-22 03:42:29
I ran into this term while reading a paper and looked it up. Here is my modest take; corrections are welcome.

Data-mining bias is a bias that arises from the data-mining method itself.

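"Data mining" is a way of predicting future trends by searching historical data for salient patterns. In essence, regularities found this way can only reflect relationships that held in the historical data (ex post), and whether the same regularities can be used to forecast the future needs careful consideration. Data-mining bias refers to the error that arises when an analysis leans too heavily on the mining process and treats data features that may be mere coincidence as economic regularities that will recur.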
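To make the "mere coincidence" point concrete, here is a minimal Python sketch, not taken from the CFA text or the Investopedia page. It assumes a hypothetical set of 200 trading rules whose returns are pure noise, picks the one with the best historical average, and then draws fresh returns for it. The function names, the rule count, and the normal-return assumption are all mine, for illustration only.

```python
import random
import statistics


def best_of_many_rules(n_rules=200, n_periods=120, seed=0):
    """Select the best-looking of many no-edge rules in-sample,
    then evaluate the selected rule on fresh data."""
    rng = random.Random(seed)
    # Assumption: every hypothetical rule earns i.i.d. N(0, 1) returns,
    # so none of them has any real predictive power.
    in_sample = [
        [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
        for _ in range(n_rules)
    ]
    means = [statistics.fmean(r) for r in in_sample]
    best = max(range(n_rules), key=means.__getitem__)
    # Fresh returns for the chosen rule, from the same no-edge process.
    out_sample = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
    return means[best], statistics.fmean(out_sample)


def familywise_error(alpha, n_tests):
    """Probability that at least one of n independent tests at level
    alpha rejects even though every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    in_mean, out_mean = best_of_many_rules()
    print(f"best rule, in-sample mean:  {in_mean:.3f}")
    print(f"same rule, out-of-sample:   {out_mean:.3f}")
    print(f"P(at least one false hit in 100 tests at 5%): "
          f"{familywise_error(0.05, 100):.3f}")
```

The selected rule looks good in-sample only because it was the maximum of many noisy averages; its fresh-data average is just another draw centered on zero. The second function gives the standard back-of-envelope number for independent tests: the more patterns one searches through, the more likely at least one looks "significant" by chance.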

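Take the "January effect" as an example. By studying 50 to 70 years of past stock market data, some researchers found that stock returns in January were higher than in other months, and so predicted that the pattern would persist. But because people have become aware of the pattern and act on it, if the market is efficient their trading weakens the January effect, which is how data-mining bias arises.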
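The month-by-month comparison behind the January effect can be sketched in a few lines of Python. This is only an illustration on synthetic data: the 1% mean and 4% standard deviation of monthly returns are my assumptions, and the series is built with no calendar effect at all, so whichever month comes out on top does so by chance.

```python
import random
import statistics

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def monthly_means(returns):
    """Average return per calendar month; returns[i] is treated as
    belonging to month i % 12 (index 0 = January)."""
    buckets = [[] for _ in range(12)]
    for i, r in enumerate(returns):
        buckets[i % 12].append(r)
    return [statistics.fmean(b) for b in buckets]


def simulate(years=60, seed=1):
    """Find the best month in a synthetic history, then look at the
    same month in an independent synthetic 'future'."""
    rng = random.Random(seed)
    # Assumed return process: N(1%, 4%) every month, no seasonality.
    history = [rng.gauss(0.01, 0.04) for _ in range(12 * years)]
    future = [rng.gauss(0.01, 0.04) for _ in range(12 * years)]
    past = monthly_means(history)
    best = max(range(12), key=past.__getitem__)
    later = monthly_means(future)
    return MONTHS[best], past[best], later[best]


if __name__ == "__main__":
    name, past_avg, later_avg = simulate()
    print(f"best month in history: {name} ({past_avg:.4f})")
    print(f"same month afterwards: {later_avg:.4f}")
```

Sorting twelve monthly averages always produces a "best" month, even here where none exists by construction. That is separate from the efficient-market argument in the post (a real effect being traded away), but both lead to the same caution about extrapolating a mined pattern.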

Source: http://www.investopedia.com/exam-guide/cfa-level-1/quantitative-methods/sampling-bias.asp
The original text follows:
“Data mining is the practice of searching through historical data in an effort to find significant patterns, with which researchers can build a model and make conclusions on how this population will behave in the future.”
“Data-mining bias refers to the errors that result from relying too heavily on data-mining practices. In other words, while some patterns discovered in data mining are potentially useful, many others might just be coincidental and are not likely to be repeated in the future - particularly in an "efficient" market.”
“For example, the so-called January effect, where stock market returns tend to be stronger in the month of January, is a product of data mining: monthly returns on indexes going back 50 to 70 years were sorted and compared against one another, and the patterns for the month of January were noted.”
“For example, we may not be able to continue to profit from the January effect going forward, given that this phenomenon is so widely recognized. As a result, stocks are bid for higher in November and December by market participants anticipating the January effect, so that by the start of January, the effect is priced into stocks and one can no longer take advantage of the model. Intergenerational data mining refers to the continued use of information already put forth in prior financial research as a guide for testing the same patterns and overstating the same conclusions.”


Reply 3
thinkershare (employment verified), posted 2019-5-12 17:03:06


GMT+8, 2024-4-27 08:23