Original poster: myccc

[Learning & Sharing] More Ways to Get a Scoring Model Wrong

myccc posted on 2013-11-18 11:23:41

  • Typo
  • Refuse to use central tendency to patch missing values. Instead, assign highest response rate because WOE says so
  • Marketing people tell me to force the variable into the model
  • Selection bias
  • Forgot to segment
  • Solely rely on data to segment without consulting the biz side
  • Just delete observations with missing values, OK, without studying geometrical boundaries
  • Using oversampling, but refuse to weight it back. That boosts lift, right? Let us do 50-50
  • Insist random sampling is sufficient, while stratified sampling is critical
  • Binning too much, or too little
  • Selecting variables without repeated sampling
  • Forgot to exclude numeric customer id from the candidate variables. And it pops... Well, both Unica and Kxen accepted it, so I see no problem
  • When the same variable is sourced from different vendors, did not look up the scales behind the same name. Just combine them
  • Well, SAS Enterprise Miner gave me this model yesterday
  • The binary variable is statistically significant, but there are only 27 event=1 out of ~1mm, since only 27 made any purchase.
  • Well, I only have 250 events=1. But I think I can use exact logistic to make it up, all right? I got a PhD in Statistics, trust me, my professor is OK with it. I just called her.
  • Build two-stage model without Heckman adjustment
  • Use global mean over the WHOLE customer base to replace missing values on a much smaller universe/subset. So 22% of a high-net-worth client group end up with a net worth of only 225K
  • I just spent the past two days boosting R-square. Now it is 92. Great.
  • Forgot to set descending option in proc logistic in SAS
  • I think we should hold out missing values when conducting EDA.
  • No proper separation of treatment and control
  • Treat business entities and individuals as equal and mix them in the same universe
  • Running clustering without validation
  • Running discriminant model without validation. So correct classification rate on development is 89% and that over validation is ... 35%. (No wonder you finished it in two hours and came here to ask me for a raise)
  • Disregard link function in multinomial models
  • I think this is a better variable: xnew=y*y*y*. It is the top variable dominating others.
  • Use standardized coefficient to calculate relative importance, because many people are doing it and marketing loves it.
  • I tried Google Analytics last Friday. It recommends this variable: click stream density over Thanksgiving weekend, on my web portal, on this item
  • Let us treat this matrix as unary so we can apply Euclidean, since that runs faster and has a lot of optimal properties. It makes our life easier
  • Let us use score from that model to boost this model and use score from this model to boost it back. Is that what they call neural nets, Jia?
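
On the oversampling item in the list above: one common way to "weight it back" after fitting on a 50-50 sample is a prior correction applied to the scored probabilities. The sketch below is only an illustration, not the poster's method. The function name and the 2% population rate are made up for the example; the adjustment rescales the event and non-event odds by the ratio of population rate to sample rate.

```python
def correct_oversampled_score(p_sample, population_rate, sample_rate):
    """Map a probability scored on an oversampled set back to the population scale.

    p_sample        -- probability from a model fitted on the oversampled data
    population_rate -- event rate in the real population (assumed known)
    sample_rate     -- event rate in the oversampled modeling data (e.g. 0.5)
    """
    if not (0.0 < population_rate < 1.0 and 0.0 < sample_rate < 1.0):
        raise ValueError("rates must lie strictly between 0 and 1")
    if not 0.0 <= p_sample <= 1.0:
        raise ValueError("p_sample must be a probability")

    # Each class is re-weighted by how much the sampling distorted its share.
    event_weight = population_rate / sample_rate
    nonevent_weight = (1.0 - population_rate) / (1.0 - sample_rate)

    numerator = p_sample * event_weight
    return numerator / (numerator + (1.0 - p_sample) * nonevent_weight)


if __name__ == "__main__":
    # Hypothetical numbers: 2% real response rate, model built on a 50-50 sample.
    for p in (0.2, 0.5, 0.8):
        corrected = correct_oversampled_score(p, population_rate=0.02, sample_rate=0.5)
        print(f"sample-scale {p:.2f} -> population-scale {corrected:.4f}")
```

The mapping is monotone, so the rank order of customers does not change. What changes is the probability scale, which is why a model left on the 50-50 scale overstates expected response.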
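
On the global-mean item in the list above: a small sketch of filling a missing value from the subset being modeled rather than from the whole customer base. Everything here is hypothetical (the field names, the segment labels, and the toy figures), and a plain mean is used only to keep the contrast easy to see; a median or a model-based fill may suit skewed fields such as net worth better.

```python
from statistics import mean


def impute_with_segment_mean(records, segment, field):
    """Fill missing `field` values in one segment using that segment's own mean."""
    observed = [
        r[field]
        for r in records
        if r["segment"] == segment and r[field] is not None
    ]
    if not observed:
        raise ValueError(f"no observed {field!r} values in segment {segment!r}")
    fill_value = mean(observed)

    # Return new dicts so the input records are left untouched.
    return [
        {**r, field: fill_value}
        if r["segment"] == segment and r[field] is None
        else r
        for r in records
    ]


# Toy data, made up for illustration only.
customers = [
    {"id": 1, "segment": "mass", "net_worth": 100_000},
    {"id": 2, "segment": "mass", "net_worth": 200_000},
    {"id": 3, "segment": "mass", "net_worth": 300_000},
    {"id": 4, "segment": "hnw", "net_worth": 4_000_000},
    {"id": 5, "segment": "hnw", "net_worth": 6_000_000},
    {"id": 6, "segment": "hnw", "net_worth": None},
]

# Mean over the whole base, which is what the pitfall would plug in.
global_mean = mean(r["net_worth"] for r in customers if r["net_worth"] is not None)
filled = impute_with_segment_mean(customers, segment="hnw", field="net_worth")

if __name__ == "__main__":
    print("global mean:", global_mean)
    print("segment fill:", filled[5]["net_worth"])
```

In this toy set the whole-base mean sits well below every observed value in the high-net-worth segment, which is the same distortion the bullet describes.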


