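- woe_iv = function(bin, events, nonevents) {
- total_events = sum(events)  # total events across all bins
- total_nonevents = sum(nonevents)  # total non-events across all bins
- event_dist = events / total_events  # share of all events falling in each bin
- nonevent_dist = nonevents / total_nonevents  # share of all non-events falling in each bin
- woe = log(nonevent_dist / event_dist)  # weight of evidence per bin (natural log)
- iv = sum((nonevent_dist - event_dist) * woe)  # information value, summed over bins
- list(woe = data.frame(bin=bin, woe=woe), iv = iv)  # per-bin WOE table plus the scalar IV
- }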
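The function above follows what appears to be the usual definition of weight of evidence (WOE) and information value (IV), using the sign convention found in the code (non-events in the numerator; some references use the opposite sign). As a sketch, with E_i and N_i as notation introduced here for the event and non-event counts in bin i (these symbols are not in the original):

```latex
\mathrm{WOE}_i = \ln\!\left(\frac{N_i \big/ \sum_j N_j}{E_i \big/ \sum_j E_j}\right),
\qquad
\mathrm{IV} = \sum_i \left(\frac{N_i}{\sum_j N_j} - \frac{E_i}{\sum_j E_j}\right)\mathrm{WOE}_i
```

Each term of the IV sum is non-negative, because the difference and the logarithm always share the same sign. A bin with zero events or zero non-events makes the logarithm infinite, so such bins would likely need to be merged or smoothed before calling the function.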
Usage example:
- bin = c("21-30", "30-36", "36-48", "48-60")
- events = c(206, 357, 776, 183)
- nonevents = c(4615, 9909, 32150, 12605)
- woe_iv(bin, events, nonevents)
Result:
- $woe
- bin woe
- 1 21-30 -0.55303887
- 2 30-36 -0.33876692
- 3 36-48 0.06178536
- 4 48-60 0.57013283
- $iv
- [1] 0.1093199
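As a rough hand check (values rounded, so trailing digits may differ slightly), the 21-30 bin can be recomputed from the counts passed in the call, where the events total 1522 and the non-events total 59279:

```latex
\ln\!\left(\frac{4615/59279}{206/1522}\right) \approx -0.553,
\qquad
\left(\frac{4615}{59279} - \frac{206}{1522}\right) \times (-0.553) \approx 0.0318
```

The remaining three bins contribute roughly 0.0228, 0.0020, and 0.0527 by the same calculation, and the four terms sum to about 0.1093, in line with the reported `$iv`.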