人大经济论坛 (Renmin University Economics Forum)
Tag: Builds
Related threads

| Thread | Board | Author | Replies / Views | Last post |
|---|---|---|---|---|
| [Big Data Series] Big Data: A Primer | Data Analysis and Data Mining | wwqqer, 2015-7-19 | 116 / 11783 | 三江鸿, 2022-5-21 13:10:06 |
| Sams Teach Yourself TCP/IP in 24 Hours, 4th Edition | WinBUGS and Other Software | Lisrelchen, 2015-3-28 | 1 / 1048 | lujun0337, 2022-1-15 23:27:29 |
| Bayesian Inference in the Social Sciences by Ivan Jeliazkov [read permission 16] | Management Science and Engineering | tigerwolf, 2014-12-5 | 19 / 484 | zhaoyuanying, 2020-2-3 21:38:02 |
| [Classic Textbook Series] Operations Strategy (4th Edition) | Finance (Theory) | wwqqer, 2015-10-19 | 104 / 7176 | venom1993, 2018-12-31 09:21:33 |
| Peer Pressure: Social Interaction and the Disposition Effect | Solved Requests | 马甲甲, 2016-7-27 | 2 / 948 | giresse, 2016-7-27 12:44:27 |
| [International Political Economy Series] ISIS: Empire of Fear: Inside the Islamic State | World Economy and International Trade | wwqqer, 2015-12-29 | 28 / 2751 | neuroexplorer, 2016-1-20 10:00:03 |
| Sams Teach Yourself HTML and CSS in 24 Hours, 8th Edition | WinBUGS and Other Software | Nicolle, 2015-5-19 | 3 / 995 | Nicolle, 2015-12-1 07:26:44 |
| The Dead End (烈日灼心) [720, Baidu Cloud] full Xunlei download | Off-Topic Chat | aide的, 2015-10-31 | 0 / 4412 | aide的, 2015-10-31 20:52:30 |
| [New Book 2015] Applied Number Theory | Management Science and Engineering | kychan, 2015-10-7 | 31 / 2688 | djvu123, 2015-10-17 02:55:47 |
| [Recommended][Download] Well-known textbook Contemporary Strategy Analysis, 7th edition, high-resolution color PDF | Innovation and Strategic Management | bundy, 2011-10-12 | 62 / 20641 | LeahSUN, 2015-10-13 23:51:51 |
| McGraw-Hill Economics [English, 18th edition] McGraw-Hill Economics 18th Edition | Copyright Review (not public) | LRZU, 2012-10-5 | 1 / 1688 | wwqqer, 2014-3-26 02:09:39 |
| The Rise and Fall of Keynesian Economics [English edition] the_Fall_and_Rise_of_Keynesian_Economics | World Economy and International Trade | LRZU, 2012-10-5 | 5 / 2430 | 鄢亚晴, 2012-12-21 14:29:29 |
| Three Contributions to the Theory of Sex | Off-Topic Chat | research, 2011-10-30 | 0 / 1039 | research, 2011-10-30 18:49:09 |
| Introductory Econometrics for Finance, 2nd | Finance (Theory) | musicooler, 2011-10-27 | 3 / 1608 | ibanker, 2011-10-27 11:37:02 |
| The Little Book That Builds Wealth | Finance (Theory) | wangzixiaozi, 2009-2-13 | 0 / 2595 | wangzixiaozi, 2009-2-13 10:13:00 |
| [Download] Pat Dorsey - The Little Book That Builds Wealth | Finance (Theory) | ebdl, 2009-1-27 | 0 / 2784 | ebdl, 2009-1-27 14:32:00 |
Related blog entries
Share
Reposting some code; will test it when I have time
唐伯小猫
2014-4-26 05:19
#Row/column subscripts such as ds[train, vars] were missing from the post;
#the ones used below are reconstructed and should be read as assumptions.
library(rpart)        #Popular decision tree algorithm
library(rattle)       #Fancy tree plot, nice graphical interface
library(rpart.plot)   #Enhanced tree plots
library(RColorBrewer) #Color selection for fancy tree plot
library(party)        #Alternative decision tree algorithm
library(partykit)     #Convert rpart object to BinaryTree
library(RWeka)        #Weka decision tree J48
library(evtree)       #Evolutionary algorithm, searches for a globally optimal tree
library(randomForest)
library(doParallel)
library(CHAID)        #Chi-squared automatic interaction detection tree
library(tree)
library(caret)
ls("package:party")   #list functions in package party

#Data Prep
data(weather)         #weather data set from rattle
dsname <- "weather"
target <- "RainTomorrow"
risk <- "RISK_MM"
ds <- get(dsname)
vars <- colnames(ds)
(ignore <- vars[c(1, 2, which(vars == risk))]) #assumed: Date, Location and the risk column
vars <- setdiff(vars, ignore)
(inputs <- setdiff(vars, target))
(nobs <- nrow(ds))
dim(ds[vars])
(form <- formula(paste(target, "~ .")))
set.seed(1426)
length(train <- sample(nobs, 0.7*nobs))
length(test <- setdiff(seq_len(nobs), train))
dim(ds)
head(ds)
tail(ds)
summary(ds)
str(ds)

#-------------------------------------------------------------------
# Basic Scatterplot Matrix
#pairs() needs a formula object, not a character string
pairs(as.formula(paste("~", paste(vars, collapse="+"))), data=ds,
      main="Simple Scatterplot Matrix")
pairs(~MinTemp+MaxTemp+Rainfall+Evaporation, data=ds,
      main="Simple Scatterplot Matrix")
histogram(ds$MinTemp, breaks=20, col="blue") #lattice, attached by caret

#-------------------------------------------------------------------
#Rpart Tree
library(rpart)
model <- rpart(formula=form, data=ds[train, vars])
model
summary(model)
printcp(model) #printcp for rpart objects
plotcp(model)
plot(model)
text(model)
fancyRpartPlot(model)
prp(model)
prp(model, type=2, extra=104, nn=TRUE, fallen.leaves=TRUE, faclen=0,
    varlen=0, shadow.col="grey", branch.lty=3)
pred <- predict(model, newdata=ds[test, vars], type="class") #na.action = na.pass
pred.prob <- predict(model, newdata=ds[test, vars], type="prob")

#Check for NA in the data and remove incomplete rows.
#If NAs remain, rpart will use surrogate splits.
table(is.na(ds))
ds.complete <- ds[complete.cases(ds), ] #assumed: keep complete rows only
(nobs <- nrow(ds.complete))
set.seed(1426)
length(train.complete <- sample(nobs, 0.7*nobs))
length(test.complete <- setdiff(seq_len(nobs), train.complete))

#Prune tree
model$cptable[which.min(model$cptable[, "xerror"]), "CP"] #want the first minimum
model <- rpart(formula=form, data=ds[train, vars], cp=0)
printcp(model)
pruned <- prune(model, cp=.01) #renamed so the result does not shadow prune()
printcp(pruned)

#-------------------------------------------------------------------
#Party Tree
#install.packages("partykit", repos="http://R-Forge.R-project.org") #run once if missing
library(partykit)
class(model)
plot(as.party(model))

#-------------------------------------------------------------------
#tree
model <- tree(formula=form, data=ds[train, vars])
summary(model)

#-------------------------------------------------------------------
#Conditional Inference Tree
#party:: prefix because partykit masks ctree(); treeresponse() below needs a party model
model <- party::ctree(formula=form, data=ds[train, vars])
model
plot(model)
pred <- predict(model, newdata=ds[test, vars])
pred.prob <- predict(model, newdata=ds[test, vars], type="prob")

#Try this for class predictions:
library(caret)
confusionMatrix(pred, ds[test, target])
mc <- table(pred, ds[test, target])
err <- 1.0 - sum(diag(mc)) / sum(mc) #misclassification rate on the test rows

#For class probabilities:
probs <- treeresponse(model, newdata=ds[test, vars])
probs <- do.call(rbind, probs) #one row per observation, one column per class
summary(probs)

#For a roc curve:
library(ROCR)
roc <- prediction(probs[, 2], ds[test, target]) #assumed: column 2 is the positive class
plot(performance(roc, measure="tpr", x.measure="fpr"), colorize=TRUE)

#For a lift curve:
plot(performance(roc, measure="lift", x.measure="rpp"), colorize=TRUE)

#Sensitivity/specificity curve and precision/recall curve:
#sensitivity (i.e. True Positives/Actual Positives) and
#specificity (i.e. True Negatives/Actual Negatives)
plot(performance(roc, measure="sens", x.measure="spec"), colorize=TRUE)
plot(performance(roc, measure="prec", x.measure="rec"), colorize=TRUE)

#Here's an example of using 10-fold cross-validation to evaluate your model
library(doParallel)
registerDoParallel(cores=2)
#trControl added: train() resamples with the bootstrap unless told otherwise
model <- train(ds[train, inputs], ds[train, target], method="rpart", tuneLength=10,
               trControl=trainControl(method="cv", number=10))

#cross validation
#example
n <- nrow(ds) #nobs
K <- 10 #for 10 validation cross sections
taille <- n %/% K #fold size
set.seed(5)
alea <- runif(n)
rang <- rank(alea)
bloc <- pmin((rang-1) %/% taille + 1, K) #leftover rows join the last fold
bloc <- as.factor(bloc)
print(summary(bloc))
all.err <- numeric(0)
for (k in 1:K) {
  model <- rpart(formula=form, data=ds[bloc != k, vars], method="class")
  pred <- predict(model, newdata=ds[bloc == k, vars], type="class")
  mc <- table(ds[bloc == k, target], pred)
  err <- 1.0 - sum(diag(mc)) / sum(mc)
  all.err <- rbind(all.err, err)
}
print(all.err)
(err.cv <- mean(all.err))

#-------------------------------------------------------------------
#Weka Decision Tree
model <- J48(formula=form, data=ds[train, vars])
model
pred <- predict(model, newdata=ds[test, vars])
pred.prob <- predict(model, newdata=ds[test, vars], type="prob")

#-------------------------------------------------------------------
#Evolutionary Trees
target <- "RainTomorrow"
model <- evtree(formula=form, data=ds[train, vars])
model
plot(model)
pred <- predict(model, newdata=ds[test, vars])
pred.prob <- predict(model, newdata=ds[test, vars], type="prob")

#-------------------------------------------------------------------
#Random Forest from library(randomForest)
table(is.na(ds))
table(is.na(ds.complete))
#columns taken from the post's own comment:
#subset(ds, select=-c(Humidity3pm, Humidity9am, Cloud9am, Cloud3pm))
setnum <- c("Humidity3pm", "Humidity9am", "Cloud9am", "Cloud3pm")
ds.complete[, setnum] <- lapply(ds.complete[, setnum], function(x) as.numeric(x))
ds.complete$Humidity3pm <- as.numeric(ds.complete$Humidity3pm)
ds.complete$Humidity9am <- as.numeric(ds.complete$Humidity9am)

begTime <- Sys.time()
set.seed(1426)
model <- randomForest(formula=form, data=ds.complete[train.complete, vars])
runTime <- Sys.time()-begTime
runTime #Time difference of 0.3833725 secs

begTime <- Sys.time()
set.seed(1426)
model <- randomForest(formula=form, data=ds.complete[train.complete, vars],
                      ntree=500, replace=FALSE,
                      sampsize=ceiling(.632*.7*nrow(ds)), na.action=na.omit)
runTime <- Sys.time()-begTime
runTime #Time difference of 0.2392061 secs
model
str(model)
pred <- predict(model, newdata=ds.complete[test.complete, vars])

#Random Forest in parallel
library(doParallel)
ntree <- 500
numCore <- 4
nPerCore <- ntree / numCore #125 trees per core
registerDoParallel(cores=numCore)
begTime <- Sys.time()
set.seed(1426)
rf <- foreach(ntree=rep(nPerCore, numCore), .combine=combine,
              .packages="randomForest") %dopar%
  randomForest(formula=form, data=ds.complete[train.complete, vars],
               ntree=ntree, mtry=6, importance=TRUE,
               na.action=na.roughfix, #can also use na.action = na.omit
               replace=FALSE)
runTime <- Sys.time()-begTime
runTime #Time difference of 0.1990662 secs
importance(model)
importance(rf)
pred <- predict(rf, newdata=ds.complete[test.complete, vars])
confusionMatrix(pred, ds.complete[test.complete, target])

#Random Forest from library(party)
model <- cforest(formula=form, data=ds.complete[train.complete, vars])

#Factor Levels (var.name is a placeholder column name)
id <- which(!(ds$var.name %in% levels(ds$var.name)))
ds$var.name[id] <- NA

#-------------------------------------------------------------------
#Regression Trees - changing target and vars
target <- "RISK_MM"
vars <- c(inputs, target)
form <- formula(paste(target, "~ ."))
(model <- rpart(formula=form, data=ds[train, vars]))
plot(model)
text(model)
prp(model, type=2, extra=101, nn=TRUE, fallen.leaves=TRUE, faclen=0,
    varlen=0, shadow.col="grey", branch.lty=3)
rsq.rpart(model)
library(Metrics)
pred <- predict(model, newdata=ds[test, vars])
err <- rmsle(ds[test, target], pred) #compare probabilities not class

#-------------------------------------------------------------------
#Chaid Tree - new data set
data("BreastCancer", package="mlbench")
sapply(BreastCancer, function(x) is.factor(x))
b_chaid <- chaid(Class ~ Cl.thickness + Cell.size + Cell.shape +
                   Marg.adhesion + Epith.c.size + Bare.nuclei +
                   Bl.cromatin + Normal.nucleoli + Mitoses,
                 data=BreastCancer)
plot(b_chaid)

#-------------------------------------------------------------------
#List functions from a package
ls("package:rpart")

#save plots as pdf
pdf("plot.pdf")
fancyRpartPlot(model)
dev.off()
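The "Prune tree" step above picks the complexity parameter at the first minimum of the cross-validated error (xerror) and then prunes. Below is a minimal, self-contained sketch of that idea which runs without rattle. It assumes the kyphosis data bundled with rpart as a stand-in for the weather data; the names full, cpt, best_cp and pruned, and the minsplit=5 setting, are illustrative and not from the post.

```r
library(rpart) #ships with standard R distributions

#kyphosis comes with rpart; it stands in for the weather data (assumption)
set.seed(1426)
full <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis,
              method="class", cp=0, minsplit=5) #cp=0 grows the largest tree

#cptable columns: CP, nsplit, rel error, xerror, xstd
cpt <- full$cptable
best_cp <- cpt[which.min(cpt[, "xerror"]), "CP"] #which.min returns the first minimum

#cut the tree back to the chosen complexity
pruned <- prune(full, cp=best_cp)

#resubstitution error: score the same rows the tree was fitted on
pred <- predict(pruned, newdata=kyphosis, type="class")
mc <- table(pred, kyphosis$Kyphosis)
err <- 1 - sum(diag(mc)) / sum(mc)

print(best_cp)
print(err)
```

Because the error here is computed on the training rows, it is optimistic. Scoring held-out rows, as the train/test split in the post does, gives a fairer estimate.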