[Q&A] About randomForest

Original post

zhzhw91 posted on 2015-11-25 23:03:13

Bounty: 100 forum coins

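After fitting a model with randomForest(), I called plot() on the result directly and got the figure below. How should each line in the plot be interpreted, and how do I add a legend?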

[Figure: 屏幕快照 2015-11-25 下午10.49.43.png, screenshot of the plot() output]

Best answer: neuroexplorer

Reply #1

neuroexplorer posted on 2015-11-25 23:03:14

From what you showed, there are no test-set results in the plot (that is, no ytest was supplied when training), so every curve is an out-of-bag estimate:

The solid black line is the overall OOB (out-of-bag) error, and each coloured line is the error for one class (i.e. 1 minus that class's recall).

Suppose you use the iris data. Then the red curve is the error rate for the setosa class, the green and blue curves are for versicolor and virginica, and the black curve is the overall out-of-bag error rate.

The code is as follows:
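To make "1 minus this class's recall" concrete, here is a minimal sketch on the iris data. The seed, the ntree value, and the variable names are my own choices for illustration and do not come from the original post. It recomputes each class's OOB recall by hand and prints it next to the class.error column that randomForest reports.

```r
library(randomForest)

set.seed(42)                                   # arbitrary seed, for reproducibility only
fit <- randomForest(Species ~ ., data = iris, ntree = 500)

# fit$predicted holds the out-of-bag prediction for each training row.
# Recall for a class = share of that class's rows that were predicted correctly.
recall <- sapply(levels(iris$Species), function(k) {
  mean(fit$predicted[iris$Species == k] == k, na.rm = TRUE)
})
class_error <- 1 - recall

print(round(class_error, 3))
print(round(fit$confusion[, "class.error"], 3))  # expected to agree with the line above
```

The coloured curves in plot(fit) track this same quantity as trees are added, one curve per class.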

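plot(fit)
# plot() draws the curves in the order OOB, setosa, versicolor, virginica,
# using the default colours 1:4 (black, red, green, blue) and line types 1:4.
legend("topright", c('OOB', 'setosa', 'versicolor', 'virginica'),
   lty=1:4,
   lwd=c(2.5, 2.5, 2.5, 2.5),
   col=1:4)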

[Figure: example.png]
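If you would rather not type the class names by hand, the legend labels can be taken from the fitted object itself. The following is a sketch under my own assumptions (iris data, an arbitrary seed, ntree = 500, and the variable name curves); it relies on the column names of fit$err.rate being in the same order as the plotted curves.

```r
library(randomForest)

set.seed(42)                                   # arbitrary seed
fit <- randomForest(Species ~ ., data = iris, ntree = 500)

# One column per plotted curve: "OOB" first, then one per class level.
curves <- colnames(fit$err.rate)
print(curves)

plot(fit, main = "Error rate by number of trees")
# The curves are drawn with matplot's default colours (1, 2, 3, ...) and
# line types (1, 2, 3, ...), so the legend reuses the same sequences.
legend("topright", legend = curves,
       col = seq_along(curves), lty = seq_along(curves), lwd = 2)
```

Using a keyword position such as "topright" avoids hard-coding coordinates that depend on ntree and on the error range.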

Reply #2

jgchen1966 posted on 2015-11-26 15:12:26

rfMt<-randomForest(x=xda[Idx,],y=yv[Idx],xtest=xda[-Idx,],ytest=yv[-Idx],ntree=1500)
Use str() to display the fitted model object:
str(rfMt)
List of 17
$ call          : language randomForest(x = xda[Idx, ], y = yv[Idx], xtest = xda[-Idx, ], ytest = yv[-Idx], ntree = 1500,    corr.bias = TRUE)
$ type          : chr "regression"
$ predicted    : Named num [1:120] 1.22 2.06 1.58 1.91 1.45 ...
  ..- attr(*, "names")= chr [1:120] "69" "303" "22" "13" ...
$ mse          : num [1:1500] 0.394 0.349 0.322 0.311 0.316 ...
$ rsq          : num [1:1500]

................................
........................
$ y             : num [1:120] 1.79 2.08 1.39 1.61 1.1 ...
$ test          :List of 4
  ..$ predicted: Named num [1:83] 1.78 1.71 1.69 1.74 1.94 ...
  .. ..- attr(*, "names")= chr [1:83] "5" "9" "12" "14" ...
  ..$ mse    : num [1:1500] 0.392 0.292 0.303 0.316 0.305 ...
  ..$ rsq    : num [1:1500] 0.345 0.512 0.493 0.471 0.49 ...
  ..$ proximity: NULL
$ inbag

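Note: the mse and test$mse vectors (highlighted in red in the original post) are what plot(rfMt) draws. The default plot is not ideal, so you can draw it yourself:
> yda<-data.frame(oobmse=rfMt$mse,testmse=rfMt$test$mse)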
> library(tidyr)
> yda$id<-1:1500
> str(yda)
'data.frame': 1500 obs. of  3 variables:
$ oobmse : num  0.394 0.349 0.322 0.311 0.316 ...
$ testmse: num  0.392 0.292 0.303 0.316 0.305 ...
$ id    : int  1 2 3 4 5 6 7 8 9 10 ...
> yda<-gather(yda,key=Type,value=mse,-id)
> str(yda)
'data.frame': 3000 obs. of  3 variables:
$ id  : int  1 2 3 4 5 6 7 8 9 10 ...
$ Type: Factor w/ 2 levels "oobmse","testmse": 1 1 1 1 1 1 1 1 1 1 ...
$ mse : num  0.394 0.349 0.322 0.311 0.316 ...
> library(ggplot2)
> ggplot(yda,aes(x=id,y=mse,colour=Type))+geom_line()
You can of course polish the plot further to suit your own needs.
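For readers on a recent tidyr, where gather() is marked as superseded by pivot_longer(), the same reshaping and plot can be sketched as below. Everything here is an assumption for illustration: the mtcars data, the 22-row training split, the seed, ntree = 300, and the names rf, err and err_long stand in for the poster's xda, yv and Idx, which are not available.

```r
library(randomForest)
library(tidyr)
library(ggplot2)

set.seed(1)                                    # arbitrary seed
x <- mtcars[, -1]                              # predictors (stand-in data)
y <- mtcars$mpg                                # numeric response, so this is a regression forest
idx <- sample(nrow(mtcars), 22)                # hypothetical training rows

rf <- randomForest(x = x[idx, ], y = y[idx],
                   xtest = x[-idx, ], ytest = y[-idx], ntree = 300)

# One row per tree, with the OOB and test-set MSE side by side.
err <- data.frame(tree = seq_len(rf$ntree), oob = rf$mse, test = rf$test$mse)

# Long format: one row per (tree, curve) pair, as ggplot2 expects.
err_long <- pivot_longer(err, cols = c(oob, test),
                         names_to = "Type", values_to = "mse")

p <- ggplot(err_long, aes(x = tree, y = mse, colour = Type)) +
  geom_line() +
  labs(x = "Number of trees", y = "MSE")
print(p)
```

ggplot2 builds the legend from the Type column automatically, so no separate legend() call is needed.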

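Note that the OOB mse only tells you whether ntree is set reasonably; the mse on the test set is the indicator of predictive performance.
From the randomForest documentation: mse: (regression only) vector of mean square errors: sum of squared residuals divided by n.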