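I have recently been working through the book Machine Learning for Hackers.
I am currently on Chapter 4.
To illustrate the problem, here are the first six rows of priority.train, taken with head():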
> head(priority.train)
     Date                From.EMail             Subject
1003 2002-01-31 22:44:14 robinderbains@shaw.ca  please help a newbie compile mplayer :-)
1004 2002-02-01 00:53:41 lance_tt@bellsouth.net re: please help a newbie compile mplayer :-)
1005 2002-02-01 02:01:44 robinderbains@shaw.ca  re: please help a newbie compile mplayer :-)
1006 2002-02-01 10:29:23 matthias@egwn.net      re: please help a newbie compile mplayer :-)
1012 2002-02-01 12:42:02 bfrench@ematic.com     prob. w/ install/uninstall
1014 2002-02-01 13:39:31 bfrench@ematic.com     re: prob. w/ install/uninstall
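The priority.train data set has three columns: Date, From.EMail, and Subject.
What I want to do is group the rows by the address in the From.EMail column and count how many rows each address has: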
from.weight <- ddply(head(priority.train), .(From.EMail),summarise, Freq = length(Subject))
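As a cross-check on what I expect the result to look like, here is a minimal base-R sketch of the same per-sender count. The data frame `emails` is a hypothetical stand-in built from the six rows shown above, not the real priority.train, and it uses table() instead of ddply.

```r
# Hypothetical stand-in for head(priority.train); only the columns needed for the count
emails <- data.frame(
  From.EMail = c("robinderbains@shaw.ca", "lance_tt@bellsouth.net",
                 "robinderbains@shaw.ca", "matthias@egwn.net",
                 "bfrench@ematic.com", "bfrench@ematic.com"),
  Subject = c("please help a newbie compile mplayer :-)",
              "re: please help a newbie compile mplayer :-)",
              "re: please help a newbie compile mplayer :-)",
              "re: please help a newbie compile mplayer :-)",
              "prob. w/ install/uninstall",
              "re: prob. w/ install/uninstall"),
  stringsAsFactors = FALSE
)

# Count the rows per sender address with table(), then turn the result
# into a data frame with the columns From.EMail and Freq
counts <- as.data.frame(table(From.EMail = emails$From.EMail),
                        responseName = "Freq",
                        stringsAsFactors = FALSE)
print(counts)
```

This is the shape I expect from.weight to have: one row per address and a Freq column holding the number of messages from that address.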
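As far as I can tell, this code is fine...
But running it fails with: Error: 'names' attribute [11] must be the same length as the vector [2]
The names attribute only has 3 entries, though. Where does the [11] come from?
Here is what names() shows: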
> names(head(priority.train))
[1] "Date" "From.EMail" "Subject"
> length(head(priority.train))
[1] 3
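One thing I could still check is the class of each column, since names() and length() only describe the data frame itself. The sketch below uses a hypothetical two-row data frame `df` whose Date column is created with strptime(), which returns POSIXlt, a class that is stored internally as a list of several components. That priority.train has such a column, and that it is where the [11] comes from, is an assumption I have not confirmed.

```r
# Hypothetical two-row example, not the real priority.train
df <- data.frame(From.EMail = c("a@example.com", "b@example.com"),
                 stringsAsFactors = FALSE)

# strptime() returns POSIXlt; assigning it with $<- keeps that class
df$Date <- strptime(c("2002-01-31 22:44:14", "2002-02-01 00:53:41"),
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Report the first class of every column
first_class <- function(col) class(col)[1]
classes_before <- sapply(df, first_class)
print(classes_before)

# A POSIXlt value is a list underneath, so its internal names
# are not the same thing as the column names of the data frame
print(names(unclass(df$Date)))

# Converting to POSIXct stores one number per row instead of a list
df$Date <- as.POSIXct(df$Date)
classes_after <- sapply(df, first_class)
print(classes_after)
```

If the same check on priority.train reports POSIXlt for Date, converting that column with as.POSIXct() before calling ddply would be the next thing to try.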
I have not been able to resolve this error.