OK, let's play a game. Using a plain Notepad program, create a txt file with the following contents and name it abc.txt:
"haha","","","foo"
"x1","x2","x3",""
"1","2","3",""
"4","5","6",""
"7","8","9",""
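If Notepad is not at hand, the same file could also be written from R itself. This is only a minimal sketch, assuming the current working directory is writable; the variable names `sample_lines` and `written` are made up for illustration.

```r
# Write the five sample lines to abc.txt in the current working directory.
sample_lines <- c('"haha","","","foo"',
                  '"x1","x2","x3",""',
                  '"1","2","3",""',
                  '"4","5","6",""',
                  '"7","8","9",""')
writeLines(sample_lines, "abc.txt")

# Read the file back to confirm that its contents match what was written.
written <- readLines("abc.txt")
print(written)
```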
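Now we need to drop the useless rows (for example, rows that are mostly empty strings) and the useless columns. We also need to convert the type, so that after reading the file R ends up with a numeric matrix (or data frame) rather than one made of strings. The steps are as follows: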
> a = read.csv("abc.txt",header=FALSE)
> a
    V1 V2 V3  V4
1 haha       foo
2   x1 x2 x3
3    1  2  3
4    4  5  6
5    7  8  9
> notNULL.row = apply(a, 1, function(x) sum(x != ""))
> a = a[notNULL.row == max(notNULL.row), ]
> notNULL.col = apply(a, 2, function(x) sum(x != ""))
> a = a[, notNULL.col == max(notNULL.col)]
> b = as.matrix(a[-1, ])
> colnames(b) = unlist(a[1, ])
> mode(b) = "numeric"
> b
  x1 x2 x3
3  1  2  3
4  4  5  6
5  7  8  9
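The same steps can be wrapped into a small function for reuse. The sketch below is only one possible way to do it: the name `clean_numeric_matrix` is hypothetical, `rowSums`/`colSums` stand in for the `apply` calls, `colClasses = "character"` is added so the result does not depend on the R version's factor defaults, and `drop = FALSE` is added so the code still works if only one column survives.

```r
# Hypothetical helper (not from the original post): keep only the rows and
# columns with the most non-empty cells, then convert to a numeric matrix.
clean_numeric_matrix <- function(path) {
  # Read every field as text, independent of stringsAsFactors defaults.
  a <- read.csv(path, header = FALSE, colClasses = "character")

  # Count non-empty cells per row and keep the fullest rows.
  filled_row <- rowSums(a != "")
  a <- a[filled_row == max(filled_row), , drop = FALSE]

  # Do the same per column; drop = FALSE keeps the data frame structure.
  filled_col <- colSums(a != "")
  a <- a[, filled_col == max(filled_col), drop = FALSE]

  # The first remaining row holds the column names; the rest is data.
  b <- as.matrix(a[-1, , drop = FALSE])
  colnames(b) <- unlist(a[1, ], use.names = FALSE)
  mode(b) <- "numeric"
  b
}

# Recreate the sample file in a temporary location and clean it.
path <- tempfile(fileext = ".txt")
writeLines(c('"haha","","","foo"',
             '"x1","x2","x3",""',
             '"1","2","3",""',
             '"4","5","6",""',
             '"7","8","9",""'), path)
b <- clean_numeric_matrix(path)
print(b)
```

On the sample file this should give the same 3×3 numeric matrix with columns x1, x2, x3 as the interactive session. Note that the "keep the fullest rows" rule is a heuristic that happens to fit this file; data with genuinely missing cells would need a different rule.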
R's data manipulation is very flexible. This is only a simple example, and you will still need to write scripts tailored to your own data.