Hello Professor,
I ran into a few problems while working on my project in R and would appreciate your help.
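1. The dataset has 20 variables, and I need to analyze four of them (chol, copper, trig, and platelet). The question asks: "What transformations could you use to make these more bell-shaped?" (that is, closer to a normal distribution). The data are attached.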
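One possible starting point, sketched under assumptions: compare a few common transformations (square root, log) by their sample skewness and by histograms. The data frame `dat` below is simulated stand-in data, not the attached dataset, and the helper `skewness()` is a hypothetical function written for this sketch. The assignment text itself mentions log(chol), so a log transform is at least a plausible candidate for that variable.

```r
# Simulated stand-in data (an assumption); replace `dat` with the attached dataset.
set.seed(1)
dat <- data.frame(
  chol     = rlnorm(200, meanlog = 5.8, sdlog = 0.4),
  copper   = rlnorm(200, meanlog = 4.3, sdlog = 0.7),
  trig     = rlnorm(200, meanlog = 4.7, sdlog = 0.5),
  platelet = rgamma(200, shape = 7, rate = 0.027)
)
vars <- c("chol", "copper", "trig", "platelet")

# Hypothetical helper: sample skewness, ignoring missing values.
# Values near 0 suggest a more symmetric (bell-like) shape.
skewness <- function(x) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^3) / sd(x)^3
}

# Candidate transformations to compare (all require positive values).
candidates <- list(raw = identity, sqrt = sqrt, log = log)

# Rows are variables, columns are transformations.
skew_table <- sapply(candidates, function(f) {
  sapply(dat[vars], function(x) skewness(f(x)))
})
print(round(skew_table, 2))

# Visual check: raw histograms on top, log-transformed below.
op <- par(mfrow = c(2, 4))
for (v in vars) hist(dat[[v]], main = v, xlab = v)
for (v in vars) hist(log(dat[[v]]), main = paste0("log(", v, ")"), xlab = "")
par(op)
```

Whichever column of `skew_table` is closest to 0 for a given variable is a reasonable first choice, but the histograms (or a `qqnorm()` plot) should have the final say, since skewness alone does not capture everything about shape.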
2. Filling in missing values. Until now I have always filled them with the mean or the median, but this assignment uses a different method: "We will investigate missing values through a practice called “missing in the margins”. Replace the missings with a value that is outside the range of the variable, but close enough so that when plotted, it will not look too far off (e.g. the variable log(chol) falls roughly between 4.7 and 7.5 - so you can replace the missings with a value of 3). This plot will have a lot of overplotting in the missings. Now jitter the missing values for each of the four variables by adding noise to them (in R: you can use the jitter() function, or add random normal noise using rnorm()). Make sure the variance that you add keeps the missings separate from the rest of the data. See plot below for an example of how this might look."
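A minimal sketch of one possible reading of these instructions, under stated assumptions: the data are simulated stand-ins, `fill_margin()` is a hypothetical helper written for this sketch, and the `gap` and `spread` fractions are arbitrary choices. It places the missing values just below the observed range and spreads them out with `jitter()`. Because `jitter()` with an explicit `amount` adds bounded uniform noise, keeping `spread` smaller than `gap` guarantees that the filled values stay separate from the observed data.

```r
# Simulated stand-in data with missing values (an assumption);
# replace `dat` with the real dataset.
set.seed(42)
n <- 200
dat <- data.frame(
  chol = rlnorm(n, meanlog = 5.8, sdlog = 0.4),
  trig = rlnorm(n, meanlog = 4.7, sdlog = 0.5)
)
dat$chol[sample(n, 30)] <- NA
dat$trig[sample(n, 25)] <- NA

# Hypothetical helper: put NAs in the lower margin and jitter them.
# `gap` and `spread` are fractions of the observed range (assumed values);
# spread < gap keeps the filled points below every observed point.
fill_margin <- function(x, gap = 0.15, spread = 0.02) {
  miss <- is.na(x)
  if (!any(miss)) return(x)
  rng <- range(x, na.rm = TRUE)
  width <- diff(rng)
  fill <- rng[1] - gap * width
  # jitter() with `amount` adds uniform noise in [-amount, amount];
  # rnorm(sum(miss), sd = ...) could be used instead for normal noise.
  x[miss] <- jitter(rep(fill, sum(miss)), amount = spread * width)
  x
}

log_chol <- log(dat$chol)
log_trig <- log(dat$trig)
x <- fill_margin(log_chol)
y <- fill_margin(log_trig)

# Highlight points that were missing on at least one of the two variables.
was_missing <- is.na(log_chol) | is.na(log_trig)
plot(x, y,
     col = ifelse(was_missing, "red", "black"), pch = 16,
     xlab = "log(chol)", ylab = "log(trig)",
     main = "Missing in the margins")
```

The same helper could be applied to copper and platelet, and all four filled variables could then be shown together with `pairs()`. A fixed fill value, such as the 3 suggested for log(chol), would work equally well in place of the range-based `fill`.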
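I find this rather confusing and am not sure how to implement it in R.
Could you please find some time to help me with this as soon as possible? Thank you!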