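I'm an R beginner and have a few questions.
I have data on China's per-capita GDP from 1992 to 2007. The goal is to compare three models (a linear model, a log-linear model, and an exponential model) and to forecast per-capita GDP for 2008, assuming the data for 2008 and later are unknown.
Question 1: How can I use 1992 to 2005 as the in-sample period and 2005 to 2007 as the out-of-sample period, and compare actual and predicted values in order to choose a model? The main issue is how to write the R commands. This is what I have so far: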
# ---- Linear Trend Model -----
y <- read.csv(file="data.csv", header=TRUE, sep=",") # read the data; the first row holds the column names
print(y)
tsy <- ts(y, start=1992, frequency=1)
plot.ts(tsy,type="l",col="blue",ylab="gdp per capita",xlab="year") #plot the time series
is.ts(tsy) # check whether tsy is a time series object
t <- 1992:2007 # time regressor used in the trend model
lm1 <- lm(tsy~t, na.action=NULL) # linear trend model
# na.action=NULL is there to retain the time series attributes
summary(lm1)
model.matrix(lm1) # view the model matrix (intercept and trend columns)
yhat <- fitted(lm1) # fitted values of the model
e <- residuals(lm1) # in-sample residuals of the model
tsyhat <- ts(yhat, start=1992, frequency=1) # turn the fitted values into a time series
tse <- ts(e, start=1992, frequency=1) # turn the residuals into a time series
ts.plot(tsy,type="p",lwd=2,col="red",ylab="GDP per capita",xlab="Year")
lines(tsyhat)
ts.plot(tse,type="l",lwd=2,col="blue",ylab="Forecast error",xlab="Year")
abline(h=0,col="red")
##do the out of sample prediction
years<-c(2007,2008,2009)
b<-predict(lm1,data.frame(x=2008),se.fit = TRUE, scale = NULL, df = Inf,
interval = c("none", "confidence", "prediction"),
level = 0.95, type = c("response", "terms"),
terms = NULL, na.action = na.pass,
pred.var = res.var/weights, weights = 1)
But the values that predict returns this way all cover 1992 to 2007. How do I get a forecast for 2008?
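A likely cause, offered as a sketch rather than a definitive diagnosis: the model was fitted on a regressor named t, but the newdata passed to predict has a column named x. Since newdata contains no t, predict falls back to the t = 1992:2007 vector in the workspace and returns fitted values for those years. Giving newdata a column with the same name as the regressor should produce a single 2008 forecast. The series below is synthetic placeholder data (not real GDP figures), and the column names gdp and t are assumptions for illustration.

```r
# Synthetic placeholder series, NOT real GDP data; replace with the values from data.csv
set.seed(1)
dat <- data.frame(t = 1992:2007)
dat$gdp <- 400 * exp(0.1 * (dat$t - 1992)) * exp(rnorm(16, sd = 0.03))

# Linear trend model fitted with a data argument, so predict() can match names
lm1 <- lm(gdp ~ t, data = dat)

# newdata must use the same column name as the regressor in the formula (t, not x)
new <- data.frame(t = 2008)

# Point forecast plus a 95% prediction interval (columns fit, lwr, upr)
fc <- predict(lm1, newdata = new, interval = "prediction", level = 0.95)
print(fc)
```

The long argument list in the original call is copied from the help page's usage section; only newdata, interval and level are needed here.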
Question 2: How do I draw the prediction interval on the plot? I have no idea how to do this.
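One possible approach, as a minimal sketch: call predict with interval = "prediction" over a grid of years that extends to 2008, then add the lower and upper bounds to the plot with lines. The data are again synthetic placeholders (not real GDP figures), and the names dat, grid and band are made up for this example.

```r
# Synthetic placeholder series, NOT real GDP data; replace with the values from data.csv
set.seed(1)
dat <- data.frame(t = 1992:2007)
dat$gdp <- 400 * exp(0.1 * (dat$t - 1992)) * exp(rnorm(16, sd = 0.03))

lm1 <- lm(gdp ~ t, data = dat)

# Years to draw the band over: the sample period plus the 2008 forecast year
grid <- data.frame(t = 1992:2008)
band <- predict(lm1, newdata = grid, interval = "prediction", level = 0.95)

# Observed points, with axis limits wide enough to show the whole band
plot(dat$t, dat$gdp, pch = 16, col = "red",
     xlim = range(grid$t), ylim = range(band, dat$gdp),
     xlab = "Year", ylab = "GDP per capita")
lines(grid$t, band[, "fit"])                          # fitted line and forecast
lines(grid$t, band[, "lwr"], lty = 2, col = "blue")   # lower 95% bound
lines(grid$t, band[, "upr"], lty = 2, col = "blue")   # upper 95% bound
```

Using interval = "confidence" instead would draw the narrower band for the mean trend rather than for a new observation.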
This is really urgent: the report is due next week. I'd be very grateful for any expert advice!
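For Question 1, a rough sketch of a holdout comparison, under several assumptions that are not in the post: the models are fitted on 1992 to 2005 and scored on 2006 and 2007 only (so the two periods do not overlap), "log-linear" is taken to mean a regression of log(gdp) on time, "exponential" is taken to mean gdp = a * exp(b * k) fitted by nls, and the models are ranked by out-of-sample RMSE. The data are synthetic placeholders, not real GDP figures, and all variable names are made up for illustration.

```r
# Synthetic placeholder series, NOT real GDP data; replace with the values from data.csv
set.seed(1)
dat <- data.frame(year = 1992:2007)
dat$k <- dat$year - 1991   # time index 1..16 keeps the nls fit numerically stable
dat$gdp <- 400 * exp(0.1 * dat$k) * exp(rnorm(16, sd = 0.03))

train <- subset(dat, year <= 2005)   # in-sample period
test  <- subset(dat, year >= 2006)   # out-of-sample (holdout) period

m_lin <- lm(gdp ~ k, data = train)        # linear trend
m_log <- lm(log(gdp) ~ k, data = train)   # log-linear trend

# Exponential trend by nonlinear least squares, started from the log-linear fit
st <- list(a = exp(coef(m_log)[[1]]), b = coef(m_log)[[2]])
m_exp <- nls(gdp ~ a * exp(b * k), data = train, start = st)

# Holdout predictions on the original scale; exp() of the log-linear
# prediction is a simple back-transform that ignores retransformation bias
pred <- data.frame(
  linear      = predict(m_lin, newdata = test),
  loglinear   = exp(predict(m_log, newdata = test)),
  exponential = predict(m_exp, newdata = test)
)

# Root mean squared forecast error of each model on the holdout years
rmse <- sapply(pred, function(p) sqrt(mean((test$gdp - p)^2)))
print(rmse)
```

The model with the smallest holdout RMSE could then be refitted on all of 1992 to 2007 before forecasting 2008. With only two holdout years the ranking is fragile, so it is worth also looking at the actual-versus-predicted values directly.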

