Theoretically, we can use the mean and variance of the marginal distribution implied by the state transition equation as the initial values. However, in many applications this is infeasible because the marginal distribution is too complicated to derive. Another useful method is to treat the initial values as unknown parameters and estimate them jointly with the parameters of interest.
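
For the simplest case where the first approach is feasible, here is a minimal sketch. It assumes something the text above does not state: a linear, stationary transition equation x_t = c + F x_{t-1} + w_t with w_t ~ N(0, Q). Under that assumption the marginal mean solves (I - F) mu = c and the marginal covariance solves P = F P F' + Q. The function name `stationary_init` and the symbols `F`, `c`, `Q` are illustrative only and are not taken from the original.

```python
import numpy as np


def stationary_init(F, c, Q):
    """Marginal (stationary) mean and covariance of x_t = c + F x_{t-1} + w_t.

    Assumes w_t ~ N(0, Q) and that all eigenvalues of F lie strictly
    inside the unit circle; otherwise no stationary distribution exists.
    """
    F = np.atleast_2d(np.asarray(F, dtype=float))
    Q = np.atleast_2d(np.asarray(Q, dtype=float))
    c = np.atleast_1d(np.asarray(c, dtype=float))
    n = F.shape[0]

    # Stationarity check: the spectral radius of F must be below 1.
    if np.max(np.abs(np.linalg.eigvals(F))) >= 1.0:
        raise ValueError("transition matrix is not stationary")

    # Mean: mu = c + F mu  =>  (I - F) mu = c
    mean = np.linalg.solve(np.eye(n) - F, c)

    # Covariance: P = F P F' + Q  =>  (I - F kron F) vec(P) = vec(Q)
    vec_P = np.linalg.solve(np.eye(n * n) - np.kron(F, F), Q.reshape(-1))
    P = vec_P.reshape(n, n)

    # Symmetrize to remove small numerical asymmetries.
    return mean, (P + P.T) / 2.0


# Scalar AR(1) example: x_t = 1 + 0.5 x_{t-1} + w_t, Var(w_t) = 0.75
mean0, P0 = stationary_init([[0.5]], [1.0], [[0.75]])
```

In the scalar example the mean is c / (1 - F) = 2 and the variance is Q / (1 - F^2) = 1. When the transition is nonlinear or non-stationary, these closed forms are not available, which is where treating the initial values as extra parameters to estimate becomes the practical choice.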