Paper: Nonparametric Density Estimation for Stratified Samples
Author: Robert Breunig
Abstract:
In this paper, we consider the non-parametric, kernel estimate of the density, f(x), for data drawn from stratified samples. Much of the data used by social scientists is gathered in some type of complex survey violating the usual assumptions of independently and identically distributed data. Such effects induced by the survey structure are rarely considered in the literature on non-parametric density estimation, yet they may have serious consequences for our analysis, as shown in this paper. A weighted estimator is developed which provides asymptotically unbiased density estimation for stratified samples. A data-based method for choosing the optimal bandwidth is suggested, using information on within-stratum variances and means. The weighted estimator and proposed bandwidth are shown to give smaller mean squared error for stratified samples than an un-weighted estimator and a commonly used method of choosing the bandwidth. Surprisingly, the single bandwidth outperforms optimally choosing stratum-specific bandwidths in some cases. Several illustrations from simulation are provided. We also show that the optimal sampling scheme in this case is always stratified sampling proportional to size, irrespective of the stratum-specific densities.
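The abstract does not state the estimator's formula, so the following is only a minimal sketch of one common way to weight a kernel density estimate for stratified data, not necessarily the paper's exact estimator. It assumes the population share of each stratum is known, uses a Gaussian kernel, and takes a single hand-picked bandwidth `h` rather than the data-based bandwidth the paper proposes. The names `weighted_kde`, `unweighted_kde`, `pop_shares`, and the simulated two-stratum data are all hypothetical and chosen for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def weighted_kde(x_grid, samples_by_stratum, pop_shares, h):
    """Sum of per-stratum kernel estimates, each scaled by its population share.

    Assumption: pop_shares are known and sum to 1; h is one common bandwidth.
    """
    x_grid = np.asarray(x_grid, dtype=float)
    density = np.zeros_like(x_grid)
    for xs, share in zip(samples_by_stratum, pop_shares):
        xs = np.asarray(xs, dtype=float)
        # Rows index grid points, columns index observations in this stratum.
        u = (x_grid[:, None] - xs[None, :]) / h
        density += share * gaussian_kernel(u).mean(axis=1) / h
    return density


def unweighted_kde(x_grid, samples_by_stratum, h):
    """Pool all observations and ignore the survey design."""
    pooled = np.concatenate([np.asarray(xs, dtype=float) for xs in samples_by_stratum])
    return weighted_kde(x_grid, [pooled], [1.0], h)


# Hypothetical design: the second stratum is 20% of the population
# but makes up half of the sample (it is oversampled).
rng = np.random.default_rng(0)
pop_shares = [0.8, 0.2]
samples = [rng.normal(0.0, 1.0, 500), rng.normal(4.0, 1.0, 500)]

grid = np.linspace(-6.0, 10.0, 801)
dx = grid[1] - grid[0]
f_w = weighted_kde(grid, samples, pop_shares, h=0.4)
f_u = unweighted_kde(grid, samples, h=0.4)

# Estimated probability mass to the right of x = 2 under each estimator.
mass_w_right = float((f_w[grid > 2.0] * dx).sum())
mass_u_right = float((f_u[grid > 2.0] * dx).sum())
```

In this simulated design the pooled, unweighted estimate puts about half of its mass near the oversampled stratum, while the weighted estimate keeps that mass close to the stratum's population share. This mirrors the abstract's point that ignoring the survey structure can distort the estimated density.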

