The difference between cluster sampling and stratified random sampling
The key difference between cluster sampling and stratified random sampling is this. In cluster sampling, a whole cluster is regarded as a sampling unit and only sampled clusters are included: individuals can enter the sample only if their cluster is selected, and individuals in clusters that are not selected have no chance of being sampled. In stratified random sampling, all the strata are included, and only specific elements within each stratum are then selected as sampling units.
In cluster sampling, we first divide the population into blocks called clusters, where each cluster is a mini-representation of the entire population. Clusters are then selected by simple random sampling.
If all the members in each sampled cluster are sampled, the method is called one-stage cluster sampling. If only a subsample is randomly selected from each selected cluster, the method is called two-stage cluster sampling.
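The two variants can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `cluster_sample`, its parameters, and the toy population of 10 clusters are assumptions made for the example, not part of the curriculum text.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Draw a one-stage or two-stage cluster sample (illustrative helper).

    clusters: dict mapping a cluster label to a list of its members.
    n_clusters: number of clusters to select by simple random sampling.
    n_per_cluster: None for one-stage sampling (keep every member of each
        selected cluster); an integer for two-stage sampling (randomly
        subsample that many members from each selected cluster).
    """
    rng = random.Random(seed)
    # Stage 1: the clusters themselves are the sampling units.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for label in chosen:
        members = clusters[label]
        if n_per_cluster is None:
            # One-stage: take all members of the selected cluster.
            sample[label] = list(members)
        else:
            # Stage 2: simple random subsample within the selected cluster.
            sample[label] = rng.sample(members, min(n_per_cluster, len(members)))
    return sample

# Hypothetical population: 10 clusters of 20 members each.
population = {f"cluster_{i}": [f"c{i}_m{j}" for j in range(20)] for i in range(10)}

one_stage = cluster_sample(population, n_clusters=3, seed=1)
two_stage = cluster_sample(population, n_clusters=3, n_per_cluster=5, seed=1)

print({k: len(v) for k, v in one_stage.items()})
print({k: len(v) for k, v in two_stage.items()})
```

Members of the 7 clusters that are not selected never appear in either sample, which is exactly the property that separates cluster sampling from stratified sampling.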
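Advantages of cluster sampling: for a very large population, it saves time and money. It is the most time-efficient and cost-efficient probability sampling plan for analyzing a vast population. Disadvantages: compared with other sampling methods, cluster sampling has lower accuracy, because a sample from a cluster might be less representative of the entire population. The sample is drawn from clusters that only roughly represent the population, and even if each cluster is a miniature of the population, a miniature still loses some information relative to the original, which lowers accuracy. Accuracy and cost are hard to achieve together: favoring accuracy raises cost, and saving cost lowers accuracy.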
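Next comes the somewhat more involved method, stratified random sampling. For example, suppose we draw samples from four factories A, B, C and D, 100 items from each, for 4 × 100 = 400 items in total. Here 4 is the number of strata and 100 is the number of items drawn from each stratum. In general, we first divide the population into m groups, also called m strata, and then draw n items from each stratum, so that total sample = m × n.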
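The factory example can be written out as a short Python sketch. The helper name `stratified_sample` and the assumption that each factory produced 1,000 items are made up for illustration. The sketch draws the same number of items from every stratum, as in the example.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw n_per_stratum items by simple random sampling from EVERY stratum."""
    rng = random.Random(seed)
    return {label: rng.sample(items, n_per_stratum) for label, items in strata.items()}

# Hypothetical output of four factories, 1,000 items each (assumed size).
factories = {name: [f"{name}-{i}" for i in range(1000)] for name in "ABCD"}

sample = stratified_sample(factories, n_per_stratum=100, seed=0)

m = len(sample)                               # number of strata
n = 100                                       # items drawn per stratum
total = sum(len(items) for items in sample.values())
print(m, n, total)                            # total equals m * n
```

Unlike the cluster case, every factory contributes to the sample. Only the items within each factory are chosen at random.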
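The advantage of stratified random sampling is that it guarantees that population subdivisions of interest are included in the sample. In addition, compared with simple random sampling at the same sample size, a stratified sample typically has smaller variance or dispersion and therefore greater precision, so it represents the population better.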
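The variance claim can be checked with a small simulation. The population below is entirely hypothetical: four equally sized strata whose means differ widely (10, 20, 30, 40), which is the situation where stratification helps most. With equal stratum sizes and equal allocation, the plain mean of the pooled draws is an unbiased estimate of the population mean, so the sketch uses it directly.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 4 equally sized strata with different means.
strata = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (10, 20, 30, 40)]
population = [x for stratum in strata for x in stratum]

def srs_mean(n):
    """Sample mean from a simple random sample of size n."""
    return statistics.mean(rng.sample(population, n))

def stratified_mean(n_per_stratum):
    """Sample mean from a stratified sample with equal allocation."""
    draws = [x for stratum in strata for x in rng.sample(stratum, n_per_stratum)]
    return statistics.mean(draws)

reps = 500  # number of repeated samples used to estimate each variance
srs_var = statistics.variance([srs_mean(40) for _ in range(reps)])
strat_var = statistics.variance([stratified_mean(10) for _ in range(reps)])

print(f"variance of SRS mean:        {srs_var:.4f}")
print(f"variance of stratified mean: {strat_var:.4f}")
```

Both designs use 40 observations per sample. The stratified estimate varies much less here because the between-strata differences no longer contribute to sampling error. If the strata had nearly identical means, the gain would be small.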