Original poster: ReneeBK

[Q&A] Multivariate Weighting: Weight Cases by Multiple Factors

1 (original post)
ReneeBK posted on 2014-4-23 01:49:23

My understanding is that a sample can be adjusted based on a known population using the "Weight Cases" function in SPSS. What is the proper procedure for weighting by multiple factors? Let's say I have WEIGHT_SEX and WEIGHT_ETHNICITY. Can I compute a WEIGHT_COMPOSITE by taking their product? Are there any downsides to doing this that I should be aware of?

2
ReneeBK posted on 2014-4-23 01:49:59

What kind of weight do you have in your case? Is it a frequency weight (the weight value shows how many times the row of data is counted)? The SPSS WEIGHT command performs this kind of frequency weighting.

3
ReneeBK posted on 2014-4-23 01:50:51
Base SPSS isn't very deft at handling sampling weights. See this website for some details on weighting (including caveats on SPSS): http://www.ats.ucla.edu/stat/spss/faq/weights.htm

Weight Cases in SPSS treats each line as representative of a certain number of observed samples. For example, if you assign a weight of 100 to a particular line (case), that line will be treated as 100 replicated observations of the information in that line. This means that your sampling inferences will be based on a sample size that is too large, and hence the calculations that follow will report too great a precision. As an example:

You might have 100 observations that you think are representative of 10,000 people in the population. The sampling properties of your statistics are driven by the 100 observations in your sample. Using weight cases would make SPSS think that your sample actually consisted of 10,000 cases, and thus statistical inference (in terms of standard errors, confidence intervals, hypothesis tests) would be wildly over-optimistic.
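
To make the over-optimism concrete, here is a minimal Python sketch (not SPSS, and not taken from the linked page). The sample values are made up, and `mean_and_se` is a hypothetical helper that treats every observation as `freq` identical rows, which is how a frequency weight behaves.

```python
import math

def mean_and_se(values, freq):
    """Mean and standard error when each value counts as `freq` identical rows."""
    n = len(values) * freq
    mean = sum(values) * freq / n
    # Sum of squared deviations with every value replicated `freq` times.
    ss = sum((v - mean) ** 2 for v in values) * freq
    variance = ss / (n - 1)
    return mean, math.sqrt(variance / n)

# Hypothetical sample: 100 observations alternating between 0 and 1.
sample = [i % 2 for i in range(100)]

mean_true, se_true = mean_and_se(sample, freq=1)    # the real n = 100
mean_freq, se_freq = mean_and_se(sample, freq=100)  # n looks like 10,000

print(f"unweighted: mean={mean_true:.3f}, SE={se_true:.4f}")
print(f"weight=100: mean={mean_freq:.3f}, SE={se_freq:.4f}")
print(f"SE ratio: {se_true / se_freq:.1f}")
```

The point estimate is unchanged, but the standard error shrinks by roughly the square root of the weight, even though no new information was collected.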

4
ReneeBK posted on 2014-4-23 01:52:36
The proper procedure would be to first create a new categorical variable which will be the intersection of sex & ethnicity (i.e., it will have categories white-male, white-female, africanamerican-male, africanamerican-female, etc.).

Then one has to identify the weights for each of the categories (i.e., sub populations/strata) identified by the new variable.

This may or may not be the same as the products of the weight variables (most likely not).
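
A small Python sketch of why the product can differ from the cell weights. All numbers are invented for illustration, the category labels are placeholders, and the weights use the usual "population share divided by sample share" definition.

```python
# Hypothetical population shares for each sex x ethnicity cell (they sum to 1).
pop = {("M", "A"): 0.30, ("M", "B"): 0.20, ("F", "A"): 0.10, ("F", "B"): 0.40}
# Hypothetical sample counts for the same cells.
sample = {("M", "A"): 40, ("M", "B"): 10, ("F", "A"): 30, ("F", "B"): 20}
n = sum(sample.values())

def marginal(table, axis):
    """Sum a cell table over the other factor."""
    out = {}
    for cell, value in table.items():
        out[cell[axis]] = out.get(cell[axis], 0) + value
    return out

# Weights defined on the intersection categories.
cell_w = {c: pop[c] / (sample[c] / n) for c in pop}

# Weights defined separately per factor, then multiplied.
pop_sex, pop_eth = marginal(pop, 0), marginal(pop, 1)
smp_sex, smp_eth = marginal(sample, 0), marginal(sample, 1)
w_sex = {k: pop_sex[k] / (smp_sex[k] / n) for k in pop_sex}
w_eth = {k: pop_eth[k] / (smp_eth[k] / n) for k in pop_eth}
prod_w = {c: w_sex[c[0]] * w_eth[c[1]] for c in pop}

for c in pop:
    print(c, f"cell weight={cell_w[c]:.3f}", f"product={prod_w[c]:.3f}")

# Weighted share of "M" under the product weights (population share is 0.50).
male_share = sum(sample[c] * prod_w[c] for c in pop if c[0] == "M") / sum(
    sample[c] * prod_w[c] for c in pop
)
print(f"weighted share of M under product weights: {male_share:.3f}")
```

In this made-up table the product weights do not even reproduce the sex marginal, because sex and ethnicity are not independent in the sample.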

@James Stanley makes an important point. The issue is not necessarily with SPSS, but rather with how the weighting is used. One way to deal with that issue is to "re-base the weight variable to the sample size", that is, to assign weights such that the weighted total sample size is equal (or very close) to the unweighted sample size. This can be achieved by computing the weight for a category as the population proportion for the category divided by the sample proportion for the category. That is, suppose there are $n_i$ observations from the $i$-th of $k$ sub-populations, adding up to a total sample size of $n$, and suppose you know that the population proportion for that category is $p_i$. Then the weight for that category is $w_i = p_i / (n_i / n)$. In computation, the total weighted sample size under this approach will only be approximately equal to $n$ because of rounding issues.
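
As a quick check of the re-basing formula, here is a Python sketch with made-up strata. The counts and shares are assumptions; rounding to two decimals stands in for whatever precision the weights end up stored with.

```python
# Hypothetical strata: sample counts n_i and known population shares p_i.
sample_counts = {"A": 30, "B": 50, "C": 20}
pop_shares = {"A": 0.25, "B": 0.40, "C": 0.35}
n = sum(sample_counts.values())

# w_i = p_i / (n_i / n)
weights = {k: pop_shares[k] / (sample_counts[k] / n) for k in sample_counts}
weighted_total = sum(sample_counts[k] * weights[k] for k in sample_counts)

# The same weights after rounding to two decimals.
rounded = {k: round(w, 2) for k, w in weights.items()}
rounded_total = sum(sample_counts[k] * rounded[k] for k in sample_counts)

print(f"n = {n}")
print(f"weighted total with exact weights:   {weighted_total:.4f}")
print(f"weighted total with rounded weights: {rounded_total:.4f}")
```

With exact weights the weighted total reproduces the sample size; once the weights are rounded it drifts slightly, which is the rounding issue mentioned above.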

Survey sample weighting is a complicated matter. There are different types of weights, and each has to be handled in its own way. For example, SPSS may not be able to handle replicate weights.

5
ReneeBK posted on 2014-4-23 01:53:24
If you need a multidimensional correction for representativeness, you might want to use the SPSSINC RAKE extension command. It computes weights matching specified marginals in up to 10 dimensions. You can get it from the SPSS Community website at www.ibm.com/developerworks/spssdevcentral. It requires two things: the Python Essentials, which are also available via that site, and the Advanced Statistics module. The latter is needed because RAKE uses GENLOG to fit a loglinear model as part of the process.
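
For readers who want to see the idea behind raking, here is a minimal Python sketch using plain iterative proportional fitting. It is not how SPSSINC RAKE is implemented (that command goes through GENLOG, as noted above); the function name `rake`, the data, and the target shares are all assumptions made for illustration.

```python
def rake(rows, targets, n_iter=200, tol=1e-10):
    """Iterative proportional fitting on case weights.

    rows:    list of dicts, one per case, e.g. {"sex": "M", "eth": "A"}
    targets: {variable: {category: target share}}
    Returns one weight per case; the weights sum to len(rows).
    """
    n = len(rows)
    w = [1.0] * n
    for _ in range(n_iter):
        max_change = 0.0
        for var, shares in targets.items():
            # Current weighted count in each category of this variable.
            totals = {}
            for row, wi in zip(rows, w):
                totals[row[var]] = totals.get(row[var], 0.0) + wi
            # Scale each case so the category hits its target count.
            for i, row in enumerate(rows):
                factor = shares[row[var]] * n / totals[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:
            break
    return w

# Hypothetical sample of 100 cases and hypothetical population marginals.
rows = (
    [{"sex": "M", "eth": "A"}] * 40 + [{"sex": "M", "eth": "B"}] * 10
    + [{"sex": "F", "eth": "A"}] * 30 + [{"sex": "F", "eth": "B"}] * 20
)
targets = {"sex": {"M": 0.5, "F": 0.5}, "eth": {"A": 0.4, "B": 0.6}}
weights = rake(rows, targets)

def weighted_share(var, category):
    hit = sum(w for r, w in zip(rows, weights) if r[var] == category)
    return hit / sum(weights)

print(f"weighted share of M: {weighted_share('sex', 'M'):.4f}")
print(f"weighted share of A: {weighted_share('eth', 'A'):.4f}")
```

Raking only matches the marginals; it does not need (or use) the joint population distribution, which is what distinguishes it from weighting on the intersection categories.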

If you have a complex samples design, though, you should use the procedures in the Complex Samples option.

6
ReneeBK posted on 2014-4-23 01:54:37

You can use an SPSS macro for multivariate weighting. Download the macro from this page: SPSS multivariate weighting

You only have to define the weighting parameters (the number of parameters is unlimited) and the name of the weighting variable. Everything else is calculated automatically.

7
ReneeBK posted on 2014-4-23 02:00:26
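* Multivariate weighting macro; w_vars = triples of variable, category code and target share; w_var = name of the weight variable to create; N = target weighted total; w_control = yes prints a check table per criterion.
DEFINE !mac_weight (w_vars = !ENCLOSE('[',']') / w_var = !ENCLOSE('[',']') / N = !ENCLOSE('[',']')  / w_control = !ENCLOSE('[',']') )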

  weight off.
  compute !w_var = 1.
  variable label !w_var 'Weight'.
  aggregate
    /outfile=* mode=addvariables overwritevars=yes
    /@total 'Universum' = sum(!w_var).
  compute @real = 1.
  compute @q_real = 100 * @real / @total.
  execute.

  !IF (!N~=!NULL) !THEN
     compute @total = !N.
  !IFEND

  !LET !w_num_param = !NULL
  !DO !w_pom !IN (!w_vars)
    !LET !w_num_param = !CONCAT(!w_num_param,"1")
  !DOEND

  !LET !hlp_vars = !w_vars
  !DO !w_hlp = 1 !TO !UNQUOTE(!LENGTH(!w_num_param)) !BY 3
    !LET !hlp_var    = !HEAD(!hlp_vars)
    !LET !hlp_vars   = !TAIL(!hlp_vars)
    !LET !hlp_code   = !HEAD(!hlp_vars)
    !LET !hlp_vars   = !TAIL(!hlp_vars)
    !LET !hlp_weight = !HEAD(!hlp_vars)
    !LET !hlp_vars   = !TAIL(!hlp_vars)
    recode !hlp_var (!hlp_code = !hlp_weight) into !CONCAT("@w",!hlp_var).
  !DOEND
  execute.

  !LET !hlp_vars = !w_vars
  !LET !w_hlp_crit  = !NULL
  !DO !w_hlp = 1 !TO !UNQUOTE(!LENGTH(!w_num_param)) !BY 3
    !LET !hlp_var  = !HEAD(!hlp_vars)
    !LET !hlp_vars = !TAIL(!TAIL(!TAIL(!hlp_vars)))
    !IF (!w_hlp_crit ~= !hlp_var) !THEN
      !LET !w_hlp_crit = !hlp_var
      compute !CONCAT("@ww",!w_hlp_crit) = !CONCAT("@w",!w_hlp_crit).
      compute !CONCAT("@www",!w_hlp_crit) = !CONCAT("@w",!w_hlp_crit).
      sort cases by !w_hlp_crit.
      if (!w_hlp_crit= lag(!w_hlp_crit)) !CONCAT("@ww",!w_hlp_crit) = lag(!CONCAT("@ww",!w_hlp_crit)).
      if (!w_hlp_crit= lag(!w_hlp_crit)) !CONCAT("@www",!w_hlp_crit) = lag(!CONCAT("@www",!w_hlp_crit)).
      if (!w_hlp_crit<>lag(!w_hlp_crit)) !CONCAT("@ww",!w_hlp_crit) = sum(lag(!CONCAT("@ww",!w_hlp_crit)),!CONCAT("@w",!w_hlp_crit)).
      if (!w_hlp_crit<>lag(!w_hlp_crit)) !CONCAT("@www",!w_hlp_crit) = sum(lag(!CONCAT("@www",!w_hlp_crit)),!CONCAT("@w",!w_hlp_crit)).
      aggregate
        /outfile=* mode=addvariables overwritevars=yes
        /!CONCAT("@ww",!w_hlp_crit) = max(!CONCAT("@ww",!w_hlp_crit))
        /!CONCAT("@www",!w_hlp_crit) = max(!CONCAT("@www",!w_hlp_crit)).
      compute !CONCAT("@ww",!w_hlp_crit) = @total * !CONCAT("@w",!w_hlp_crit) / !CONCAT("@ww",!w_hlp_crit).
      compute !CONCAT("@www",!w_hlp_crit) = !CONCAT("@w",!w_hlp_crit) / !CONCAT("@www",!w_hlp_crit).
      aggregate
        /outfile=* mode=addvariables overwritevars=yes
        /break = !w_hlp_crit
        /@number_of_resp = sum(!w_var).
      compute !CONCAT("@q_ww",!w_hlp_crit) = !CONCAT("@ww",!w_hlp_crit) / @number_of_resp.
      compute !CONCAT("@q_www",!w_hlp_crit) = 100 * !CONCAT("@www",!w_hlp_crit) / @number_of_resp.
    !IFEND
  !DOEND
  execute.

  !LET !w_hlp_crit  = !NULL
  !DO !w_hlp1 = 1 !TO !UNQUOTE(!LENGTH(!w_num_param)) !BY 3
    !LET !hlp_vars = !w_vars
    !DO !w_hlp2 = 1 !TO !UNQUOTE(!LENGTH(!w_num_param)) !BY 3
      !LET !hlp_var  = !HEAD(!hlp_vars)
      !LET !hlp_vars = !TAIL(!TAIL(!TAIL(!hlp_vars)))
      !IF (!w_hlp_crit ~= !hlp_var) !THEN
        !LET !w_hlp_crit = !hlp_var
        aggregate /outfile=* mode=addvariables overwritevars=yes
          /break = !w_hlp_crit
          /@number_of_resp = sum(!w_var).
        compute !w_var = !w_var * (!CONCAT("@ww",!w_hlp_crit) / @number_of_resp).
        execute.
      !IFEND
    !DOEND
  !DOEND

  !IF (!UPCASE(!w_control)="YES") !THEN
    compute @total = !N.
    !LET !hlp_vars = !w_vars
    !LET !w_hlp_crit  = !NULL
    !DO !w_hlp = 1 !TO !UNQUOTE(!LENGTH(!w_num_param)) !BY 3
      !LET !hlp_var  = !HEAD(!hlp_vars)
      !LET !hlp_vars = !TAIL(!TAIL(!TAIL(!hlp_vars)))
      !IF (!w_hlp_crit ~= !hlp_var) !THEN
        !LET !w_hlp_crit = !hlp_var
        compute !CONCAT("@q_",!w_var) = 100 * !w_var / @total.
        ctables
          /vlabels variables=@q_real !CONCAT("@q_www",!w_hlp_crit) !CONCAT("@q_",!w_var) @real !CONCAT("@q_ww",!w_hlp_crit) !w_var display=none
          /vlabels variables=!w_hlp_crit display=label
          /table
               @q_real [s][sum '% (Real)' F40.2] +
               !CONCAT("@q_www",!w_hlp_crit) [s][sum '% (Quote)' F40.2] +
               !CONCAT("@q_",!w_var) [s][sum '% (Weighted)' F40.2] +
               @real [s][sum 'Number (Real)' F40.2] +
               !CONCAT("@q_ww",!w_hlp_crit) [s][sum 'Number (Quote)' F40.2] +
               !w_var [s][sum 'Number (Weighted)' F40.2] +
               !w_var [s][minimum 'Min Weight' F40.2] +
               !w_var [s][maximum 'Max Weight' F40.2]
               by !w_hlp_crit [c]
          /titles title = !QUOTE(!CONCAT("Checking of Weights - ",!w_hlp_crit))
          /slabels position=row
          /categories variables=!w_hlp_crit order=a key=value empty=include total=yes position=before.
      !IFEND
    !DOEND
  !ELSE
    descriptives !w_var /statistics = min max.
  !IFEND

  !LET !w_hlp_crit  = !NULL
  !LET !hlp_vars = !w_vars
  !DO !w_hlp = 1 !TO !UNQUOTE(!LENGTH(!w_num_param)) !BY 3
    !LET !hlp_var  = !HEAD(!hlp_vars)
    !LET !hlp_vars = !TAIL(!TAIL(!TAIL(!hlp_vars)))
    !IF (!w_hlp_crit ~= !hlp_var) !THEN
      !LET !w_hlp_crit = !hlp_var
      delete variables !CONCAT("@w",!w_hlp_crit) !CONCAT("@ww",!w_hlp_crit) !CONCAT("@www",!w_hlp_crit) !CONCAT("@q_ww",!w_hlp_crit) !CONCAT("@q_www",!w_hlp_crit).
    !IFEND
  !DOEND
  delete variables @total @real @q_real @number_of_resp !CONCAT("@q_",!w_var).

  weight by !w_var.

!ENDDEFINE.

/* example of macro call.
!mac_weight
N [1000]
w_control [yes]
w_var [weight]
w_vars [
  q1 1 0.5
  q1 2 0.5
  q2 1 0.25
  q2 2 0.4
  q2 3 0.35
]
.
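
To show what the example call asks for, here is a small Python sketch that turns the same triples into target counts. This reflects my reading of the macro (each share is divided by the sum of the shares for its variable and multiplied by N, which is what the "Number (Quote)" row of the check table reports); `target_counts` is a hypothetical helper, not part of the macro.

```python
# The w_vars block from the example call: variable, category code, target share.
call_spec = """
q1 1 0.5
q1 2 0.5
q2 1 0.25
q2 2 0.4
q2 3 0.35
"""
N = 1000

def target_counts(spec, total):
    """Group the triples by variable and convert shares to target counts."""
    tokens = spec.split()
    shares = {}
    for i in range(0, len(tokens), 3):
        var, code, share = tokens[i], tokens[i + 1], float(tokens[i + 2])
        shares.setdefault(var, {})[code] = share
    return {
        var: {code: total * s / sum(cats.values()) for code, s in cats.items()}
        for var, cats in shares.items()
    }

counts = target_counts(call_spec, N)
for var, cats in counts.items():
    for code, count in cats.items():
        print(f"{var} = {code}: target count {count:.1f}")
```

Because the shares are normalized per variable, they do not strictly have to sum to 1, although writing them that way keeps the call easier to read.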

8
ReneeBK posted on 2014-4-23 02:01:07

SPSS – Multivariate weighting


Task:

  • We want SPSS data to be weighted according to several independent criteria.
  • e.g. gender, age, region, size of residence, economic activity etc.

Polemic:

  • There is no built-in tool for this kind of weighting directly in SPSS.
  • We aren’t able to use or build our own external weighting application. This kind of application usually needs a special data file with the weighting parameters, and after weighting it creates another data file with the weights, which has to be imported back into SPSS. That is not very convenient for the user.
  • This fact annoyed us, and we were looking for an easier solution.

Solution using SPSS macro:


9
ReneeBK posted on 2014-4-23 02:34:51

10
Nicolle (verified student) posted on 2014-4-23 02:52:00
Notice: the author has been banned or deleted; the content is hidden automatically.
