[Other] The original SPSS data has several missing-value codes for each variable; how should they be handled after converting to Stata?

Original poster
affiliation, posted 2011-8-5 10:57:35

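In Stata, a missing value is usually written as .
In SPSS, missing values are usually numeric codes: 9, 88, 999, ...

The original file is in SPSS format, and each variable has several missing-value codes, for example 9, 0, 88, 999.

I want to run the analysis in Stata. What do I need to do after converting the file to Stata format?

I can think of two approaches:
1. replace var1=. if var1=0 9 88 999
   and then run the analysis.

2. Leave the data untouched and, when running the analysis, append this to the command: if var1 !=0 9 88 999  & var2 !=0 88

Is the code in the second approach correct? Can several missing-value codes simply be separated by spaces? Should conditions on different variables be joined with &?

Are these two approaches sound?

Are there other ways to do it?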

Reply 1
yqm_first, posted 2011-8-5 12:09:04
replace var1=. if var1=0 9 88 999

I think it should be: replace var1=. if (var1==0)|(var1==9) |(var1==88)|(var1==999)
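
The chain of == tests joined with | can also be written more compactly with Stata's inlist() function. This is a minimal sketch rather than the only way to do it; the data values are made up, and var1 with the codes 0, 9, 88, 999 is taken from the question.

```stata
* Hypothetical data: var1 uses 0, 9, 88 and 999 as missing-value codes
clear
input var1
0
3
9
5
88
999
end

* inlist() is true when var1 equals any of the listed values,
* so this is equivalent to (var1==0)|(var1==9)|(var1==88)|(var1==999)
replace var1 = . if inlist(var1, 0, 9, 88, 999)

* Check how many observations are now missing
count if missing(var1)
```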

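Reply 2
h3327156, posted 2011-8-5 13:16:02
Whichever of your two methods you use, pay attention to | [that is, the "or" operator; using & seems wrong here, though you may have your own reasons for it].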
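
One possible reading of this point, assuming the goal of the second method is to leave the coded observations out of an analysis: | joins the equality tests that identify a missing-value code within one variable, the whole group is then negated, and the negated groups for different variables are joined with &. The sketch below uses made-up data; var1, var2 and their codes come from the question, and summarize stands in for whatever analysis command is actually used.

```stata
* Hypothetical data: var1 uses codes 0 9 88 999, var2 uses codes 0 88
clear
input var1 var2
3   1
9   2
5   88
999 0
4   3
end

* "or" on the equality tests, negated, then "and" across variables
summarize var1 if !(var1==0 | var1==9 | var1==88 | var1==999) & !(var2==0 | var2==88)

* The same condition, written with inlist()
summarize var1 if !inlist(var1, 0, 9, 88, 999) & !inlist(var2, 0, 88)
```

A space-separated list such as var1 !=0 9 88 999 is not valid Stata syntax; each value needs its own test or a function such as inlist().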

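Personally, I prefer to convert the numeric codes from SPSS into Stata's own missing-value representation.
[This is where the mvdecode command comes to mind.]

If converting one variable at a time is too slow,
use a loop command such as foreach to handle them all in one pass [to process the variables in order, you can first order them].
When doing it in one pass, though, check what the numeric missing-value codes mean in each variable. For example, 99 may denote a missing value in one variable but not in another, where it may really be a height of 99 or a weight of 99.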
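
A rough sketch of the mvdecode and foreach idea, under the assumption that each variable has its own list of codes. The data, the variable height, and the local macro names codes_var1 and codes_var2 are hypothetical and only serve the illustration.

```stata
* Hypothetical data: 99 in height is a real measurement, not a code
clear
input var1 var2 height
0   1  170
3   88 99
9   0  165
999 2  180
end

* Store the codes separately for each variable
local codes_var1 "0 9 88 999"
local codes_var2 "0 88"

* Convert only the listed codes of each variable to sysmiss (.)
foreach v of varlist var1 var2 {
    mvdecode `v', mv(`codes_`v'')
}

* height is not in the loop, so its value of 99 is left untouched
list
```

mvdecode also accepts a whole varlist in one call, so the loop is only needed when the codes differ between variables. To keep the reason for missingness, the codes can be mapped to extended missing values instead, for example mv(9=.a \ 88=.b \ 999=.c).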

Reply 3
蓝色, posted 2011-8-5 14:08:13
help missing
-----------------------------------------------------------------------------------------------------

Title
    [U] 12.2.1 Missing values

Description
    Stata has 27 numeric missing values:
        ., the default, which is called the "system missing value" or sysmiss
    and
        .a, .b, .c, ..., .z, which are called the "extended missing values".
    Numeric missing values are represented by large positive values.  The ordering is
                           all nonmissing numbers < . < .a < .b < ... < .z
    Thus, the expression age > 60 is true if variable age is greater than 60 or missing.
    To exclude missing values, ask whether the value is less than ".".  For instance,
        . list if age > 60 & age < .
    To specify missing values, ask whether the value is greater than or equal to ".".  For
    instance,

        . list if age >=.
    Stata has one string missing value, which is denoted by "" (blank).

Remarks
    More details concerning missing values and their treatment in Stata are provided under the
    following headings:

        Overview
        Expressions
        Operators
        Functions
        Matrices
        Useful commands
        Value labels
        Estimation commands
        Technical note:  Version 7 and earlier


    Overview
    1.  Stata supports different types of numeric missing values that can be used to specify
        different reasons that a value is unknown.  The most frequently used missing value .,
        referred to as sysmiss, is nearly always generated by Stata when it cannot assign a
        specific value.  The 26 extended missing values .a, .b, ..., .z are available to users
        requiring more elaborate tracking of missing values.

        Empty strings are treated as missing values of type string.
    2.  Numeric missing values are represented by large positive values.  This means that an
        expression such as income > 100 evaluates to true for missing values of the variable
        income, as well as to those that are greater than 100.  Also, the simple expression if
        varname evaluates to true for all nonzero values of varname, including missing values.

    3.  The ordering of missing values is
                           all nonmissing numbers < . < .a < .b < ... < .z
    4.  Most Stata statistical commands deal with missing values by disregarding observations with
        one or more missing values (called "listwise deletion" or "complete cases only").


    Expressions
    Expressions occur in many places in Stata (see [P] syntax and exp).  For example,
        . generate newvarname = exp
    evaluates the expression exp for each observation of the variable newvarname.  Observations of
    newvarname are set to missing if exp evaluates to missing.

    Expressions are also used to restrict a command's operation to a subset of the observations.
    For instance,

        . summarize varname if exp
    summarizes varname by using all observations for which exp evaluates to true (not zero),
    including observations that are missing.


    Operators
    The relational operators (see operators) interpret missing values as large positive numbers
    (see above). All the following thus evaluate to true

                73 < .        . == .        .a == .a
                .a != .       .a < .b       .a <= .b

    whereas all the following evaluate to false
                73 >= .       . == .a       . > .a
    The numerical operators (+ etc) return missing if any of their arguments are missing.

    Functions
    Stata has a few special functions for dealing with missing values:
        missing()        returns 1 (meaning true) if any of its arguments, numeric or string,
                         evaluates to missing and 0 (meaning false) otherwise.

        mi()             is a shorthand for missing().
        matmissing(K)    returns 1 (meaning true) if any elements of the matrix K are missing and 0
                         (meaning false) otherwise.

    Some Stata functions interpret . in a special way.  For instance, the function inrange(x,a,b)
    returns 1 if x belongs in the interval [a,b].  This function interprets a==. as -infinity and
    b==. as +infinity.  These special interpretations are discussed in functions.

    Other Stata functions return missing (.) if one or more of the arguments are missing or
    invalid.


    Matrices
    Matrices may contain all types of missing values.  The matrix operators (see matrix operators)
                -     negate
                '     transpose

                \     row join
                ,     column join
                +     add
                -     subtract
                *     multiply (including multiply by scalar)
                /     division by scalar
                #     Kronecker product

    generate missing values elementwise.
    In the matrix product C=A*B, C[i,j] is missing if row i of A or column j of B contain a missing
    value.

    Matrix division by scalar C=A/b is not allowed if the scalar b is a missing value.  Otherwise,
    missing values in matrix A generate missing values in C elementwise.

    Like the list command, the matrix list command has a nodotz option to display extended missing
    value .z as a blank string rather than as ".z".


    Useful commands
    -----------------------------------------------------------------------------------------------
    mvencode            transforms missing values into numeric values
    mvdecode            transforms numeric values into missing values
    codebook            provides extensive information about variables, including the occurrence of
                          simple and extended missing values
    egen, rownonmiss()  number of valid observations in a varlist
    egen, rowmiss()     number of missing values in a varlist
    recode              recodes a variable, optionally into a new variable, with special facilities
                          to recode missing values.
    mi                  multiple imputation of missing values
    xtdescribe          describes participation patterns in panel data
    -----------------------------------------------------------------------------------------------


    Value labels
    It is possible to define value labels for the extended missing values .a to .z, but not for
    sysmiss ..  These value labels show up in the same way as value labels for nonmissing values.
    See [D] label.


    Estimation commands
    Most Stata commands ignore observations that are missing in one or more of the variables
    referred to in the command.  For instance, the regression command regress disregards all
    observations that have a missing value for the dependent variable or missing values for any of
    the independent variables.  This method is known as "listwise deletion", "complete cases only",
    etc.  It is statistically appropriate only if the missing values are "at random".  In an if or
    weight expression to a command, the expressions will be evaluated, and the missing values will
    be processed using the operators and function() logic.

    Stata commands that can treat multiple observations as being related to one observational unit
    (e.g., observations from a panel in xt models, episodes in st models) ignore specific
    observations from the "group", namely, those that have missing values.


    Technical note:  Version 7 and earlier
    Before Stata 8, Stata had only one missing value, the period (.).  Thus, you could test whether
    an expression or variable exp was missing with the expression exp==..  Starting with Stata 8,
    this method is no longer correct.  exp==. now means that the expression exp equals a specific
    missing value, namely, sysmiss ..  exp==.  returns false if exp equals one of the extended
    missing-value types such as .a or .z.  To test whether exp is missing, i.e., equals either . or
    one of the extended missing values, one should use the expression

        exp >= .
    or
        missing(exp)

    which can be abbreviated to
        mi(exp)
    To test that exp is not missing, use one of the forms
        exp < .
        !missing(exp)
        !mi(exp)

    An advantage of the last two forms is that the missing functions missing() and mi() allow
    multiple (numeric or string) arguments to test whether any of the arguments is missing.

    Old programs and do-files will continue to work using the old method, as long as the version is
    set to 7 or less.  See [P] version.


Also see
    Manual:  [U] 12.2.1 Missing values,
             [D] missing values

      Help:  [D] codebook, [D] egen, [D] functions, [MI] intro, [D] mvencode, [D] recode, [XT]
             xtdescribe,
             [U] 13 Functions and expressions (expressions),
             [U] 13 Functions and expressions (operators)
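
The ordering rule quoted in the help text matters as soon as missing values appear in an if condition. The following is a small sketch with made-up ages; the variable name age follows the help text, and the values are hypothetical.

```stata
* Hypothetical data with one sysmiss (.) and one extended missing (.a)
clear
input age
25
70
.
.a
end

* Missing values sort above every number, so this also counts . and .a
count if age > 60

* Excluding missing values explicitly counts only the observation with 70
count if age > 60 & !missing(age)

* age == . matches only sysmiss, not .a
count if age == .

* missing() matches both . and .a
count if missing(age)
```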



Reply 4
affiliation, posted 2011-8-5 23:22:06
Thanks, everyone, for the replies!
