Original poster: zaizaibob

[General statistics questions] In Stata, how do I drop observations that are duplicated on two variables at the same time?

11
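helen1wendy (verified student) posted on 2018-3-4 12:22:39
duplicates drop is not general enough: it does not fit the case where the data have many variables and you only want to drop the observations that take the same values on two of them.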

12
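兵兵是兵兵啊 posted on 2018-6-7 21:30:17
helen1wendy posted on 2018-3-4 12:22
duplicates drop is not general enough: it does not fit the case where the data have many variables and you only want to drop the observations that take the same values on two of them.
May I ask how this should be done, then?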
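
One possible way to approach this question, as a minimal sketch: `duplicates` accepts a variable list, so the comparison can be restricted to two variables even when the dataset holds many more. The toy data and the variable names `id`, `year`, `x`, and `ndup` are assumptions made for illustration only.

```stata
* Toy data: three variables, but duplicates are judged on id and year only
clear
input id year x
1 2010 5
1 2010 7
1 2011 5
2 2010 9
2 2010 9
end

* Report how many observations share the same (id, year) pair
duplicates report id year

* Flag them before deleting anything:
* ndup = number of other observations with the same (id, year) pair
duplicates tag id year, generate(ndup)
list, sepby(id)

* Keep the first occurrence of each (id, year) pair and drop the rest.
* force is required when a varlist is given, because the dropped rows
* may differ on the remaining variables (here x)
duplicates drop id year, force
```

Inspecting `ndup` before the drop is worthwhile, since the discarded rows can carry different values of the other variables.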

13
seaswallowxue (verified professional) posted on 2018-8-12 11:51:40
ywh19860616 posted on 2013-8-17 10:50
What the cond function does:
cond(x,a,b,c) or cond(x,a,b)
       Domain x:     -8e+307 to 8e+307 and missing;  ...
This command of yours was a huge help to me, thank you!
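
As a small illustration of the `cond()` function quoted above, here is a hedged sketch on made-up data. The variable names `x` and `result` are hypothetical and not taken from the thread.

```stata
* cond(x,a,b) returns a when x is true (nonzero) and b when x is false (zero)
display cond(3 > 2, "yes", "no")

* With a fourth argument, c is returned when x evaluates to missing
clear
input x
1
0
.
end
generate str7 result = cond(x, "true", "false", "missing")
list
```

Without the fourth argument, a missing `x` counts as true and `cond()` returns `a`, so the four-argument form is the safer choice when the condition can be missing.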

14
jimy1 posted on 2023-5-23 15:53:28
[quote]ywh19860616 posted on 2013-8-17 10:50 Source:
https://www.stata.com/support/faqs/data-management/duplicate-observations/
Case 1: Identifying duplicates based on a subset of variables
You wish to create a new variable named dup

         dup = 0       record is unique
         dup = 1       record is duplicate, first occurrence
         dup = 2       record is duplicate, second occurrence
         dup = 3       record is duplicate, third occurrence
         etc.
and to base the determination on the variables name, age, and sex.

        . sort name age sex
        . quietly by name age sex:  gen dup = cond(_N==1,0,_n)
Note the capitalization of _N and _n. (Stata interprets _N to mean the total number of observations in the by-group and _n to be the observation number within the by-group.)

Having created the new variable dup, you could then

        . tabulate dup
to see a report of the duplicate count.

To base the duplicate count solely on name, type

        . sort name
        . quietly by name:  gen dup = cond(_N==1,0,_n)
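
The FAQ steps above stop at building `dup`. A minimal sketch of carrying them through to actually removing the repeats follows. The records and the `score` variable are invented for illustration, and `bysort` is used here as shorthand for the separate `sort` and `by` steps.

```stata
* Made-up records: some name/age/sex combinations occur more than once
clear
input str10 name age str1 sex score
"Ann" 30 "F" 1
"Ann" 30 "F" 2
"Bob" 41 "M" 3
"Ann" 31 "F" 4
"Bob" 41 "M" 5
"Bob" 41 "M" 6
end

* dup = 0 for unique records, 1, 2, 3, ... for repeated ones.
* (score) orders records within each group, so the first occurrence
* is the one with the lowest score rather than an arbitrary one
bysort name age sex (score): generate dup = cond(_N == 1, 0, _n)
tabulate dup

* Keep unique records (dup == 0) and first occurrences (dup == 1)
drop if dup > 1
drop dup
```

Without a tie-breaking variable in parentheses, Stata does not guarantee which of the tied records ends up first within a group, so the surviving row could change between runs.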



15
转身的虾米 (verified professional, verified student) posted on 2023-8-31 22:55:47
villainshine posted on 2013-8-17 08:59
duplicates drop stkcd year,force
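This command throws an error. If you are going to copy an answer that 连享会 posted back in 2008, at least put some thought into it and try running it yourself.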
