Original poster: 奇犽dsp

[Data management help] Duplicates remain after removing duplicate data with duplicates

11
奇犽dsp (verified student) posted on 2018-1-4 10:45:11
黃河泉 posted on 2018-1-4 10:36
In future, I suggest using dataex (first ssc install dataex, then see its help) to take a "representative" part of your original Stata data ...
[code]
* Example generated by -dataex-. To install: ssc install dataex
clear
input str9 ticker double time
"000937"  1813051799999.997
"300107" 1813051800000.0002
"002616" 1813051859999.9998
"600578" 1813051860000.0017
"000937"  1813051919999.998
"600340"      1813051920000
"600221" 1813051980000.0002
"600787" 1813051980000.0027
"000856"  1813052039999.999
"300446" 1813052039999.9998
"300055"      1.8130521e+12
"000937"  1813052100000.004
"600266" 1813052160000.0002
"000923" 1813052219999.9958
"002542" 1813052219999.9998
"600008"      1813052280000
"000923" 1813052280000.0012
"600722" 1813052339999.9968
"002691"      1813052340000
"600266" 1813052400000.0002
"600787" 1813052400000.0022
"600739" 1813052459999.9978
"000616" 1813052459999.9998
"000158"      1813052520000
"000852" 1813052520000.0032
"000897"  1813052579999.999
"000616" 1813052580000.0002
"000401" 1813052639999.9998
"000897" 1813052640000.0037
"000605"      1.8130527e+12
"600376" 1813052759999.9963
"000605" 1813052760000.0002
"600221" 1813052819999.9998
"600533"  1813052820000.001
"600787" 1813052879999.9968
"601000"      1813052880000
"601000" 1813052940000.0002
"000937" 1813052940000.0022
"000786" 1813052999999.9978
"000615" 1813052999999.9998
"002158"      1813053060000
"600616" 1813053060000.0032
"000965" 1813053119999.9988
"600028"      1813053120000
"300446" 1813053180000.0002
"000856" 1813053180000.0042
"000605" 1813053239999.9998
"000965"  1813053299999.996
"002494"      1.8130533e+12
"300055" 1813053360000.0002
"000958"  1813053360000.001
"000965" 1813053419999.9973
"300055" 1813053419999.9998
"002457"      1813053480000
"000958"  1813053480000.002
"000958" 1813053539999.9983
"300070" 1813053540000.0002
"603969" 1813053599999.9998
"000916"  1813053600000.003
"000937" 1813053659999.9988
"603616"      1813053660000
"002665" 1813053720000.0002
"000897" 1813053720000.0042
"000897" 1813053779999.9998
"000852"  1813053839999.996
"300344"      1813053840000
"002665" 1813053900000.0002
"000852" 1813053900000.0007
"600616"  1813053959999.997
"000616" 1813053959999.9998
"603903"      1813054020000
"600376"  1813054020000.002
"000897"  1813054079999.998
"002310"      1813054080000
"603903" 1813054140000.0002
"000897"  1813054140000.003
"600550" 1813054199999.9993
"002542" 1813054199999.9998
"000415"      1813054260000
"600616"  1813054260000.004
"603569" 1813054320000.0002
"000937" 1813054379999.9958
"002717" 1813054379999.9998
"000616"      1813054440000
"000856" 1813054440000.0012
"000923"  1813054499999.997
"300117" 1813054500000.0002
"300428" 1813054559999.9998
"000958" 1813054560000.0017
"000897"  1813054619999.998
"600266"      1813054620000
"300048" 1813054680000.0002
"000856" 1813054680000.0027
"000897"  1813054739999.999
"002158" 1813054739999.9998
"002457"      1.8130548e+12
"000959"  1813054800000.004
"000959" 1813054860000.0002
"600722" 1813054919999.9958
"601633" 1813054919999.9998
end
format %tc time
[/code]

12
奇犽dsp (verified student) posted on 2018-1-4 10:51:27
黃河泉 posted on 2018-1-4 10:36
In future, I suggest using dataex (first ssc install dataex, then see its help) to take a "representative" part of your original Stata data ...

13
黃河泉 (verified professional) posted on 2018-1-4 10:53:57
Are you sure you know what
  duplicates drop time, force
actually does?

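14
奇犽dsp (verified student) posted on 2018-1-4 10:55:09
黃河泉 posted on 2018-1-4 10:53
Are you sure you know what you are doing?
Professor, I read earlier that "duplicates" is used to delete duplicate data. Is that correct?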
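
A minimal sketch of how one might check what `duplicates drop time, force` would remove before running it. It assumes a four-row excerpt of the dataex example posted earlier in the thread and a helper variable named `dup`, which does not appear in the original posts. `duplicates` compares the stored values exactly, not the formatted display, and with `force` on a variable list it keeps only the first observation in each group of identical values, whatever the ticker.

```stata
* Four rows taken from the dataex example (illustrative excerpt only)
clear
input str9 ticker double time
"600221" 1813051980000.0002
"600787" 1813051980000.0027
"000856"  1813052039999.999
"300446" 1813052039999.9998
end
format %tc time

* Count observations that share an exactly identical stored value of time
duplicates report time

* Flag instead of delete: dup holds the number of other observations
* with the same time (0 means the value is unique)
duplicates tag time, generate(dup)
list ticker time dup, noobs
```

If `dup` is 0 everywhere, `duplicates drop time, force` has nothing to drop, even when several rows look identical under the `%tc` display format.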

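15
奇犽dsp (verified student) posted on 2018-1-4 10:56:36
黃河泉 posted on 2018-1-4 10:54
I tried it
Professor, here is what happened: the code I gave you earlier indeed contained no duplicate values, so I replied again with a new version. The forum told me the reply needed moderation, so it may not have gone through. May I send you the code once more so you can try again?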

16
奇犽dsp (verified student) posted on 2018-1-4 10:56:52
黃河泉 posted on 2018-1-4 10:54
I tried it

17
奇犽dsp (verified student) posted on 2018-1-4 11:08:04

18
黃河泉 (verified professional) posted on 2018-1-4 11:40:56
奇犽dsp posted on 2018-1-4 11:08
Where is the problem?

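19
奇犽dsp (verified student) posted on 2018-1-4 17:18:02
In the end Prof. Huang supplied the solution. The duplicates could not all be removed because of a precision problem: the stored time values differ by fractions of a millisecond, so Stata does not treat them as exact duplicates.
Running the following code gives the result I wanted:
  * round each timestamp to the nearest minute (60,000 ms)
  gen double round_dt = round(time, 60000)
  * show milliseconds so the raw and rounded values can be compared
  format %tcCCYY-NN-DD_HH:MM:SS.sss time round_dt
  list, noobs separator(0)

  * keep the first observation for each rounded minute
  duplicates drop round_dt, force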
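
To see why rounding helps, here is a small sketch on four rows taken from the dataex example. The variable `offset_ms` is a hypothetical helper added only for illustration. It shows how far each stored value sits from the whole minute it is rounded to.

```stata
* Four rows taken from the dataex example (illustrative excerpt only)
clear
input str9 ticker double time
"600221" 1813051980000.0002
"600787" 1813051980000.0027
"000856"  1813052039999.999
"300446" 1813052039999.9998
end

* Snap each timestamp to the nearest whole minute (60,000 ms)
gen double round_dt = round(time, 60000)
format %tcCCYY-NN-DD_HH:MM:SS.sss time round_dt

* Distance of each raw value from its rounded minute, in milliseconds
gen double offset_ms = time - round_dt
list, noobs separator(0)

* The rounded values collapse into shared minutes, so duplicates can be found
duplicates report round_dt
```

Note that `duplicates drop round_dt, force` keeps one row per minute regardless of ticker. If the aim were instead one row per ticker per minute (an assumption, since the thread does not say), the variable list would be `ticker round_dt`.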
