Original poster: Trevor

[Download][Recommended][Discussion] Stata Questions and Answers

11
Trevor posted on 2006-4-14 23:31:00
Thanks for your reply, Maarten. The values of x are exact, and this happens
whenever there is a "50" just beyond the decimal place you are cutting to:
for example 0.*50 when rounding to one decimal place, 0.**50 when rounding
to two decimal places, etc.

Thanks for your help in advance.

Amani

12
Trevor posted on 2006-4-14 23:32:00
Another reason for not trusting your example
is that you claim to show us the results of

gen x = round(y, 0.1)

but in your output it is y that is rounded
and x that is coarse. So, evidently you changed
something before showing us your results.

Nick
n.j.cox@durham.ac.uk


13
Trevor posted on 2006-4-14 23:33:00
Sorry, it is my mistake;

the command line should read
gen y=round(x, 0.1)

y is what I got after that.

Thanks again.

Amani

14
Trevor posted on 2006-4-14 23:34:00

Amani:


My example shows that it doesn't happen whenever you have 50 after the
decimal point. My best guess is that there are more decimals in your data
than are displayed, leading you to believe that 8.750 is "exact". Try
changing the format and look at the variable again, as in the example below:
format x %23.18f
list x in 1/10

HTH,
Maarten

15
Trevor posted on 2006-4-14 23:36:00
I can't speak about STATA. In Stata, which behaves
similarly in this instance, you can get rounding up
by using -ceil()-.

But note that you will find it difficult to avoid
small (apparent) anomalies whatever you do.

This is because, in general, multiples of 0.1 cannot
be held exactly in binary, as explained many, many
times on this list.

Suppose I have 8.75 (meaning, precisely, 8 + 3/4).

Stata can hold that, _exactly_. In decimal,

. di %23.18f 8.750
8.750000000000000000

or, more to the point, in hexadecimal,

. di %21x 8.750
+1.1800000000000X+003

If I round that, Stata starts to struggle:

. di %21x round(8.750, 0.1)
+1.199999999999aX+003

. di %23.18f round(8.75,0.1)
8.800000000000000700

This raises the question of what, exactly, the number
is that you are showing us, the one which is represented
as 8.750. Your number is, I surmise, not exactly 8.750,
for if it were, it would round to 8.8.

That said, there is no way you can avoid this kind of
problem by writing your own program, unless
you also construct a decimal-based computer.
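
For concreteness, a minimal sketch of the -ceil()- idea for one
decimal place: scale by 10, take the ceiling, and rescale. This
always rounds up to the next tenth; the floor()-based variant shown
second rounds halves up, which may be closer to what you actually
want. The binary caveats above apply to the scaled values as well.

. di ceil(8.75*10)/10
8.8

. di floor(8.75*10 + 0.5)/10
8.8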

Nick
n.j.cox@durham.ac.uk

Siyam, Amani

> I have a minor issue with the way STATA is rounding, for example, a
> variable to one decimal place.
>
> I used the command
>
> gen x=round(y, 0.1)
>
> Comparing the two variables, I noticed that when:
>
>      x=        y=
>
>   8.750       8.7
>   5.752       5.8
>  23.256      23.3
>
> Is there a way I can modify the function to round "x"=8.750 to be 8.8?
>
> Or should I put together my own code to get the rounding I want?


16
Trevor posted on 2006-4-14 23:37:00
Amani--

All of what Nick says is true, but the claim that "there is no way you
can avoid this kind of problem by writing your own program, unless you
also construct a decimal-based computer" is overreaching a bit. It's
true that the general problem of numerical precision is unavoidable,
but the specific problem you want to address is not. If you want to
force the incorrect behavior you seem to want (rounding 8.64999999999
to 8.7), there is a way to do it. If you are representing your numbers
as having 2 digits after the decimal point, and you believe that such
a representation is true and the more accurate binary representation
is not, then you can force the rounding you want by treating your
representation as a string--but you could be introducing error in any
setting where your homemade rounding method does not match the
binary-based rounding done by your computer.

. clear
. set obs 100
. gen obs=_n/100
. gen real=round(obs, .1)
. gen silly=round(real(string(obs*100, "%4.0f")),10)/100
. li in 64/66

     +---------------------+
     | obs   real   silly  |
     |---------------------|
 64. | .64     .6      .6  |
 65. | .65     .6      .7  |
 66. | .66     .7      .7  |
     +---------------------+

Your original post raises the question of how you got the number
8.75 such that you know the decimal representation is exact but the
computer doesn't (did you do a whole series of calculations that
should result in exactly 35/4 but instead result in a close, but not
close enough, approximation?). If so, you can turn your calculations
on a number like X into calculations on integers by using an
appropriate factor, such as 4X or 40X or 100X, and get the "right"
answer (but the appropriate factor to apply before and after the
intermediate calculations to preserve accuracy will depend on what
those intermediate calculations are). Bear in mind always that

. di round(35/4,.1)
8.8

gives you the answer you seem to want, and you should be able to get
there from here.
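
As a rough sketch of that scaling idea, with a hypothetical variable x
that is supposed to be an exact multiple of 1/4:

. gen long x4 = round(4*x)
. gen double y = round(x4/4, 0.1)

Here 4*x is carried as an exact integer and the rounding to one decimal
place happens only at the very end; the factor (4, 40, 100, ...) is
whatever makes your intermediate quantities integers.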

--Austin

17
Trevor posted on 2006-4-14 23:39:00
I'm (basically) comparing a multivariate probit (using the mvprobit
command) with a method employing inverse distributions and the sureg
command for their subsequent multivariate normal distribution. Even
using normal inverses of univariate probit probabilities, however, my
sureg log-likelihoods are a good deal higher (less negative) than the
mvprobit ones, which is unexpected. I'm wondering whether sureg perhaps
uses concentrated log-likelihoods. The manual doesn't cover this, but
does anybody out there have any information, or a similar experience?

Casey Quinn
Centre for Health Economics
University of York

18
Trevor posted on 2006-4-14 23:41:00
I have the following estimation problem: I am estimating a probit of y1
on y2 x1 with instrumental variables z, where y2 is one endogenous
variable, x1 is of dimension K, and z is of dimension L, L>1. I apply a
maximum likelihood optimization as described in
Wooldridge (2002), p. 477f.

I undertake a Rivers-Vuong endogeneity test as described in Wooldridge
(2002), p. 474.
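
(In two-step form this is roughly, with placeholder names y1 y2 x1 z1 z2:

. regress y2 x1 z1 z2
. predict double v2hat, residuals
. probit y1 y2 x1 v2hat
. test v2hat

where a significant coefficient on v2hat indicates endogeneity of y2.)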

Question: does there exist a test of overidentifying restrictions that
tests the validity of the instruments in this case? Or does anybody know
whether the ivreg2 command together with overid is applicable to probit
estimation with binary endogenous variables, too?

19
Trevor posted on 2006-4-14 23:43:00
I'm currently writing a research proposal for an analysis of a simple
panel dataset with two data points. That is, there's an entry survey
and an exit survey. This is the first time I've analyzed this sort of
data, and wonder if you folks have any thoughts about what issues might
come up other than the obvious. There's considerable variation in the
dependent variable with individuals as the unit of analysis (with
approximately 10,000 cases), but the primary independent variable of
interest is "contextual" or "environmental." That is, it will be
aggregated in a way such that about every 100 to 1,000 individuals will
have the same value for the variable. Most of the other control variables
won't be aggregated.

So, does the fact that the independent variable is aggregated raise
methodological issues that undermine the benefits of panel data? I
imagine that a lot of panel studies must account for these environment
variables, but what are the chief pitfalls and how does one adjust for
them? Would it be advisable or necessary to conduct a separate analysis
at a higher level of aggregation and compare the results? Is there a
good reference for this kind of analysis that wouldn't take me a
semester to read and digest?

