Tag: Stronger

Related posts	Board	Author	Replies/Views	Last post

What does Hillary stand for? Hillary Clinton in 2016	Real-World Economics (incl. financial current affairs)	So^^So 2015-4-17	1 1558	gloriaflora 2015-6-5 04:50:49
The Interrelationship Between Financial and Energy Markets	Finance (Theory)	大家开心 2014-11-12	30 4313	kexinkeqing 2014-12-19 00:51:01
J.P. Morgan Global Markets Outlook and Strategy - October 2014	Industry Analysis Reports	jhd0314 2014-11-10	5 2696	估值菜鸟 2014-11-17 04:06:39
Industrial Shift: the Structure of the New World Economy by Joe Atikian - [read permission 18]	Management Science and Engineering	tigerwolf 2014-8-16	9 483	kexinkeqing 2014-11-8 01:39:06
Leading Through Uncertainty: How Umpqua Bank Emerged from the Great Recession	Finance (Theory)	dadahawk 2014-7-22	0 1405	dadahawk 2014-7-22 11:07:57
Newcomer question: determining the cutoff score from an ROC curve	SPSS Forum	levil0411 2014-6-27	5 19723	biostat 2014-7-7 14:41:05
ISM Index	Finance (Practice)	rainbow19720731 2014-3-16	0 1372	rainbow19720731 2014-3-16 15:37:07
Big release, 72 pages: Morgan Stanley's commercial aviation Blue Paper, issued July 22	Industry Analysis Reports	wlz008 2013-7-26	0 1288	wlz008 2013-7-26 12:17:21
Request for a paper: Do democracies exhibit stronger international - [!reward_solved!]	Solved Requests	ywh19860616 2013-5-2	2 922	xllbl 2013-5-2 10:26:16
Migrant Labor Markets and the Welfare of Rural Households (China)	Papers	夸克之一 2012-8-12	78 7862	qinpeng1112 2013-3-13 21:38:20
Jefferies Group: Global Securities Market Investment Strategy, February 2013 (free)	Industry Analysis Reports	bigfoot0518 2013-2-5	0 1648	bigfoot0518 2013-2-6 00:10:47
HSBC: Korea Securities Market Investment Strategy 2013 (free)	Industry Analysis Reports	bigfoot0518 2013-1-9	0 1727	bigfoot0518 2013-1-10 21:17:59
I think I can, Chinese-English bilingual edition - [read permission 5]	Foreign Language Learning	reduce_fat 2012-7-13	7 351	reduce_fat 2012-8-8 01:57:23
Oil price above $101 as consumer confidence rises	Real-World Economics (incl. financial current affairs)	lzguo568 2011-12-28	0 1223	lzguo568 2011-12-28 16:10:21
UBS-Global Economic Comment Stronger Dollar-080908.pdf	Finance (Theory)	xiaochao 2008-9-8	0 1708	xiaochao 2008-9-8 13:52:00

Related blog entries

Shared: What does Hillary stand for? Hillary Clinton in 2016 4.11: So^^So 2015-4-17 17:50

What does Hillary stand for? Hillary Clinton in 2016

On April 4, The Economist published an article titled "What does Hillary stand for? Hillary Clinton in 2016". Mrs Clinton has had her eye on the top job, the presidency, for a long time. She nearly won it in 2008 and is in many ways a stronger candidate now. The game is rigged by big money. She has built a vast campaign machine, of course, with her husband, the former president. The moment Mrs Clinton turns the key, it will begin openly to suck up contributions, spit out sound bites and roll over her rivals. There is a bitter irony.

What we care about most is: what does Hillary stand for? It is big news this week, and this article is, after all, the cover story. According to her supporters, she flew nearly a million miles and visited 112 countries. If a foreign crisis occurs on her watch, she will already have been there, read the briefing book and had tea with the local power brokers. No other candidate of either party can boast as much. From this we can see that she has plenty of political capital, on the surface at least. She also understands Washington, DC, as well as anyone. Mrs Clinton made a habit of listening to, and working with, senators on both sides of the aisle, leading some Republicans publicly to regret having disliked her in the past.

On foreign policy, she says she is neither a realist nor an idealist but an "idealistic realist". Charles Schumer, her former Senate colleague from New York, called her "the most opaque person you'll ever meet in your life". On foreign policy, Mrs Clinton's pitch is that she would be tougher than Mr Obama. Many foreigners would welcome an American commander-in-chief who is genuinely engaged with the world outside America.

Sceptics raise two further worries about Mrs Clinton. She used a private server for her e-mails as secretary of state, released only the ones she deemed relevant and then deleted the rest. Some people think she is untrustworthy. The other worry, as Gary Hart said, is that "We should not be down to two families who are qualified to govern."

Will Hillary Clinton win in 2016? The article gives us a great deal of information about her disadvantages, although it appears to appreciate Mrs Clinton to some degree.

First, as we all know, she has spent years in politics and so is acquainted with many politicians. But this will lead people to focus on family politics rather than on the prospect of the first female president. Mr Obama is America's first black president, and the news cheered many people at the time, but it means nothing. A PR stunt about the first female president may repel people.

Second, she does not have a clear stance. Mrs Clinton is close to Wall Street, but she is also seen as a power-hungry statist. Giving both sides a stake in change is a good strategy for a politician, but for someone seeking election as president it means offending both sides. It has a negative effect on some voters, especially those in swing states. They will be puzzled and will not support her.

Third, sometimes there is simply a trend. To a certain extent, she may be too old to be president: only Reagan was older. As a Democratic president, Mr Obama has been unimpressive. Many foreigners would welcome an American commander-in-chief who is genuinely engaged with the world outside America, but the American people might not.

There are too many unpredictable risks on her road to the presidency. What happens after that remains to be seen.

Category: Weekly Economist Review | 40 reads | 0 comments

Shared: Doing Transformations with Stata: statalearning 2014-6-10 14:22

-------------------------------------------------------------------------------
Transformations: an introduction
-------------------------------------------------------------------------------

In data analysis transformation is the replacement of a variable by a function of that variable: for example, replacing a variable x by the square root of x or the logarithm of x. In a stronger sense, a transformation is a replacement that changes the shape of a distribution or relationship.

This help does not pretend to be comprehensive or even generous on literature citations. Various references that I have found helpful are sprinkled here and there. Two that have particularly shaped my understanding are Emerson and Stoto (1983) and Emerson (1983). Behind those articles lies the persistent emphasis placed on the value of transformations in the work of John Wilder Tukey (1915-2000).

This help item covers the following topics. You can read in sequence or skim directly to each section. Starred sections are likely to appear more esoteric or more difficult than the others to those new to the subject.

    Reasons for using transformations
    Review of most common transformations
    Psychological comments - for the puzzled
    How to do transformations in Stata
    * Transformations for proportions and percents
    * Transformations as a family
    * Transformations for variables that are both positive and negative

Typographical notes:

    ^ means raise to the power of whatever follows.
    _ means that whatever follows should be considered a subscript (written below the line).
    The Stata notation == for "is equal to" and != for "is not equal to" are used for tests of various true-or-false conditions.

Reasons for using transformations

There are many reasons for transformation. The list here is not comprehensive.

    1. Convenience
    2. Reducing skewness
    3. Equal spreads
    4. Linear relationships
    5. Additive relationships

If you are looking at just one variable, 1, 2 and 3 are relevant, while if you are looking at two or more variables, 4 and 5 are more important. However, transformations that achieve 4 and 5 very often achieve 2 and 3.

1. Convenience

A transformed scale may be as natural as the original scale and more convenient for a specific purpose (e.g. percentages rather than original data, sines rather than degrees).

One important example is standardisation, whereby values are adjusted for differing level and spread. In general

    standardised value = (value - level) / spread.

Standardised values have level 0 and spread 1 and have no units: hence standardisation is useful for comparing variables expressed in different units. Most commonly a standard score is calculated using the mean and standard deviation (sd) of a variable:

    z = (x - mean of x) / sd of x.

Standardisation makes no difference to the shape of a distribution.

2. Reducing skewness

A transformation may be used to reduce skewness. A distribution that is symmetric or nearly so is often easier to handle and interpret than a skewed distribution. More specifically, a normal or Gaussian distribution is often regarded as ideal as it is assumed by many statistical methods.

To reduce right skewness, take roots or logarithms or reciprocals (roots are weakest). This is the commonest problem in practice.

To reduce left skewness, take squares or cubes or higher powers.

3. Equal spreads

A transformation may be used to produce approximately equal spreads, despite marked variations in level, which again makes data easier to handle and interpret. Each data set or subset having about the same spread or variability is a condition called homoscedasticity: its opposite is called heteroscedasticity. (The spelling -sked- rather than -sced- is also used.)

4. Linear relationships

When looking at relationships between variables, it is often far easier to think about patterns that are approximately linear than about patterns that are highly curved. This is vitally important when using linear regression, which amounts to fitting such patterns to data. (In Stata, regress is the basic command for regression.)

For example, a plot of logarithms of a series of values against time has the property that periods with constant rates of change (growth or decline) plot as straight lines.

5. Additive relationships

Relationships are often easier to analyse when additive rather than (say) multiplicative. So

    y = a + bx

in which two terms a and bx are added is easier to deal with than

    y = ax^b

in which two terms a and x^b are multiplied. Additivity is a vital issue in analysis of variance (in Stata, anova, oneway, etc.).

In practice, a transformation often works, serendipitously, to do several of these at once, particularly to reduce skewness, to produce nearly equal spreads and to produce a nearly linear or additive relationship. But this is not guaranteed.

Review of most common transformations

The most useful transformations in introductory data analysis are the reciprocal, logarithm, cube root, square root, and square. In what follows, even when it is not emphasised, it is supposed that transformations are used only over ranges on which they yield (finite) real numbers as results.

Reciprocal

The reciprocal, x to 1/x, with its sibling the negative reciprocal, x to -1/x, is a very strong transformation with a drastic effect on distribution shape. It cannot be applied to zero values. Although it can be applied to negative values, it is not useful unless all values are positive. The reciprocal of a ratio may often be interpreted as easily as the ratio itself: e.g. population density (people per unit area) becomes area per person; persons per doctor becomes doctors per person; rates of erosion become time to erode a unit depth.
(In practice, we might want to multiply or divide the results of taking the reciprocal by some constant, such as 1000 or 10000, to get numbers that are easy to manage, but that itself has no effect on skewness or linearity.)

The reciprocal reverses order among values of the same sign: largest becomes smallest, etc. The negative reciprocal preserves order among values of the same sign.

Logarithm

The logarithm, x to log base 10 of x, or x to log base e of x (ln x), or x to log base 2 of x, is a strong transformation with a major effect on distribution shape. It is commonly used for reducing right skewness and is often appropriate for measured variables. It cannot be applied to zero or negative values. One unit on a logarithmic scale means a multiplication by the base of logarithms being used.

Exponential growth or decline

    y = a exp(bx)

is made linear by

    ln y = ln a + bx

so that the response variable y should be logged. (Here exp() means raising e, approximately 2.71828, the base of natural logarithms, to the power given.)

An aside on this exponential growth or decline equation: put x = 0, and

    y = a exp(0) = a,

so that a is the amount or count when x = 0. If a > 0 and b > 0, then y grows at a faster and faster rate (e.g. compound interest or unchecked population growth), whereas if a > 0 and b < 0, y declines at a slower and slower rate (e.g. radioactive decay).

Power functions

    y = ax^b

are made linear by

    log y = log a + b log x

so that both variables y and x should be logged.

An aside on such power functions: put x = 0, and for b > 0,

    y = ax^b = 0,

so the power function for positive b goes through the origin, which often makes physical or biological or economic sense. Think: does zero for x imply zero for y? This kind of power function is a shape that fits many data sets rather well.

Consider ratios

    y = p / q

where p and q are both positive in practice. Examples are males / females; dependants / workers; downstream length / downvalley length.
Then y is somewhere between 0 and infinity, or in the last case, between 1 and infinity. If p = q, then y = 1. Such definitions often lead to skewed data, because there is a clear lower limit and no clear upper limit. The logarithm, however, namely

    log y = log (p / q) = log p - log q,

is somewhere between -infinity and infinity and p = q means that log y = 0. Hence the logarithm of such a ratio is likely to be more symmetrically distributed.

Cube root

The cube root, x to x^(1/3), is a fairly strong transformation with a substantial effect on distribution shape: it is weaker than the logarithm. It is also used for reducing right skewness, and has the advantage that it can be applied to zero and negative values. Note that the cube root of a volume has the units of a length. It is commonly applied to rainfall data.

Applicability to negative values requires a special note. Consider

    (2)(2)(2) = 8 and (-2)(-2)(-2) = -8.

These examples show that the cube root of a negative number has negative sign and the same absolute value as the cube root of the equivalent positive number. A similar property is possessed by any other root whose power is the reciprocal of an odd positive integer (powers 1/3, 1/5, 1/7, etc.).

This property is a little delicate. For example, change the power just a smidgen from 1/3, and we can no longer define the result as a product of precisely three terms. However, the property is there to be exploited if useful.

Square root

The square root, x to x^(1/2) = sqrt(x), is a transformation with a moderate effect on distribution shape: it is weaker than the logarithm and the cube root. It is also used for reducing right skewness, and also has the advantage that it can be applied to zero values. Note that the square root of an area has the units of a length. It is commonly applied to counted data, especially if the values are mostly rather small.
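Two points from the review so far lend themselves to a quick numerical check: odd roots extend to negative values with the sign carried through, and the logarithm of a ratio treats p / q and q / p symmetrically about zero. What follows is only a minimal sketch, written in Python rather than Stata so that it can be run on its own; the function names signed_root and log_ratio are hypothetical and appear nowhere in the help text. (A general-purpose power routine need not handle the negative case: in Python, for instance, raising a negative float to the power 1/3 with ** gives a complex number rather than a negative real root.)

```python
import math


def signed_root(x: float, k: int = 3) -> float:
    """Real k-th root for odd k, keeping the sign of x.

    Hypothetical helper: computes sign(x) * |x|^(1/k), which is the
    real root of x when k is an odd positive integer (3, 5, 7, ...).
    """
    if k <= 0 or k % 2 == 0:
        raise ValueError("only odd positive roots extend to negative values")
    return math.copysign(abs(x) ** (1.0 / k), x)


def log_ratio(p: float, q: float) -> float:
    """log(p / q) written as log p - log q; requires p > 0 and q > 0."""
    if p <= 0 or q <= 0:
        raise ValueError("both p and q must be positive")
    return math.log(p) - math.log(q)


if __name__ == "__main__":
    for value in (-27.0, -8.0, 0.0, 8.0, 27.0):
        print(value, signed_root(value))
    # Swapping numerator and denominator only flips the sign of the log ratio.
    print(log_ratio(3.0, 2.0), log_ratio(2.0, 3.0))
```

The sign is applied after taking the root of the absolute value, so the result never depends on how the underlying power routine treats negative bases.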
Square

The square, x to x^2, has a moderate effect on distribution shape and it could be used to reduce left skewness. In practice, the main reason for using it is to fit a response by a quadratic function y = a + b x + c x^2. Quadratics have a turning point, either a maximum or a minimum, although the turning point in a function fitted to data might be far beyond the limits of the observations. The distance of a body from an origin is a quadratic if that body is moving under constant acceleration, which gives a very clear physical justification for using a quadratic. Otherwise quadratics are typically used solely because they can mimic a relationship within the data region. Outside that region they may behave very poorly, because they take on arbitrarily large values for extreme values of x, and unless the intercept a is constrained to be 0, they may behave unrealistically close to the origin.

Squaring usually makes sense only if the variable concerned is zero or positive, given that (-x)^2 and x^2 are identical.

Which transformation?

The main criterion in choosing a transformation is: what works with the data? As examples above indicate, it is important to consider as well two questions.

What makes physical (biological, economic, whatever) sense, for example in terms of limiting behaviour as values get very small or very large? This question often leads to the use of logarithms.

Can we keep dimensions and units simple and convenient? If possible, we prefer measurement scales that are easy to think about. The cube root of a volume and the square root of an area both have the dimensions of length, so far from complicating matters, such transformations may simplify them. Reciprocals usually have simple units, as mentioned earlier. Often, however, somewhat complicated units are a sacrifice that has to be made.

Psychological comments - for the puzzled

The main motive for transformation is greater ease of description.
Although transformed scales may seem less natural, this is largely a psychological objection. Greater experience with transformation tends to reduce this feeling, simply because transformation so often works so well. In fact, many familiar measured scales are really transformed scales: decibels, pH and the Richter scale of earthquake magnitude are all logarithmic.

However, transformations cause debate even among experienced data analysts. Some use them routinely, others much less. Various views, extreme or not so extreme, are slightly caricatured here to stimulate reflection or discussion. For what it is worth, I consider all these views defensible, or at least understandable.

    "This seems like a kind of cheating. You don't like how the data are, so you decide to change them."

    "I see that this is a clever trick that works nicely. But how do I know when this trick will work with some other data, or if another trick is needed, or if no transformation is needed?"

    "Transformations are needed because there is no guarantee that the world works on the scales it happens to be measured on."

    "Transformations are most appropriate when they match a scientific view of how a variable behaves."

Often it helps to transform results back again, using the reverse or inverse transformation:

    reciprocal     t = 1 / x             reciprocal         x = 1 / t
    log base 10    t = log_10 x          10 to the power    x = 10^t
    log base e     t = log_e x = ln x    e to the power     x = exp(t)
    log base 2     t = log_2 x           2 to the power     x = 2^t
    cube root      t = x^(1/3)           cube               x = t^3
    square root    t = x^(1/2)           square             x = t^2

How to do transformations in Stata

Basic first steps

1. Draw a graph of the data to see how far patterns in data match the simplest ideal patterns. Try dotplot or scatter as appropriate.

2. See what range the data cover. Transformations will have little effect if the range is small.

3. Think carefully about data sets including zero or negative values.
Some transformations are not defined mathematically for some values, and often they make little or no scientific sense. For example, I would never transform temperatures in degrees Celsius or Fahrenheit for these reasons (unless to Kelvin).

Standard scores (mean 0 and sd 1) in a new variable are obtained by

    . egen stdpopi = std(popi)

whereas the basic transformations can all be put in new variables by generate:

    . gen recener = 1/energy
    . gen logeener = ln(energy)
    . gen l10ener = log10(energy)
    . gen curtener = energy^(1/3)
    . gen sqrtener = sqrt(energy)
    . gen sqener = energy^2
    . gen logitp = logit(p)                  (if p is a proportion)
    . gen logitp = logit(p / 100)            (if p is a percent)
    . gen frootp = sqrt(p) - sqrt(1-p)       (if p is a proportion)
    . gen frootp = sqrt(p) - sqrt(100-p)     (if p is a percent)

Cube roots of negative numbers require special care. Stata uses a general routine to calculate powers and does not look for special cases of powers. Whenever negative values are present, a more general recipe for cube roots is sign(x) * (abs(x)^(1/3)). Similar comments apply to fifth roots, seventh roots, etc.

Note any messages about missing values carefully: unless you had missing values in the original variable, they indicate an attempt to apply a transformation when it is not defined. (Do you have zero or negative values, for example?)

It is not always necessary to create a transformed variable before working with it. In particular, many graph commands allow the options yscale(log) and xscale(log). This is very useful because the graph is labelled using the original values, but it does not leave behind a log-transformed variable in memory.

Other commands

Stata offers various other commands designed to help you choose a transformation. ladder, gladder and qladder try several transformations of a variable with the aim of showing how far they produce a more nearly normal (Gaussian) distribution.
In practice such commands can be helpful, or they can be confusing at an introductory level: for example, they can suggest a transformation at odds with what your scientific knowledge would indicate.

boxcox and lnskew0 are more advanced commands that should be used only after studying textbook explanations of what they do. Box and Cox (1964) is the key original reference.

For some statistical people any debate about transformation is largely side-stepped by the advent of generalised linear models. In such models, estimation is carried out on a transformed scale using a specified link function, but results are reported on the original scale of the response. The Stata command is glm.

Transformations for proportions and percents (more advanced)

Data that are proportions (between 0 and 1) or percents (between 0 and 100) often benefit from special transformations. The most common is the logit (or logistic) transformation, which is

    logit p = log (p / (1 - p)) for proportions

OR

    logit p = log (p / (100 - p)) for percents

where p is a proportion or percent.

This transformation treats very small and very large values symmetrically, pulling out the tails and pulling in the middle around 0.5 or 50%. The plot of p against logit p is thus a flattened S-shape. Strictly, logit p cannot be determined for the extreme values of 0 and 1 (100%): if they occur in data, there needs to be some adjustment.

One justification for this logit transformation might be sketched in terms of a diffusion process such as the spread of literacy. The push from zero to a few percent might take a fair time; once literacy starts spreading its increase becomes more rapid and then in turn slows; and finally the last few percent may be very slow in converting to literacy, as we are left with the isolated and the awkward, who are the slowest to pick up any new thing. The resulting curve is thus a flattened S-shape against time, which in turn is made more nearly linear by taking logits of literacy.
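The logit definitions for proportions and percents can be checked with a few lines of code. This is a hedged sketch in Python, not Stata: the names logit and inv_logit and the total argument (1 for proportions, 100 for percents) are illustrative assumptions, and the inverse is not given in the text but follows from solving t = log(p / (total - p)) for p.

```python
import math


def logit(p: float, total: float = 1.0) -> float:
    """log(p / (total - p)); total is 1 for proportions, 100 for percents.

    Undefined at the extremes 0 and total, so those raise an error
    rather than being adjusted silently.
    """
    if not 0.0 < p < total:
        raise ValueError("logit needs 0 < p < total")
    return math.log(p / (total - p))


def inv_logit(t: float, total: float = 1.0) -> float:
    """Back-transform: solves t = log(p / (total - p)) for p."""
    return total / (1.0 + math.exp(-t))


if __name__ == "__main__":
    # Equal steps in p near the tails are stretched far more than
    # equal steps near the middle.
    for p in (0.01, 0.1, 0.5, 0.9, 0.99):
        print(p, logit(p))
```

Values equally far from 0.5 map to logits of equal size and opposite sign, which is the symmetry described above; how to adjust observed values of exactly 0 or 1 is left open here, as it is in the text.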
More formally, this diffusion argument might be justified by imagining that adoption (infection, whatever) is proportional to the number of contacts between those who do and those who do not, which will rise and then fall quadratically.

More generally, there are many relationships in which predicted values cannot logically be less than 0 or more than 1 (100%). Using logits is one way of ensuring this: otherwise models may produce absurd predictions.

The logit (looking only at the case of proportions)

    logit p = log (p / (1 - p))

can be rewritten

    logit p = log p - log (1 - p)

and in this form it can be seen as a member of a set of folded transformations

    transform of p = something done to p - something done to (1 - p).

This way of writing it brings out the symmetrical way in which very high and very low values are treated. (If p is small, 1 - p is large, and vice versa.) The logit is occasionally called the folded log. The simplest other such transformation is the folded root (that means square root)

    folded root of p = root of p - root of (1 - p).

As with square roots and logarithms generally, the folded root has the advantage that it can be applied without adjustment to data values of 0 and 1 (100%). The folded root is a weaker transformation than the logit. In practice it is used far less frequently.

Two other transformations for proportions and percents met in the older literature (and still used occasionally) are the angular and the probit. The angular is

    arcsin(root of p)

or the angle whose sine is the square root of p. In practice, it behaves very like

    p^0.41 - (1 - p)^0.41,

which in turn is close to

    p^0.5 - (1 - p)^0.5,

which is another way of writing the folded root (Tukey 1960). The probit is a transformation with a mathematical connection to the normal (Gaussian) distribution, which is not only very similar in behaviour to the logit, but also more awkward to work with.
As a result, it is now less seen, except in more advanced applications, where it retains several advantages.

Transformations as a family (more advanced)

The main transformations mentioned previously, with the exception of the logarithm, namely the reciprocal, cube root, square root and square, are all powers. The powers concerned are

    reciprocal     -1
    cube root      1/3
    square root    1/2
    square         2

Note that the sequence of explanation was not capricious, but in numerical order of power. Therefore, these transformations are all members of a family. In addition, contrary to what may appear at first sight, the logarithm really belongs in the family too. Knowing this is important to appreciating that the transformations used in practice are not just a bag of tricks, but a series of tools of different sizes or strengths, like a set of screwdrivers or drill bits. We could thus fill out this sequence, the ladder of transformations as it is sometimes known, with more powers, as for example in

    reciprocal square    -2
    reciprocal           -1
    (yields one)         0
    cube root            1/3
    square root          1/2
    identity             1
    square               2
    cube                 3
    fourth power         4

Among the additions here, the identity transformation, say x^1 = x, is the transformation that is, in a sense, no transformation. The graph of x against x is naturally a straight line and so the power of 1 divides transformations whose graph is convex upwards (powers less than 1) from transformations whose graph is concave upwards (powers greater than 1). Powers less than 1 squeeze high values together and stretch low values apart, and powers more than 1 do the opposite.

The transformation x^0, on the other hand, is degenerate, as it always yields 1 as a result. However, we will now see that in a strong sense log x (meaning, strictly, the natural logarithm or ln x) really belongs in the family at the position of power 0.

If you know calculus, you will know that the sequence of powers

    ..., x^-3, x^-2, x^-1, x^0, x^1, x^2, ...

has as integrals, apart from additive constants,

    ..., -x^-2 / 2, -x^-1, ln x, x, x^2 / 2, x^3 / 3, ...

and the mapping can be reversed by differentiation. So integrating x^(p - 1) yields x^p / p, unless p is 0, in which case it yields ln x. Thus we can define a family

    t_p(x) = x^p   if p != 0,
           = ln x  if p == 0.

The notion of choosing from a family when we choose a power or logarithm is a key idea. It follows that we can usually choose a different member of the family if the transformation turns out to be too weak, or too strong, for our purpose and our data.

Many discussions of transformations focus on slightly different families, for a variety of mathematical and statistical reasons. The canonical reference here is Box and Cox (1964), although note also earlier work by Tukey (1957). Most commonly, the definition is changed to

    t_p(x) = (x^p - 1) / p  if p != 0,
           = ln x           if p == 0.

This t_p(x) has various properties which point up family resemblances.

    1. ln x is the limit as p -> 0 of (x^p - 1) / p.

    2. At x = 1, t_p(x) = 0, for all p.

    3. The first derivative (rate of change) of t_p(x) is x^(p - 1) if p != 0 and 1 / x if p == 0. At x = 1, this is always 1.

    4. The second derivative of t_p(x) is (p - 1) x^(p - 2) if p != 0 and -1 / x^2 if p == 0. At x = 1, this is always (p - 1).

Another small change of definition has some similar consequences, but also some other advantages. Consider

    t_p(x) = [(x + 1)^p - 1] / p  if p != 0,
           = ln(x + 1)            if p == 0.

This t_p(x) has various properties which also point up family resemblances.

    1. If p = 1, t_p(x) = x.

    2. At x = 0, t_p(x) = 0, for all p. So all curves start at the origin.

    3. The first derivative (rate of change) of t_p(x) is (x + 1)^(p - 1) if p != 0 and 1 / (x + 1) if p == 0. At x = 0, this is always 1. So the curves have the same slope at the origin.

    4. The second derivative of t_p(x) is (p - 1) (x + 1)^(p - 2) if p != 0 and -1 / (x + 1)^2 if p == 0. At x = 0, this is always (p - 1).
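The family resemblances listed above can be verified numerically. The sketch below is in Python rather than Stata and is only an illustration under stated assumptions: the function names power_family and shifted_power_family are hypothetical, the first implements (x^p - 1) / p with ln x at p == 0, and the second implements the shifted form [(x + 1)^p - 1] / p with ln(x + 1) at p == 0.

```python
import math


def power_family(x: float, p: float) -> float:
    """t_p(x) = (x^p - 1) / p for p != 0, ln x for p == 0; needs x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    if p == 0:
        return math.log(x)
    return (x ** p - 1.0) / p


def shifted_power_family(x: float, p: float) -> float:
    """t_p(x) = ((x + 1)^p - 1) / p for p != 0, ln(x + 1) for p == 0; needs x > -1."""
    if x <= -1:
        raise ValueError("x must exceed -1")
    if p == 0:
        return math.log1p(x)
    return ((x + 1.0) ** p - 1.0) / p


def slope(f, x: float, p: float, h: float = 1e-6) -> float:
    """Central-difference estimate of the first derivative of f(., p) at x."""
    return (f(x + h, p) - f(x - h, p)) / (2.0 * h)


if __name__ == "__main__":
    for p in (-1.0, 0.0, 1.0 / 3.0, 0.5, 1.0, 2.0):
        # Every member is 0 with slope 1 at x = 1 (first family)
        # and at x = 0 (shifted family).
        print(p, power_family(1.0, p), slope(power_family, 1.0, p),
              shifted_power_family(0.0, p), slope(shifted_power_family, 0.0, p))
```

Because every member shares the same value and slope at the reference point, the curves differ there only in curvature, (p - 1), which is one way to read p as the strength of the transformation.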
The most useful consequence of the shifted definition t_p(x) = [(x + 1)^p - 1] / p, however, is that it can be extended more easily to variables that can be both positive and negative, as will now be seen.

Transformations for variables that are both positive and negative (more advanced)

Most of the literature on transformations focuses on one or both of two related situations: the variable concerned is strictly positive; or it is zero or positive. If the first situation does not hold, some transformations do not yield real number results (notably, logarithms and reciprocals); if the second situation does not hold, then some other transformations do not yield real number results or more generally do not appear useful (notably, square roots or squares).

However, in some situations response variables in particular can be both positive and negative. This is common whenever the response is a balance, change, difference or derivative. Although such variables are often skew, the most awkward property that may invite transformation is heavy (long or fat) tails, high kurtosis in one terminology. Zero usually has a strong substantive meaning, so that we wish to preserve the distinction between negative, zero and positive values. (Note that Celsius or Fahrenheit temperatures do not really qualify here, as their zero points are statistically arbitrary, for all the importance of whether water melts or freezes.)

In these circumstances, experience with right-skewed and strictly positive variables might suggest looking for a transformation that behaves like ln x when x is positive and like -ln(-x) when x is negative. This still leaves the problem of what to do with zeros. In addition, it is clear from any sketch that (in Stata terms) cond(x <= 0, -ln(-x), ln(x)) would be useless. One way forward is to use

    -ln(-x + 1)  if x <= 0,
    ln(x + 1)    if x > 0.

This can also be written

    sign(x) ln(|x| + 1)

where sign(x) is 1 if x > 0, 0 if x == 0 and -1 if x < 0.
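The two ways of writing this transformation can be compared directly. The following is a minimal Python sketch, not Stata code; the function names are hypothetical, and the back-transform is included only for a round-trip check, obtained by solving t = sign(x) ln(|x| + 1) for x.

```python
import math


def signed_log(x: float) -> float:
    """sign(x) * ln(|x| + 1), the compact form of the transformation."""
    return math.copysign(math.log1p(abs(x)), x)


def signed_log_piecewise(x: float) -> float:
    """The same transformation written branch by branch."""
    if x <= 0:
        return -math.log(-x + 1.0)
    return math.log(x + 1.0)


def inverse_signed_log(t: float) -> float:
    """Back-transform: 1 - exp(-t) for t <= 0, exp(t) - 1 for t > 0."""
    return math.copysign(math.expm1(abs(t)), t)


if __name__ == "__main__":
    for x in (-1000.0, -10.0, -0.5, 0.0, 0.5, 10.0, 1000.0):
        print(x, signed_log(x), signed_log_piecewise(x))
```

Zero maps to zero and signs are preserved, so the distinction between negative, zero and positive values survives the transformation.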
This function passes through the origin, behaves like x for small x, positive and negative, and like sign(x) ln(abs(x)) for large |x|. The gradient is steepest at 1 at x = 0, so the transformation pulls in extreme values relative to those near the origin. It has recently been dubbed the neglog transformation (Whittaker et al. 2005). An earlier reference is John and Draper (1980). In Stata language, this could be cond(x = 0, -ln(-x + 1), ln(x + 1)) or sign(x) * ln(abs(x) + 1) The inverse transformation is cond(t = 0, 1 - exp(-t), exp(t) - 1) A suitable generalisation of powers other than 0 is - / p if x = 0, / p if x 0. Transformations that affect skewness as well as heavy tails in variables that are both positive and negative were discussed by Yeo and Johnson (2000). Another possibility in this terrain is to apply the inverse hyperbolic function arsinh (also known as arg sinh, sinh^-1 and arcsinh). This is the inverse of the sinh function, which in turn is defined as sinh(x) = (exp(x) - exp(-x)) / 2. The sinh and arsinh functions can be computed in Mata as sinh(x) and asinh(x) and in Stata as (exp(x) - exp(-x))/2 and ln(x + sqrt(x^2 + 1)) . The arsinh function also too passes through the origin and is steepest at the origin. For large |x| it behaves like sign(x) ln(|2x|). So in practice neglog(x) and arsinh(x) have loosely similar effects. See also Johnson (1949). Acknowledgements Austin Nichols pointed out that cube roots are well defined for negative values. Author Nicholas J. Cox, Durham University n.j.cox@durham.ac.uk (last major revision 29 November 2005; corrections and minor revisions 8 November 2006, 25 July 2007) Postscript I came across the following in a text on calculus. Transformation of a function into a form in which it can readily be integrated can be effected by suitable algebraical substitutions in which the independent variable is changed. 
The forms these take will depend on the kind of function to be integrated and, in general, experience and experiment must guide the student. The general aim will be to simplify the function so that it may become easier to integrate. (Abbott 1940, p.184)

Modulo some small changes in terminology, this applies here too. Either way, the advice that "experience and experiment must guide the student" is not much comfort to the beginner looking for guidance!

References

Abbott, P. 1940. Teach Yourself Calculus. London: English Universities Press.
Box, G.E.P. and D.R. Cox. 1964. An analysis of transformations. Journal of the Royal Statistical Society B 26: 211-252.
Emerson, J.D. 1983. Mathematical aspects of transformation. In Hoaglin, D.C., F. Mosteller and J.W. Tukey (eds) Understanding Robust and Exploratory Data Analysis. New York: John Wiley, 247-282.
Emerson, J.D. and M.A. Stoto. 1983. Transforming data. In Hoaglin, D.C., F. Mosteller and J.W. Tukey (eds) Understanding Robust and Exploratory Data Analysis. New York: John Wiley, 97-128.
John, J.A. and N.R. Draper. 1980. An alternative family of transformations. Applied Statistics 29: 190-197.
Johnson, N.L. 1949. Systems of frequency curves generated by methods of translation. Biometrika 36: 149-176.
Tukey, J.W. 1957. On the comparative anatomy of transformations. Annals of Mathematical Statistics 28: 602-632.
Tukey, J.W. 1960. The practical relationship between the common transformations of percentages or fractions and of amounts. Reprinted in Mallows, C.L. (ed.) 1990. The Collected Works of John W. Tukey. Volume VI: More Mathematical. Pacific Grove, CA: Wadsworth Brooks-Cole, 211-219.
Whittaker, J., J. Whitehead and M. Somers. 2005. The neglog transformation and quantile regression for the analysis of a large credit scoring database. Applied Statistics 54: 863-878.
Yeo, I. and R.A. Johnson. 2000. A new family of power transformations to improve normality or symmetry. Biometrika 87: 954-959.
Also see On-line: generate, egen, graph
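
The neglog and arsinh transformations discussed in this help file can be tried out numerically. What follows is a minimal Python sketch, not part of the original Stata help: the function names sign, neglog, neglog_inverse, signed_power and arsinh_formula are made up for illustration, and signed_power assumes the power generalisation takes the form sign(x) * ((|x| + 1)^p - 1) / p. The loop prints both transformations side by side so that their loosely similar behaviour can be inspected.

```python
import math


def sign(x):
    """Return 1 for positive x, 0 for zero and -1 for negative x."""
    return (x > 0) - (x < 0)


def neglog(x):
    """neglog transformation: sign(x) * ln(|x| + 1)."""
    return sign(x) * math.log1p(abs(x))


def neglog_inverse(t):
    """Inverse of neglog: 1 - exp(-t) if t <= 0, exp(t) - 1 if t > 0."""
    return sign(t) * math.expm1(abs(t))


def signed_power(x, p):
    """Assumed power generalisation: sign(x) * ((|x| + 1)**p - 1) / p.

    p == 0 is treated as the limiting case, which is neglog itself.
    """
    if p == 0:
        return neglog(x)
    return sign(x) * ((abs(x) + 1) ** p - 1) / p


def arsinh_formula(x):
    """arsinh written out as ln(x + sqrt(x^2 + 1)), as in the Stata expression."""
    return math.log(x + math.sqrt(x * x + 1))


# Compare the two transformations on values of both signs, including zero.
for value in (-1000.0, -10.0, -0.1, 0.0, 0.1, 10.0, 1000.0):
    print(value, neglog(value), math.asinh(value))
```

The library function math.asinh is used in the loop rather than the written-out formula because ln(x + sqrt(x^2 + 1)) loses accuracy for large negative x, where the two terms nearly cancel.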

Shared: Personal Income And Spending Weigh On Economic Recovery Hopes: insight 2012-12-3 11:04

Submitted by Tyler Durden on 12/02/2012 11:28 -0500

Via Lance Roberts of Street Talk Live,

The personal income and spending report Friday morning left a lot to be desired for those expecting a stronger economic environment soon. However, the report fell well in line with what I have been expecting over the past several months (see here, here and here), as the drag on real wages and incomes has weighed on the consumer. As we discussed in yesterday's report on GDP, personal consumption makes up more than 70% of the economy, so changes to employment, incomes or credit have an immediate and significant impact on growth.

First, let me argue against the claim that the impact on personal incomes was due to Hurricane Sandy. While the storm is going to be the excuse for everything from economic reports to global warming, the impact from Sandy on personal incomes was most likely very limited. The storm did not occur until the last two days of the month. Even if we assume that everyone in the Northeast was paid hourly, all quit their jobs five days before the hurricane, and then left town, the overall impact on the entire month of personal incomes for the entire country would still be fairly limited.

Secondly, the hurricane excuse doesn't account for the negative revisions to the personal income data going back to April of this year. The chart below shows the level of personal incomes both pre- and post-revisions in October. These revisions also resolve some of the imbalances that we have noted between reported personal income data and other economic reports pre-election.
We have suggested that many of these anomalies would be revised away in the months ahead, which we are now seeing come to fruition.

However, for the sake of argument let's assume that the BEA is correct in its statement that 24 states were affected by Sandy for a total of about $18 billion at an annual rate. This still doesn't explain the complete lack of income growth nationwide. The chart below shows the contributions to personal incomes over recent months. More curious was the very large jump in interest income for the month of October after two previous months of decline. Absent that bump in interest income, overall personal incomes would have been negative for the month. However, in the next month or two we should see the estimates used to account for the impact of the storm revised with actual data. This could show a minor increase to the October data.

Moving on to personal spending, it is not surprising that the previous estimates of spending were likewise revised down in October to reflect weaker income growth. The chart below shows the revision to the major categories of spending for the months of July, August and September. These negative revisions show that spending in the previous months was far less robust than previously estimated, which is likely to lead to a downward revision of Q3 GDP next month.

The continuing problem that faces the economy remains the impact of the rising cost of living, which is offsetting increases in compensation. We stated in our last report that:

"...it is important to note that wage and salary disbursements have risen since the beginning of the year which has contributed to the increase in personal incomes. However, the recent rise in wages has been very nascent and has come very late in the current economic expansion. Secondly, the rise in wages has been more than offset by a large surge in food and energy costs in recent months as shown in the chart below.
While the annualized rate of change in wage and salary disbursements rose again in September, continuing a steady trend since the beginning of this year, food and energy as a percentage of wages and salaries surged substantially more. The problem with this is that it grossly impacts the consumer. In the most recent report, personal incomes rose by 0.4% while consumer spending surged by 0.8%. Unfortunately, when spending outstrips income the difference has to come either from savings or credit. The chart below shows food and energy as a percentage of disposable personal incomes (DPI) versus the personal savings rate as a percentage of DPI. See the problem here? While core CPI remains very mild, rising food and energy costs at the headline have an immediate impact on the consumer's ability to make ends meet."

This is still the case this month. In October the annualized rate of change in wages showed an increase of 3.04%, which was enough of an increase to keep food and energy at 22% of wages.

Are We There Yet?

When it comes to the economy, and particularly the ongoing recession watch that has nearly become a sporting event, it is real (inflation-adjusted) incomes that matter. In the most recent report we see that real personal incomes declined for the month from $11,546 billion to $11,532 billion, a change of -0.12%.

Doug Short always does an excellent job of tracking the four primary indicators used in distinguishing periods of economic expansion from contraction:

"At this point, with all indicators for October on the books, the average of the Big Four (the gray line in the chart above) shows us that economic expansion since the last recession has been hovering around a flat line for the past seven months. Are we tipping into a recession? ECRI has reinforced its claim that we are in a recession and puts the cycle peak in July (more here).
On the other hand, a post-Sandy rebound, good holiday sales and favorably received outcome to Fiscal Cliff negotiations could easily put the economy into indisputable expansion mode. As for the recent data, of course they are subject to revision, so we must view these numbers accordingly."

He is correct in his assessment that we are not currently in a recession. However, I am not optimistic the post-Sandy rebound will be enough to stem the tide of contracting wage growth. I am also less than convinced that Washington will come to a resolution for the fiscal cliff and the debt ceiling before it impacts the economy further.

The next couple of months will be very telling about the strength of the underlying economy. The manufacturing data continue to point to further economic weakness, hiring plans have deteriorated and the main drivers of economic growth have all stagnated. With the recession in Europe continuing to erode exports and hurt corporate profitability, this leaves investors exposed to a sharp valuation adjustment in the months ahead. While we can hope to get lucky that things will work out for the best, "hope" rarely works out as an investment strategy.


