[LaTeX version] Filler thread

Post #1661

oliyiyi posted on 2015-12-18 14:56:46

Here is a simplified illustration of Bayesian inference when data are noisy. Suppose
there is a manufacturer of inflated bouncy balls, and the balls are produced in four
discrete sizes, namely diameters of 1.0, 2.0, 3.0, and 4.0 (on some scale of distance
such as decimeters). The manufacturing process is quite variable, however, because of
randomness in degrees of inflation even for a single size ball. Thus, balls of manufactured
size 3 might have diameters of 1.8 or 4.2, even though their average diameter is 3.0.
Suppose we submit an order to the factory for three balls of size 2. We receive three balls
and measure their diameters as best we can, and find that the three balls have diameters
of 1.77, 2.23, and 2.70. From those measurements, can we conclude that the factory
correctly sent us three balls of size 2, or did the factory send size 3 or size 1 by mistake,
or even size 4?
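
The question can be made concrete with a small calculation. The sketch below is only an illustration: it assumes Gaussian measurement noise with a made-up standard deviation (`NOISE_SD`) and an equal prior credibility for the four sizes, neither of which is stated in the quoted passage. The function and variable names are hypothetical.

```python
import math

# Assumptions (not from the quoted passage): each measured diameter is the
# nominal size plus Gaussian noise with standard deviation NOISE_SD, and all
# four sizes are equally credible before seeing the data.
SIZES = [1.0, 2.0, 3.0, 4.0]
NOISE_SD = 1.0  # hypothetical value, chosen only for illustration
MEASURED = [1.77, 2.23, 2.70]


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_over_sizes(data, sizes, sd, prior=None):
    """Return the posterior probability of each candidate size given the data."""
    if prior is None:
        prior = [1.0 / len(sizes)] * len(sizes)
    # Unnormalized posterior: prior times the product of per-ball likelihoods.
    unnormalized = [
        p * math.prod(normal_pdf(x, size, sd) for x in data)
        for p, size in zip(prior, sizes)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


posterior = posterior_over_sizes(MEASURED, SIZES, NOISE_SD)
for size, prob in zip(SIZES, posterior):
    print(f"size {size:.0f}: posterior probability {prob:.3f}")
```

Under these assumptions size 2 comes out as the most credible, with size 3 second. A smaller `NOISE_SD` concentrates the credibility more sharply on size 2, and a larger one spreads it out.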

Post #1662

nabula_789 posted on 2015-12-18 14:58:10

Great material!

Post #1663

oliyiyi posted on 2015-12-18 15:00:39

Identify the data relevant to the research questions. What are the measurement scales
of the data? Which data variables are to be predicted, and which data variables are
supposed to act as predictors?

Post #1664

oliyiyi posted on 2015-12-18 15:01:30

Define a descriptive model for the relevant data. The mathematical form and its
parameters should be meaningful and appropriate to the theoretical purposes of the
analysis.

Post #1665

oliyiyi posted on 2015-12-18 15:03:00

Users who are short on forum coins can visit the collection of reward-for-reply threads:
https://bbs.pinggu.org/thread-3990750-1-1.html

Post #1666

oliyiyi posted on 2015-12-18 15:09:33

The second step is to define a descriptive model of the data that is meaningful
for our research interest. At this point, we are interested merely in identifying a basic
trend between weight and height, and it is not absurd to think that weight might be
proportional to height, at least as an approximation over the range of adult weights and
heights. Therefore, we will describe predicted weight as a multiplier times height plus a
baseline. We will denote the predicted weight as ŷ (spoken “y hat”), and we will denote
the height as x. Then the idea that predicted weight is a multiple of height plus a baseline
can be denoted mathematically as follows:
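
The equation itself did not make it into the post. Going by the wording above (a multiplier times height plus a baseline) and by the next excerpt, which names the coefficient β1, the baseline β0, and "Equation 2.1", it presumably has this form:

```latex
\hat{y} = \beta_1 x + \beta_0 \tag{2.1}
```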

Post #1667

oliyiyi posted on 2015-12-18 15:10:56

The coefficient, β1 (Greek letter “beta”), indicates how much the predicted weight
increases when the height goes up by one inch. The baseline is denoted β0 in
Equation 2.1, and its value represents the weight of a person who is zero inches tall.
You might suppose that the baseline value should be zero, a priori, but this need not be
the case for describing the relation between weight and height of mature adults, who
have a limited range of height values far above zero. Equation 2.1 is the form of a line,
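
A minimal sketch of the two coefficients at work. The numeric values of `BETA1` and `BETA0`, and the choice of pounds as the weight unit, are made up for illustration and do not come from the book's data.

```python
def predicted_weight(height_in, beta1, beta0):
    """Predicted weight: a multiplier times height plus a baseline."""
    return beta1 * height_in + beta0


# Hypothetical coefficients, for illustration only.
BETA1 = 4.0     # change in predicted weight (pounds) per inch of height
BETA0 = -100.0  # baseline: predicted weight at a height of zero inches

at_70 = predicted_weight(70.0, BETA1, BETA0)
at_71 = predicted_weight(71.0, BETA1, BETA0)

print("predicted weight at 70 in:", at_70)
print("increase for one extra inch:", at_71 - at_70)   # equals BETA1
print("predicted weight at 0 in:", predicted_weight(0.0, BETA1, BETA0))  # equals BETA0
```

With these made-up values the baseline is negative, which no real person's weight could be. That is harmless as long as the line is only used over the range of adult heights, which is why the baseline need not be zero.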

Post #1668

oliyiyi posted on 2015-12-18 15:12:10

Users who are short on forum coins can visit the collection of reward-for-reply threads:
https://bbs.pinggu.org/thread-3990750-1-1.html

Post #1669

oliyiyi posted on 2015-12-18 15:13:03

As outlined above, Bayesian data analysis is based on meaningfully parameterized
descriptive models. Are there ever situations in which such models cannot be used or
are not wanted?
One situation in which it might appear that parameterized models are not used is with
so-called nonparametric models. But these models are confusingly named because they
actually do have parameters; in fact they have a potentially infinite number of parameters.
As a simple example, suppose we want to describe the weights of dogs. We measure the
weights of many different dogs sampled at random from the entire spectrum of dog
breeds. The weights are probably not distributed unimodally; instead, there are probably
subclusters of weights for different breeds of dogs. But some different breeds might
have nearly identical distributions of weights, and there are many dogs that cannot be
identified as a particular breed, and, as we gather data from more and more dogs, we
might encounter members of new subclusters that had not yet been included in the
previously collected data. Thus, it is not clear how many clusters we should include
in the descriptive model. Instead, we infer, from the data, the relative credibilities of
different clusterings. Because each cluster has its own parameters (such as location and
scale parameters), the number of parameters in the model is inferred, and can grow
to infinity with infinite data. There are many other kinds of infinitely parameterized
models. For a tutorial on Bayesian nonparametric models, see Gershman and Blei (2012);
for a recent review, see Müller and Mitra (2013); and for textbook applications, see
Gelman et al. (2013). We will not be considering Bayesian nonparametric models in
this book.
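
To see how the number of clusters can keep growing as data accumulate, here is a small simulation of the Chinese restaurant process, one common construction in Bayesian nonparametrics. The passage above does not name this construction, so treat it as a swapped-in illustration. The sketch only draws clusterings from a prior; it does not perform the inference over clusterings that the passage describes, and the `concentration` value is an arbitrary assumption.

```python
import random


def chinese_restaurant_process(n_points, concentration, rng):
    """Assign n_points sequentially to clusters; return the cluster sizes.

    Each new point joins an existing cluster with probability proportional
    to that cluster's current size, or opens a new cluster with probability
    proportional to `concentration`.
    """
    cluster_sizes = []
    for _ in range(n_points):
        # The last weight stands for "open a new cluster".
        weights = cluster_sizes + [concentration]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(cluster_sizes):
            cluster_sizes.append(1)      # a previously unseen cluster
        else:
            cluster_sizes[choice] += 1   # an existing cluster grows
    return cluster_sizes


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 100, 1000, 10000):
    sizes = chinese_restaurant_process(n, concentration=1.0, rng=rng)
    print(f"{n:>6} dogs -> {len(sizes)} clusters")
```

Each cluster would carry its own location and scale parameters, so a growing cluster count means a growing parameter count, which is the sense in which such models are "infinitely parameterized".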

Post #1670

oliyiyi posted on 2015-12-18 15:14:27

There are a variety of situations in which it might seem at first that no parameterized
model would apply, such as figuring out the probability that a person has some rare
disease if a diagnostic test for the disease is positive. But Bayesian analysis does apply even
here, although the parameters refer to discrete states instead of continuous distributions.
In the case of disease diagnosis, the parameter is the underlying health status of the
individual, and the parameter can have one of two values, either “has disease” or “does
not have disease.” Bayesian analysis re-allocates credibility over those two parameter
values based on the observed test result. This is exactly analogous to the discrete
possibilities considered by Sherlock Holmes in Figure 2.1, except that the test results
yield probabilistic information instead of perfectly conclusive information. We will do
exact Bayesian computations for this sort of situation in Chapter 5 (see specifically
Table 5.4).
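
The re-allocation over two discrete parameter values can be sketched in a few lines using Bayes' rule. The prevalence, hit rate, and false-alarm rate below are illustrative assumptions, not values taken from the quoted passage, and the function name is hypothetical.

```python
def posterior_disease_given_positive(prior, hit_rate, false_alarm_rate):
    """Posterior probability of "has disease" after one positive test.

    prior            -- credibility of "has disease" before the test
    hit_rate         -- P(test positive | has disease)
    false_alarm_rate -- P(test positive | does not have disease)
    """
    joint_disease = prior * hit_rate
    joint_healthy = (1.0 - prior) * false_alarm_rate
    # Normalize over the two possible parameter values.
    return joint_disease / (joint_disease + joint_healthy)


# Illustrative values only (assumptions for this sketch).
PRIOR = 0.001
HIT_RATE = 0.99
FALSE_ALARM_RATE = 0.05

posterior = posterior_disease_given_positive(PRIOR, HIT_RATE, FALSE_ALARM_RATE)
print(f"P(has disease | positive test) = {posterior:.4f}")
print(f"P(no disease  | positive test) = {1.0 - posterior:.4f}")
```

With a disease this rare, a positive result raises the credibility of "has disease" from 0.1% to only about 2% under these assumed rates, because false alarms among the many healthy people outnumber the true positives.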