OP: tulipsliu

[Frontier Topics] R Code Implementations of Frontier Research in Finance

11
tulipsliu (employment verified) posted on 2020-12-7 11:17:12
Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above at posterior draws of the parameters and averaging the results over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $s$ be denoted $\boldsymbol{\theta}^{(s)}$. Then we estimate the posterior probability, without conditioning on the parameters, as
$$
\frac{1}{S}\sum_{s=1}^{S}\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j,\boldsymbol{\theta}^{(s)}).
$$
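
Below is a minimal R sketch of this averaging step, assuming a hypothetical helper `prob_alpha_c(y_j, theta)` that returns $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \mid \boldsymbol{y}_j, \boldsymbol{\theta})$ for one posterior draw, and a list `theta_draws` of $S$ draws extracted from the MCMC output:

```r
# Average the conditional class-membership probabilities over S posterior draws.
# prob_alpha_c() and theta_draws are assumptions, not part of a specific package.
posterior_prob <- function(y_j, theta_draws, prob_alpha_c) {
  probs <- vapply(theta_draws,
                  function(theta) prob_alpha_c(y_j, theta),
                  numeric(1))
  mean(probs)  # (1/S) * sum over s of Pr(alpha_j = alpha_c | y_j, theta^(s))
}
```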

12
tulipsliu (employment verified) posted on 2020-12-7 13:02:14
$$
\mathrm{logit} [ \Pr(y_{ij} = 1 | \theta_j, \alpha_i, \beta_i) ] =
  \alpha_i (\theta_j - \beta_i)
$$
$$
(\log \alpha_i, \beta_i)' \sim \mathrm{MVN}\left((\mu_1, \mu_2)', \Sigma\right)
$$
$$
\theta_j \sim \mathrm{N}(0, 1)
$$
Variables:

* $i = 1 \ldots I$ indexes items
* $j = 1 \ldots J$ indexes persons
* $y_{ij} \in \{ 0,1 \}$ is the response of person $j$ to item $i$

Parameters:

* $\alpha_i$ is the discrimination for item $i$
* $\beta_i$ is the difficulty for item $i$
* $\theta_j$ is the ability for person $j$
* $\mu_1$ is the mean for $\log \alpha_i$
* $\mu_2$ is the mean for $\beta_i$
* $\Sigma$ is the covariance matrix for $\log \alpha_i$ and $\beta_i$

Priors:

* $\mu_1 \sim \mathrm{N}(0,1)$ is a weakly informative prior for the mean of the log discrimination parameters.
* $\mu_2 \sim \mathrm{N}(0,25)$ is a weakly informative prior for the mean of the difficulty parameters.
* Let $\tau_1^2 = \Sigma_{1,1}$ be the variance of the log discrimination parameters. Then $\tau_1 \sim \mathrm{Exp}(.1)$ is a weakly informative prior for the standard deviation.
* Let $\tau_2^2 = \Sigma_{2,2}$ be the variance of the difficulty parameters. Then $\tau_2 \sim \mathrm{Exp}(.1)$ is a weakly informative prior for the standard deviation.
* A weakly informative prior is placed on the covariance, $\Sigma_{1,2} = \Sigma_{2,1}$, shrinking the posterior towards zero. This is described in more detail in the next section.
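
As a concrete illustration, here is a minimal R sketch that simulates responses from this 2PL model; all variable names and the specific values chosen for $\mu_1$, $\mu_2$, and $\Sigma$ are assumptions for the example:

```r
library(MASS)  # mvrnorm() for the bivariate normal item parameters

set.seed(1)
I <- 20; J <- 500
mu    <- c(0, 0)                              # (mu_1, mu_2)
Sigma <- matrix(c(0.2, 0.05, 0.05, 1), 2, 2)  # cov of (log alpha, beta)
item  <- mvrnorm(I, mu, Sigma)
alpha <- exp(item[, 1])                       # discriminations
beta  <- item[, 2]                            # difficulties
theta <- rnorm(J)                             # abilities, N(0, 1)

# Pr(y_ij = 1) = inverse-logit of alpha_i * (theta_j - beta_i)
p <- plogis(sweep(outer(theta, beta, "-"), 2, alpha, "*"))
y <- matrix(rbinom(J * I, 1, p), J, I)        # J-by-I response matrix
```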

13
tulipsliu (employment verified) posted on 2020-12-7 13:06:31
$$
\Pr(Y_{ij} = y,~y > 0 | \theta_j, \beta_i, \kappa_s) =
\frac{\exp \sum_{s=1}^y (\theta_j - \beta_i - \kappa_s)}
     {1 + \sum_{k=1}^{m} \exp \sum_{s=1}^k (\theta_j - \beta_i - \kappa_s)}
$$

$$
\Pr(Y_{ij} = y,~y = 0 | \theta_j, \beta_i, \kappa_s) =
\frac{1}
     {1 + \sum_{k=1}^{m} \exp \sum_{s=1}^k (\theta_j - \beta_i - \kappa_s)}
$$

$$
\theta_j \sim \mathrm{N}(w_{j}' \lambda, \sigma^2)
$$

Variables:

* $i = 1 \ldots I$ indexes items.
* $j = 1 \ldots J$ indexes persons.
* $Y_{ij} \in \{ 0 \ldots m \}$ is the response of person $j$ to item $i$
* $m$ is simultaneously the maximum score and number of step difficulty parameters per item.
* $w_{j}$ is the vector of covariates for person $j$, the first element of which *must* equal one for a model intercept. $w_{j}$ may be assembled into a $J$-by-$K$ covariate matrix $W$, where $K$ is the number of elements in $w_j$.

Parameters:

* $\beta_i$ is the item-specific difficulty for item $i$.
* $\kappa_s$ is the $s$-th step difficulty, constant across items.
* $\theta_j$ is the ability for person $j$.
* $\lambda$ is a vector of latent regression parameters of length $K$.
* $\sigma^2$ is the variance for the ability distribution.
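
A minimal R sketch of the category probabilities above for a single person-item pair (function and argument names are assumptions):

```r
# Rating scale model category probabilities for y = 0..m.
# theta: ability; beta: item difficulty; kappa: vector of m step difficulties.
rsm_probs <- function(theta, beta, kappa) {
  num <- exp(cumsum(theta - beta - kappa))  # numerators for y = 1..m
  denom <- 1 + sum(num)                     # shared denominator
  c(1 / denom, num / denom)                 # Pr(Y = 0), Pr(Y = 1), ..., Pr(Y = m)
}

rsm_probs(theta = 0.5, beta = 0, kappa = c(-1, 0, 1))  # probabilities sum to 1
```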

14
tulipsliu (employment verified) posted on 2020-12-7 13:08:23
Consider the simplest multilevel model for students $i=1, ..., n$ nested within schools $j=1, ..., J$ and for whom we have examination scores as responses. We can write a two-level varying intercept model with no predictors using the usual two-stage formulation as

$$y_{ij} = \alpha_{j} + \epsilon_{ij}, \text{ where } \epsilon_{ij} \sim N(0, \sigma_y^2)$$ $$\alpha_j = \mu_{\alpha} + u_j, \text{ where } u_j \sim N(0, \sigma_\alpha^2)$$

where $y_{ij}$ is the examination score for the *i*th student in the *j*th school, $\alpha_{j}$ is the varying intercept for the *j*th school, and $\mu_{\alpha}$ is the overall mean across schools. Alternatively, the model can be expressed in reduced form as $$y_{ij} = \mu_\alpha + u_j + \epsilon_{ij}.$$ If we further assume that the student-level errors $\epsilon_{ij}$ are normally distributed with mean 0 and variance $\sigma_{y}^{2}$, and that the school-level varying intercepts $\alpha_{j}$ are normally distributed with mean $\mu_{\alpha}$ and variance $\sigma_{\alpha}^{2}$, then the model can be expressed as

$$y_{ij} \sim N(\alpha_{j}, \sigma_{y}^{2}),$$ $$\alpha_{j}\sim N(\mu_{\alpha}, \sigma_{\alpha}^{2}).$$

The varying intercept model^[Equivalently, the model can be expressed using a two-stage formulation as $$y_{ij} = \alpha_j + \beta x_{ij} +\epsilon_{ij},$$ $$\alpha_j = \mu_\alpha + u_j,$$ or in a reduced form as $$y_{ij} = \mu_\alpha + \beta x_{ij} + u_j + \epsilon_{ij}$$ where $\epsilon_{ij} \sim N(0, \sigma_{y}^{2})$ and $u_{j}\sim N(0, \sigma_{\alpha}^{2})$.] with an indicator variable for being female $x_{ij}$ can be written as
$$y_{ij} \sim N(\alpha_{j}+\beta x_{ij} , \sigma_{y}^{2}),$$ $$\alpha_{j}\sim N(\mu_{\alpha}, \sigma_{\alpha}^{2}).$$
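
For reference, a minimal `lme4` sketch of both versions, assuming a hypothetical data frame `dat` with columns `score`, `female`, and `school`:

```r
library(lme4)

# Varying intercept, no predictors
m0 <- lmer(score ~ 1 + (1 | school), data = dat)

# Varying intercept with the female indicator
m1 <- lmer(score ~ female + (1 | school), data = dat)
```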

15
tulipsliu (employment verified) posted on 2020-12-7 13:09:03
We now extend the varying intercept model with a single predictor to allow both the intercept and the slope to vary randomly across schools using the following model^[Equivalently, the model can be expressed in a two-stage formulation as $$y_{ij} = \alpha_j + \beta_j x_{ij} +\epsilon_{ij},$$ $$\alpha_j = \mu_\alpha + u_j,$$ $$\beta_j = \mu_\beta + v_j,$$ or in a reduced form as $$y_{ij} = \mu_\alpha + \mu_\beta x_{ij} + u_j + v_j x_{ij} + \epsilon_{ij}$$ where $\epsilon_{ij} \sim N(0, \sigma_{y}^{2})$ and $\left( \begin{matrix} u_j \\ v_j \end{matrix} \right) \sim N\left( \left( \begin{matrix} 0 \\ 0 \end{matrix} \right), \left( \begin{matrix} \sigma_\alpha^2 & \rho \sigma_\alpha \sigma_\beta \\ \rho \sigma_\alpha \sigma_\beta & \sigma_\beta^2 \end{matrix} \right) \right)$.]:

$$y_{ij}\sim N(\alpha_{j}+\beta_{j}x_{ij}, \sigma_y^2),$$ $$\left( \begin{matrix} \alpha_j \\ \beta_j \end{matrix} \right) \sim N\left( \left( \begin{matrix} \mu_\alpha \\ \mu_\beta \end{matrix} \right), \left( \begin{matrix} \sigma_\alpha^2 & \rho \sigma_\alpha \sigma_\beta \\ \rho \sigma_\alpha \sigma_\beta & \sigma_\beta^2 \end{matrix} \right) \right).$$

Note that now we have variation in the $\alpha_{j}$'s and the $\beta_{j}$'s, and also a correlation parameter $\rho$ between $\alpha_j$ and $\beta_j$. This model can be fit using `lmer()` as follows:
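
A minimal sketch of the call, assuming the same hypothetical data frame `dat` with columns `score`, `x`, and `school`:

```r
library(lme4)

# Varying intercept and varying slope for x, grouped by school;
# (1 + x | school) also estimates the correlation rho between them.
m2 <- lmer(score ~ x + (1 + x | school), data = dat)
```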

16
tulipsliu (employment verified) posted on 2020-12-7 13:11:52
$$
\begin{aligned}
\Sigma &=
\left(\begin{matrix}
\sigma_\alpha^2 & \rho\sigma_\alpha \sigma_\beta \\
\rho\sigma_\alpha\sigma_\beta&\sigma_\beta^2
\end{matrix} \right)\\ &=
\sigma_y^2\left(\begin{matrix}
\sigma_\alpha^2/\sigma_y^2 & \rho\sigma_\alpha \sigma_\beta/\sigma_y^2 \\
\rho\sigma_\alpha\sigma_\beta/\sigma_y^2 & \sigma_\beta^2/\sigma_y^2
\end{matrix} \right)\\ &=
\sigma_y^2\left(\begin{matrix}
\sigma_\alpha/\sigma_y & 0 \\
0&\sigma_\beta/\sigma_y
\end{matrix} \right)
\left(\begin{matrix}
1 & \rho\\
\rho&1
\end{matrix} \right)
\left(\begin{matrix}
\sigma_\alpha/\sigma_y & 0 \\
0&\sigma_\beta/\sigma_y
\end{matrix} \right)\\
&= \sigma_y^2VRV.
\end{aligned}
$$
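
A quick numeric check of this decomposition in R (the specific values are arbitrary):

```r
sigma_y <- 2; sigma_a <- 1.5; sigma_b <- 0.8; rho <- 0.3

Sigma <- matrix(c(sigma_a^2,               rho * sigma_a * sigma_b,
                  rho * sigma_a * sigma_b, sigma_b^2), 2, 2)
V <- diag(c(sigma_a, sigma_b) / sigma_y)  # scaled standard deviations
R <- matrix(c(1, rho, rho, 1), 2, 2)      # correlation matrix

all.equal(Sigma, sigma_y^2 * V %*% R %*% V)  # TRUE
```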

17
tulipsliu (employment verified) posted on 2020-12-7 13:12:15
$$
\left(\begin{matrix}
\sigma_\alpha^2/\sigma_y^2 \\
\sigma_\beta^2/\sigma_y^2
\end{matrix} \right) =
2\left(\frac{\sigma_\alpha^2/\sigma_y^2 + \sigma_\beta^2/\sigma_y^2}{2}\right)\left(\begin{matrix}
\frac{\sigma_\alpha^2/\sigma_y^2}{\sigma_\alpha^2/\sigma_y^2 + \sigma_\beta^2/\sigma_y^2} \\
\frac{\sigma_\beta^2/\sigma_y^2}{\sigma_\alpha^2/\sigma_y^2 + \sigma_\beta^2/\sigma_y^2}
\end{matrix} \right)=
J\tau^2 \pi.
$$  
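
The same values verify this identity numerically, with $J = 2$ varying effects, $\tau^2$ the average scaled variance, and $\pi$ a simplex of variance proportions:

```r
sigma_y <- 2; sigma_a <- 1.5; sigma_b <- 0.8  # same arbitrary values as above

a <- sigma_a^2 / sigma_y^2   # scaled variance of the intercepts
b <- sigma_b^2 / sigma_y^2   # scaled variance of the slopes

tau2 <- (a + b) / 2          # tau^2: average of the scaled variances
pi_s <- c(a, b) / (a + b)    # simplex pi: proportion of each variance

all.equal(c(a, b), 2 * tau2 * pi_s)  # TRUE (J = 2)
```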

18
tulipsliu (employment verified) posted on 2020-12-7 13:14:16

The PCM [@masters1982rasch] is appropriate for item response data that features more than two *ordered* response categories for some or all items. The items may have differing numbers of response categories. For dichotomous items (items with exactly two response categories), the partial credit model is equivalent to the Rasch model. The version presented includes a latent regression. However, the latent regression part of the model may be restricted to an intercept only, resulting in the standard partial credit model.

$$
\Pr(Y_{ij} = y,~y > 0 | \theta_j, \beta_i) =
\frac{\exp \sum_{s=1}^y (\theta_j - \beta_{is})}
     {1 + \sum_{k=1}^{m_i} \exp \sum_{s=1}^k (\theta_j - \beta_{is})}
$$
$$
\Pr(Y_{ij} = y,~y = 0 | \theta_j, \beta_i) =
\frac{1}
     {1 + \sum_{k=1}^{m_i} \exp \sum_{s=1}^k (\theta_j - \beta_{is})}
$$
$$
\theta_j \sim \mathrm{N}(w_{j}' \lambda, \sigma^2)
$$
Variables:

* $i = 1 \ldots I$ indexes items.
* $j = 1 \ldots J$ indexes persons.
* $Y_{ij} \in \{ 0 \ldots m_i \}$ is the response of person $j$ to item $i$
* $m_i$ is simultaneously the maximum score and number of step difficulty parameters for item $i$.
* $w_{j}$ is the vector of covariates for person $j$, the first element of which *must* equal one for a model intercept. $w_{j}$ may be assembled into a $J$-by-$K$ covariate matrix $W$, where $K$ is the number of elements in $w_j$.

Parameters:

* $\beta_{is}$ is the $s$-th step difficulty for item $i$.
* $\theta_j$ is the ability for person $j$.
* $\lambda$ is a vector of latent regression parameters of length $K$.
* $\sigma^2$ is the variance for the ability distribution.

Priors:

* $\sigma \sim \mathrm{Exp}(.1)$ is weakly informative for the person standard deviation.
* $\beta_{is} \sim \mathrm{N}(0, 9)$ is also weakly informative.
* $\lambda \sim t_3(0, 1)$, where $t_3$ is the Student's $t$ distribution with three degrees of freedom, *and* the covariates have been transformed as follows: (1) continuous covariates are mean-centered and then divided by two times their standard deviations, (2) binary covariates are mean-centered and divided by the difference between their maximum and minimum values, and (3) no change is made to the constant, set to one, for the model intercept. This approach to setting priors is similar to one that has been suggested for logistic regression [@gelman2008weakly]. It is possible to adjust the coefficients back to the scales of the original covariates. A sketch of this rescaling follows the list.
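
A minimal R sketch of the covariate rescaling described above (the function name and the binary-detection rule are assumptions):

```r
# Rescale the columns of a covariate matrix W whose first column is the
# constant (all ones) for the model intercept.
rescale_covariates <- function(W) {
  for (k in seq_len(ncol(W))[-1]) {  # skip the constant in column 1
    x <- W[, k]
    if (all(x %in% c(0, 1))) {
      # Binary: mean-center, divide by max minus min
      W[, k] <- (x - mean(x)) / (max(x) - min(x))
    } else {
      # Continuous: mean-center, divide by two standard deviations
      W[, k] <- (x - mean(x)) / (2 * sd(x))
    }
  }
  W
}
```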

19
tulipsliu (employment verified) posted on 2020-12-7 13:15:37
The GPCM [@muraki1992generalized] extends the PCM by including a discrimination term. For dichotomous items (items with exactly two response categories), the generalized partial credit model is equivalent to the two-parameter logistic model. The version presented includes a latent regression. However, the latent regression may be restricted to a model intercept, resulting in the standard generalized partial credit model.

$$
\Pr(Y_{ij} = y,~y > 0 | \theta_j, \alpha_i, \beta_i) =
\frac{\exp \sum_{s=1}^y (\alpha_i \theta_j - \beta_{is})}
     {1 + \sum_{k=1}^{m_i} \exp \sum_{s=1}^k
       (\alpha_i \theta_j - \beta_{is})}
$$
$$
\Pr(Y_{ij} = y,~y = 0 | \theta_j, \alpha_i, \beta_i) =
\frac{1}
     {1 + \sum_{k=1}^{m_i} \exp \sum_{s=1}^k
       (\alpha_i \theta_j - \beta_{is})}
$$

$$
\theta_j \sim \mathrm{N}(w_{j}' \lambda, 1)
$$

Many aspects of the GPCM are similar to the PCM described earlier. Parameters $\beta_i$, $\theta_j$, and $\lambda$ have the same interpretation, but the GPCM adds a discrimination parameter $\alpha_i$ and constrains the variance of $\theta_j$ to one. The prior $\alpha_i \sim \mathrm{log~N}(1, 1)$ is added, which is weakly informative but assumes positive discriminations. The same priors are placed on $\beta_i$ and $\lambda$, and the same constraint is placed on $\beta_I$.
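
The earlier rating scale sketch generalizes directly; a minimal R version of the GPCM category probabilities (names are assumptions):

```r
# GPCM category probabilities for y = 0..m_i for one person-item pair.
# theta: ability; alpha: discrimination; beta: vector of m_i step difficulties.
gpcm_probs <- function(theta, alpha, beta) {
  num <- exp(cumsum(alpha * theta - beta))  # numerators for y = 1..m_i
  denom <- 1 + sum(num)
  c(1 / denom, num / denom)                 # Pr(Y = 0), ..., Pr(Y = m_i)
}

gpcm_probs(theta = 0.5, alpha = 1.2, beta = c(-0.5, 0.3, 1.1))
```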

20
tulipsliu (employment verified) posted on 2020-12-7 18:29:52
Testing again:


$$
\frac{\partial V_{ij}}{\partial p_i} = -q + 2qp_j > 0 \text{ for } p_j > 0.5
\quad \text{and} \quad
\frac{\partial Y^*_{ij}}{\partial p_i} = r + q - 2qp_j > 0
$$
