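Abstract (translated):
This paper considers generalized linear models in the presence of many controls. We lay out a general methodology for estimating an effect of interest, based on constructing an instrument that immunizes against model selection mistakes, and apply it to the logistic binary choice model. More specifically, we propose new methods for estimating, and constructing confidence regions for, a regression parameter of primary interest $\alpha_0$: the coefficient on the regressor of interest, such as a treatment variable or a policy variable. Under sparsity assumptions, these methods allow $\alpha_0$ to be estimated at the root-$n$ rate even when the total number $p$ of other regressors, called controls, potentially exceeds the sample size $n$. The sparsity assumption means that there is a subset of $s<n$ controls that suffices to accurately approximate the nuisance part of the regression function. Importantly, the estimators and the resulting confidence regions are valid uniformly over $s$-sparse models satisfying $s^2\log^2 p=o(n)$ and other technical conditions. The validity of these procedures does not rely on traditional consistent model selection arguments; in fact, they are robust to moderate model selection mistakes in variable selection. Under suitable conditions, the estimators are semi-parametrically efficient, in the sense of attaining the semi-parametric efficiency bounds for the class of models considered in the paper.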
---
Title:
Post-Selection Inference for Generalized Linear Models with Many Controls
---
Authors:
Alexandre Belloni and Victor Chernozhukov and Ying Wei
---
Latest submission year:
2016
---
Classification:
Category: Statistics
Subcategory: Methodology
Description: Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
--
Category: Economics
Subcategory: Econometrics
Description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--
Category: Mathematics
Subcategory: Statistics Theory
Description: Applied, computational and theoretical statistics: e.g. statistical inference, regression, time series, multivariate analysis, data analysis, Markov chain Monte Carlo, design of experiments, case studies
--
Category: Statistics
Subcategory: Statistics Theory
Description: stat.TH is an alias for math.ST. Asymptotics, Bayesian Inference, Decision Theory, Estimation, Foundations, Inference, Testing.
--
---
English abstract:
This paper considers generalized linear models in the presence of many controls. We lay out a general methodology to estimate an effect of interest based on the construction of an instrument that immunizes against model selection mistakes and apply it to the case of the logistic binary choice model. More specifically, we propose new methods for estimating and constructing confidence regions for a regression parameter of primary interest $\alpha_0$, a parameter in front of the regressor of interest, such as the treatment variable or a policy variable. Using sparsity assumptions, these methods allow estimation of $\alpha_0$ at the root-$n$ rate when the total number $p$ of other regressors, called controls, potentially exceeds the sample size $n$. The sparsity assumption means that there is a subset of $s<n$ controls which suffices to accurately approximate the nuisance part of the regression function. Importantly, the estimators and the resulting confidence regions are valid uniformly over $s$-sparse models satisfying $s^2\log^2 p = o(n)$ and other technical conditions. These procedures do not rely on traditional consistent model selection arguments for their validity. In fact, they are robust with respect to moderate model selection mistakes in variable selection. Under suitable conditions, the estimators are semi-parametrically efficient in the sense of attaining the semi-parametric efficiency bounds for the class of models in this paper.
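
The growth condition $s^2\log^2 p = o(n)$ can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the helper name `sparsity_ratio` and the chosen values of $s$, $p$ and $n$ are assumptions. It evaluates $s^2\log^2 p / n$ along a sequence in which the number of controls is always twice the sample size ($p = 2n$) while the sparsity level stays fixed at $s = 5$. Because the condition is asymptotic, a finite ratio is at most a rough heuristic, not a test of whether the theory applies to a given dataset.

```python
import math


def sparsity_ratio(s: int, p: int, n: int) -> float:
    """Return s^2 * log(p)^2 / n (hypothetical helper, natural logarithm).

    The abstract requires this quantity to vanish as n grows; the paper
    does not prescribe any finite-sample threshold for it.
    """
    if s < 1 or p < 2 or n < 1:
        raise ValueError("need s >= 1, p >= 2 and n >= 1")
    return (s ** 2) * (math.log(p) ** 2) / n


# Assumed illustrative sequence: p = 2n (more controls than observations),
# with a fixed sparsity level s = 5.
sample_sizes = [10**3, 10**4, 10**5, 10**6]
ratios = [sparsity_ratio(5, 2 * n, n) for n in sample_sizes]

for n, r in zip(sample_sizes, ratios):
    print(f"n={n:>8}  p={2 * n:>8}  s=5  s^2 log^2(p) / n = {r:.5f}")
```

Along this sequence the ratio shrinks even though $p > n$ throughout, because $\log^2 p$ grows far more slowly than $n$. The condition constrains the sparsity level $s$ much more tightly than it constrains the number of controls $p$.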
---
PDF link:
https://arxiv.org/pdf/1304.3969