Topic:
Filename: Model_selection_consistency_from_the_perspective_of_generalization_ability_and_V.pdf
Download link: https://bbs.pinggu.org/a-3689701.html
Attachment size:
English title:
《Model selection consistency from the perspective of generalization ability and VC theory with an application to Lasso》

Authors: Ning Xu, Jian Hong, Timothy C.G. Fisher

Latest submission year: 2016

Abstract:
Model selection is difficult to analyse yet theoretically and empirically important, especially for high-dimensional data analysis. Recently, the least absolute shrinkage and selection operator (Lasso) has been applied in the statistical and econometric literature. Consistency of Lasso has been established under various conditions, some of which are difficult to verify in practice. In this paper, we study model selection from the perspective of generalization ability, under the framework of structural risk minimization (SRM) and Vapnik-Chervonenkis (VC) theory. The approach emphasizes the balance between the in-sample and out-of-sample fit, which can be achieved by using cross-validation to select a penalty on model complexity. We show that an exact relationship exists between the generalization ability of a model and model selection consistency. By implementing SRM and the VC inequality, we show that Lasso is L2-consistent for model selection under assumptions similar to those imposed on OLS. Furthermore, we derive a probabilistic bound for the distance between the penalized extremum estimator and the extremum estimator without penalty, which is dominated by overfitting. We also propose a new measurement of overfitting, GR2, based on generalization ability, that converges to zero if model selection is consistent. Using simulations, we demonstrate that the proposed CV-Lasso algorithm performs well in terms of model selection and overfitting control.

Classification:
- Primary: Statistics; secondary: Machine Learning. Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high-dimensional inference, etc.) with a statistical or theoretical grounding.
- Primary: Quantitative Finance; secondary: Economics. q-fin.EC is an alias for econ.GN. Economics, including micro and macro economics, international economics, theory of the firm, labor economics, and other economic topics outside finance.
- Primary: Statistics; secondary: Computation. Algorithms, simulation, visualization.
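The CV-Lasso idea described in the abstract selects the Lasso penalty by cross-validation, balancing in-sample and out-of-sample fit, and then reads the selected model off the nonzero coefficients. Below is a minimal sketch of that general idea on simulated sparse data; the data-generating process, dimensions, and the use of scikit-learn's LassoCV are illustrative assumptions, not the paper's exact algorithm (and the sketch does not compute the paper's GR2 measure).

```python
# Minimal illustration of cross-validated Lasso for model selection
# (an assumption-laden sketch, not the authors' CV-Lasso implementation).
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 200, 50                             # sample size and number of candidate regressors
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]     # sparse "true" model: 5 active variables

X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)      # linear model with Gaussian noise

# Cross-validation picks the penalty that balances in-sample and out-of-sample fit.
model = LassoCV(cv=5, random_state=0).fit(X, y)

selected = np.flatnonzero(model.coef_ != 0)
print("chosen penalty (alpha):", model.alpha_)
print("selected variables:", selected)
print("true active set:", np.flatnonzero(beta))
```

With enough observations relative to the number of active variables, the cross-validated selection tends to recover the true active set; checking how often that happens as the sample grows is the kind of model selection consistency the paper studies.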