Thread starter: 能者818

[Statistics] Compressed Regression


OP
能者818 (employment verified) · posted 2022-3-3 20:43:00 from mobile

---
English title:
《Compressed Regression》
---
Authors:
Shuheng Zhou, John Lafferty, Larry Wasserman
---
Latest submission year:
2008
---
Classification:

Primary category: Statistics
Secondary category: Machine Learning
Category description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding.
--
Primary category: Computer Science
Secondary category: Information Theory
Category description: Covers theoretical and experimental aspects of information theory and coding. Includes material in ACM Subject Class E.4 and intersects with H.1.1.
--
Primary category: Mathematics
Secondary category: Information Theory
Category description: math.IT is an alias for cs.IT. Covers theoretical and experimental aspects of information theory and coding.
--

---
English abstract:
  Recent research has studied the role of sparsity in high dimensional regression and signal reconstruction, establishing theoretical limits for recovering sparse models from sparse data. This line of work shows that $\ell_1$-regularized least squares regression can accurately estimate a sparse linear model from $n$ noisy examples in $p$ dimensions, even if $p$ is much larger than $n$. In this paper we study a variant of this problem where the original $n$ input variables are compressed by a random linear transformation to $m \ll n$ examples in $p$ dimensions, and establish conditions under which a sparse linear model can be successfully recovered from the compressed data. A primary motivation for this compression procedure is to anonymize the data and preserve privacy by revealing little information about the original data. We characterize the number of random projections that are required for $\ell_1$-regularized compressed regression to identify the nonzero coefficients in the true model with probability approaching one, a property called ``sparsistence.'' In addition, we show that $\ell_1$-regularized compressed regression asymptotically predicts as well as an oracle linear model, a property called ``persistence.'' Finally, we characterize the privacy properties of the compression procedure in information-theoretic terms, establishing upper bounds on the mutual information between the compressed and uncompressed data that decay to zero.
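
For concreteness, the setup the abstract describes can be sketched in a few lines of Python. This is a minimal illustration using NumPy and scikit-learn, not code from the paper; the dimensions n, p, m, the sparsity level s, the noise scale, and the Lasso penalty alpha are all illustrative assumptions.

```python
# Minimal sketch of compressed regression: fit an l1-regularized
# least squares model on randomly projected (compressed) data.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

n, p, s = 200, 500, 5   # n examples, p dimensions, s nonzero coefficients
m = 80                  # m << n compressed examples (illustrative choice)

# Sparse linear model: y = X beta + noise, with only s nonzero coefficients.
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 2.0
y = X @ beta + 0.1 * rng.standard_normal(n)

# Random linear compression: a Gaussian m-by-n matrix Phi maps the n rows
# of (X, y) down to m rows, so only Phi X and Phi y are ever released.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
X_c, y_c = Phi @ X, Phi @ y

# l1-regularized least squares regression on the compressed data.
lasso = Lasso(alpha=0.05).fit(X_c, y_c)
support = np.flatnonzero(lasso.coef_)
print("recovered nonzero coefficients:", support)  # ideally {0, ..., s-1}
```

With m comfortably large, the recovered support typically matches the true one; shrinking m makes recovery fail. How many random projections m are required for the support to be identified with probability approaching one is exactly the "sparsistence" question the paper analyzes.
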
---
PDF link:
https://arxiv.org/pdf/0706.0534
Keywords: Experimental establishing coefficients Successfully Construction terminology regularized linear regularization research
