Original poster: 何人来此

[Computer Science] A Minimum Description Length Approach to Multitask Feature Selection

Posted by 何人来此 on 2022-3-7 10:28:00

Title:
A Minimum Description Length Approach to Multitask Feature Selection
---
Author:
Brian Tomasik
---
Latest submission year:
2009
---
Categories:

Top-level category: Computer Science
Second-level category: Machine Learning
Category description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Top-level category: Computer Science
Second-level category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
Abstract:
Many regression problems involve not one but several response variables (y's). Often the responses are suspected to share a common underlying structure, in which case it may be advantageous to share information across them; this is known as multitask learning. As a special case, we can use multiple responses to better identify shared predictive features -- a project we might call multitask feature selection.

This thesis is organized as follows. Section 1 introduces feature selection for regression, focusing on $\ell_0$ regularization methods and their interpretation within a Minimum Description Length (MDL) framework. Section 2 proposes a novel extension of MDL feature selection to the multitask setting. The approach, called the "Multiple Inclusion Criterion" (MIC), is designed to borrow information across regression tasks by more easily selecting features that are associated with multiple responses. We show in experiments on synthetic and real biological data sets that MIC can reduce prediction error in settings where features are at least partially shared across responses. Section 3 surveys hypothesis testing by regression with a single response, focusing on the parallel between the standard Bonferroni correction and an MDL approach. Mirroring the ideas in Section 2, Section 4 proposes a novel MIC approach to hypothesis testing with multiple responses and shows that on synthetic data with significant sharing of features across responses, MIC sometimes outperforms standard FDR-controlling methods in terms of finding true positives for a given level of false positives. Section 5 concludes.
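
For readers unfamiliar with the baselines the abstract mentions, here is a minimal sketch of the standard Bonferroni correction and of the Benjamini-Hochberg procedure, one common FDR-controlling method. This is not the thesis's MIC method, and the abstract does not say which FDR procedure was used; the function names and the p-values below are made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m (m = number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Sort the p-values, find the largest rank k with p_(k) <= (k / m) * alpha,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, one per candidate feature (illustrative only).
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print("Bonferroni rejections:", sum(bonferroni(p)))
    print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(p)))
```

On these made-up p-values Bonferroni rejects only the smallest one, while Benjamini-Hochberg also rejects the second, which illustrates why FDR control is usually the less conservative of the two.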
---
PDF link:
https://arxiv.org/pdf/0906.0052
Keywords: feature selection, Intelligence, Presentation, Applications, information, description, approach, feature, data
