
[Statistical Data] Clustering by soft-constraint affinity propagation: applications to gene-expression data

Posted by mingdashike22 on 2022-3-7 10:17:50 (from mobile)

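Abstract (translated):
Motivation: Clustering based on similarity measures is a key problem throughout scientific data analysis. Recently, Frey and Dueck [Frey07] proposed affinity propagation (AP), a powerful new algorithm based on message-passing techniques. In AP, each cluster is identified by a common exemplar to which all other data points of the same cluster refer, and an exemplar must refer to itself. Although it has proven powerful, AP in its current form has a number of drawbacks. The hard constraint of exactly one exemplar per cluster restricts AP to regularly shaped clusters and leads to suboptimal performance, for example when analyzing gene-expression data.
Results: Relaxing AP's hard constraints overcomes this limitation. A new parameter controls the importance of the constraints relative to the goal of maximizing overall similarity, and it interpolates between the original AP and the simple case in which each data point selects its nearest neighbor as its exemplar. The resulting soft-constraint affinity propagation (SCAP) is more informative and more accurate, and it yields more stable clusterings. Although a new a priori free parameter is introduced, the algorithm's overall dependence on external tuning is reduced, because robustness increases and an optimal parameter-selection strategy emerges more naturally. SCAP is tested on biological benchmark data, in particular microarray data related to various cancer types. The results show that the algorithm efficiently uncovers the hierarchical cluster structure present in the data sets. It also allows sparse gene-expression signatures to be extracted for each cluster.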
---
English title:
Clustering by soft-constraint affinity propagation: Applications to gene-expression data
---
Authors:
Michele Leone, Sumedha, Martin Weigt
---
Year of latest submission:
2007
---
Categories:

Top-level category: Quantitative Biology
Subcategory: Quantitative Methods
Description: All experimental, numerical, statistical and mathematical contributions of value to biology
--
Top-level category: Physics
Subcategory: Statistical Mechanics
Description: Phase transitions, thermodynamics, field theory, non-equilibrium phenomena, renormalization group and scaling, integrable models, turbulence
--
Top-level category: Physics
Subcategory: Data Analysis, Statistics and Probability
Description: Methods, software and hardware for physics data analysis: data processing and storage; measurement methodology; statistical and mathematical aspects such as parametrization and uncertainties.
--

---
English abstract:
  Motivation: Similarity-measure based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called Affinity Propagation (AP) based on message-passing techniques was proposed by Frey and Dueck [Frey07]. In AP, each cluster is identified by a common exemplar all other data points of the same cluster refer to, and exemplars have to refer to themselves. Albeit its proved power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters, and leads to suboptimal performance, e.g., in analyzing gene expression data.
  Results: This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints compared to the aim of maximizing the overall similarity, and allows to interpolate between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) becomes more informative, accurate and leads to more stable clustering. Even though a new a priori free-parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Further on, it allows to extract sparse gene expression signatures for each cluster.
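To make the interpolation described in the abstract concrete, here is a small Python sketch. It is not the authors' implementation: the abstract gives no formula, so the additive penalty form, the function names (`violated_constraints`, `objective`, `best_assignment`), the toy data, and the self-similarity value of -5.0 are all assumptions made for illustration. It also replaces message passing with a brute-force search over all assignments, which is only feasible for a handful of points.

```python
import itertools
import numpy as np

def violated_constraints(assignment):
    """Count points that are used as an exemplar by some point
    but do not refer to themselves (assignment[c] != c)."""
    return sum(1 for c in set(assignment) if assignment[c] != c)

def objective(assignment, S, penalty):
    """Total similarity of each point to its chosen exemplar, minus a
    penalty per violated self-reference constraint (assumed form)."""
    total_similarity = sum(S[i, c] for i, c in enumerate(assignment))
    return total_similarity - penalty * violated_constraints(assignment)

def best_assignment(S, penalty):
    """Brute-force search over all n**n exemplar assignments.
    Illustration only; this stands in for message passing."""
    n = S.shape[0]
    candidates = itertools.product(range(n), repeat=n)
    return max(candidates, key=lambda a: objective(a, S, penalty))

# Toy 1-D data: two well-separated groups (hypothetical values).
x = np.array([0.0, 1.0, 1.5, 10.0, 11.0])
S = -(x[:, None] - x[None, :]) ** 2   # similarity = negative squared distance
np.fill_diagonal(S, -5.0)             # assumed self-similarity (cost of being an exemplar)

nearest = best_assignment(S, penalty=0.0)  # constraint ignored
hard = best_assignment(S, penalty=1e6)     # constraint effectively hard, as in AP

print("penalty = 0  :", nearest, "violations:", violated_constraints(nearest))
print("penalty = 1e6:", hard, "violations:", violated_constraints(hard))
```

With the penalty at zero, every point simply picks its most similar neighbor, so exemplars need not refer to themselves. With a very large penalty, every chosen exemplar refers to itself, which is the hard constraint the abstract attributes to AP. Intermediate penalties trade the two off.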
---
PDF link:
https://arxiv.org/pdf/705.2646
Keywords: gene expression, soft constraint, Mathematical, Quantitative, Contribution, clustering, expression, new, tuning, data
