Original poster: 何人来此

[Computer Science] Using Variational Inference and MapReduce to Scale Topic Modeling

Posted on 2022-4-12 08:15:00 from a mobile device

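Abstract (translated from the Chinese version):
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for studying document collections. As large datasets become increasingly common, the scalability of LDA inference needs to improve. This paper proposes MapReduce LDA (Mr. LDA), a method for handling very large corpora within the MapReduce framework. Whereas other techniques for scaling LDA inference use Gibbs sampling, we use variational inference. Our solution distributes computation efficiently and is relatively simple to implement. More importantly, unlike highly tuned and specialized implementations, this variational implementation is easy to extend. We demonstrate two extensions of the model built on this scalable framework: informed priors that guide topic discovery, and modeling topics from a multilingual corpus.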
---
English title:
Using Variational Inference and MapReduce to Scale Topic Modeling
---
Authors:
Ke Zhai, Jordan Boyd-Graber, and Nima Asadi
---
Year of latest submission:
2011
---
Categories:

Top-level category: Computer Science
Subcategory: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Top-level category: Computer Science
Subcategory: Distributed, Parallel, and Cluster Computing
Category description: Covers fault-tolerance, distributed algorithms, stability, parallel computation, and cluster computing. Roughly includes material in ACM Subject Classes C.1.2, C.1.4, C.2.4, D.1.3, D.4.5, D.4.7, E.1.

---
Abstract (original English):
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for exploring document collections. Because of the increasing prevalence of large datasets, there is a need to improve the scalability of inference for LDA. In this paper, we propose a technique called MapReduce LDA (Mr. LDA) to accommodate very large corpus collections in the MapReduce framework. In contrast to other techniques to scale inference for LDA, which use Gibbs sampling, we use variational inference. Our solution efficiently distributes computation and is relatively simple to implement. More importantly, this variational implementation, unlike highly tuned and specialized implementations, is easily extensible. We demonstrate two extensions of the model possible with this scalable framework: informed priors to guide topic discovery and modeling topics from a multilingual corpus.
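
To make the idea in the abstract concrete, here is a minimal sketch of how variational inference for LDA might be split into a map step and a reduce step. It is not the authors' Mr. LDA implementation: it runs in memory rather than on a MapReduce cluster, keeps the Dirichlet parameter `alpha` fixed, and uses a simple smoothed normalization in the reducer. The function names (`map_document`, `reduce_statistics`, `train`) and the parameters `eta` and `seed` are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import defaultdict


def digamma(x):
    """Digamma for x > 0: recurrence up to x >= 6, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 * inv - series


def map_document(doc, beta, alpha, inner_iters=20):
    """Map step: per-document variational updates (assumed simplification).

    doc  -- dict mapping word -> count
    beta -- list of dicts, beta[k][w] = current probability of word w in topic k
    Emits ((topic, word), expected count) pairs as sufficient statistics.
    """
    num_topics = len(beta)
    total = sum(doc.values())
    gamma = [alpha + total / num_topics] * num_topics
    phi = {}
    for _ in range(inner_iters):
        new_gamma = [alpha] * num_topics
        for word, count in doc.items():
            weights = [beta[k][word] * math.exp(digamma(gamma[k]))
                       for k in range(num_topics)]
            norm = sum(weights)
            phi[word] = [w / norm for w in weights]
            for k in range(num_topics):
                new_gamma[k] += count * phi[word][k]
        gamma = new_gamma
    return [((k, word), count * phi[word][k])
            for word, count in doc.items() for k in range(num_topics)]


def reduce_statistics(pairs, num_topics, vocab, eta=0.01):
    """Reduce step: sum statistics per (topic, word), smooth, and normalize."""
    sums = defaultdict(float)
    for key, value in pairs:
        sums[key] += value
    beta = []
    for k in range(num_topics):
        row = {w: sums[(k, w)] + eta for w in vocab}
        norm = sum(row.values())
        beta.append({w: v / norm for w, v in row.items()})
    return beta


def train(docs, num_topics, iterations=10, alpha=0.1, seed=0):
    """Driver: alternate map and reduce passes over the whole corpus."""
    vocab = sorted({w for doc in docs for w in doc})
    rng = random.Random(seed)
    beta = []
    for _ in range(num_topics):
        # Random start so that topics are not identical at initialization.
        row = {w: rng.random() + 0.5 for w in vocab}
        norm = sum(row.values())
        beta.append({w: v / norm for w, v in row.items()})
    for _ in range(iterations):
        pairs = [p for doc in docs for p in map_document(doc, beta, alpha)]
        beta = reduce_statistics(pairs, num_topics, vocab)
    return beta


if __name__ == "__main__":
    corpus = [{"apple": 3, "banana": 2}, {"cpu": 4, "gpu": 1}, {"apple": 1, "cpu": 1}]
    for k, topic in enumerate(train(corpus, num_topics=2)):
        print(k, sorted(topic.items(), key=lambda item: -item[1]))
```

Each call to `map_document` depends only on one document and the current topics, which is what makes the per-document work easy to distribute; the reducer only needs the emitted pairs grouped by key.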
---
PDF link:
https://arxiv.org/pdf/1107.3765

Keywords: MapReduce, Dirichlet, topic
