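Abstract (translation):
The problem of learning discrete Bayesian networks from data is encoded as a weighted MAX-SAT problem and solved with the MaxWalkSat local search algorithm. For each dataset, the per-variable summands of the (BDeu) marginal likelihood for the different choices of parents (the "family scores") are computed before MaxWalkSat is applied. Each permissible choice of parents for each variable is encoded as a distinct propositional atom, and the associated family score is encoded as a "soft" weighted single-literal clause. Two ways of enforcing acyclicity are considered: encoding the ancestor relation, or attaching a total order to each graph and encoding that order. The latter approach gives better results. Learning experiments were run on 21 synthetic datasets sampled from 7 BNs. The largest dataset has 10,000 data points and 60 variables and yields (for the "ancestor" encoding) a weighted CNF input file with 19,932 atoms and 269,367 clauses. For most datasets, MaxWalkSat quickly finds BNs whose BDeu score is higher than that of the "true" BN. The effect of adding prior information is assessed. It is further shown that Bayesian model averaging can be carried out by collecting the BNs generated during the search.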
---
English title:
Bayesian network learning by compiling to weighted MAX-SAT
---
Author:
James Cussens
---
Year of latest submission:
2012
---
Classification:
Primary category: Computer Science
Subcategory: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
English abstract:
The problem of learning discrete Bayesian networks from data is encoded as a weighted MAX-SAT problem and the MaxWalkSat local search algorithm is used to address it. For each dataset, the per-variable summands of the (BDeu) marginal likelihood for different choices of parents ('family scores') are computed prior to applying MaxWalkSat. Each permissible choice of parents for each variable is encoded as a distinct propositional atom and the associated family score encoded as a 'soft' weighted single-literal clause. Two approaches to enforcing acyclicity are considered: either by encoding the ancestor relation or by attaching a total order to each graph and encoding that. The latter approach gives better results. Learning experiments have been conducted on 21 synthetic datasets sampled from 7 BNs. The largest dataset has 10,000 datapoints and 60 variables producing (for the 'ancestor' encoding) a weighted CNF input file with 19,932 atoms and 269,367 clauses. For most datasets, MaxWalkSat quickly finds BNs with higher BDeu score than the 'true' BN. The effect of adding prior information is assessed. It is further shown that Bayesian model averaging can be effected by collecting BNs generated during the search.
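
The "total order" encoding described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's implementation. It assumes three variables with made-up family scores, one order atom per pair of variables, and a soft unit clause (not family) whose cost is the negated score, so that minimising violated weight maximises the total score. The helper names (`build_encoding`, `to_wcnf`, `brute_force`, `decode`) are hypothetical, the file layout is the classic DIMACS-style WCNF (the abstract does not state which input format was used), and exhaustive enumeration stands in for MaxWalkSat because the instance is tiny.

```python
from itertools import combinations, permutations, product

# Hypothetical log family scores (illustrative values, NOT taken from the paper).
# FAMILY_SCORES[child][parent_set] = log BDeu family score.
FAMILY_SCORES = {
    "A": {frozenset(): -10.0, frozenset("B"): -9.0,
          frozenset("C"): -9.5, frozenset("BC"): -9.2},
    "B": {frozenset(): -12.0, frozenset("A"): -8.0,
          frozenset("C"): -11.0, frozenset("AC"): -8.5},
    "C": {frozenset(): -11.0, frozenset("A"): -10.5,
          frozenset("B"): -7.0, frozenset("AB"): -7.5},
}


def build_encoding(scores):
    """Return (atom_ids, hard_clauses, soft_clauses) for the total-order encoding."""
    variables = sorted(scores)
    atom_ids = {}

    def atom(key):
        # Assign consecutive positive integers to atoms on first use.
        return atom_ids.setdefault(key, len(atom_ids) + 1)

    def before(u, v):
        # Literal "u precedes v": one atom per unordered pair, negated for the
        # reverse direction, so totality and antisymmetry hold by construction.
        a, b = sorted((u, v))
        lit = atom(("ord", a, b))
        return lit if (u, v) == (a, b) else -lit

    hard, soft = [], []
    for child in variables:
        fams = [atom(("fam", child, ps)) for ps in scores[child]]
        hard.append(fams)                                        # at least one parent set
        hard.extend([-x, -y] for x, y in combinations(fams, 2))  # at most one
        for ps, score in scores[child].items():
            f = atom(("fam", child, ps))
            soft.append((-score, [-f]))     # choosing this family costs -score
            for p in ps:
                hard.append([-f, before(p, child)])  # parents precede the child
    for i, j, k in permutations(variables, 3):
        hard.append([-before(i, j), -before(j, k), before(i, k)])  # transitivity
    return atom_ids, hard, soft


def to_wcnf(n_atoms, hard, soft, scale=1000):
    """Serialise as classic WCNF text; weights are scaled to integers."""
    weights = [round(w * scale) for w, _ in soft]
    top = sum(weights) + 1  # weight that marks a clause as hard
    lines = [f"p wcnf {n_atoms} {len(hard) + len(soft)} {top}"]
    lines += [f"{top} {' '.join(map(str, c))} 0" for c in hard]
    lines += [f"{w} {' '.join(map(str, c))} 0" for w, (_, c) in zip(weights, soft)]
    return "\n".join(lines)


def brute_force(n_atoms, hard, soft):
    """Exhaustive stand-in for MaxWalkSat: minimise the cost of violated soft clauses."""
    best_cost, best_model = None, None
    for bits in product([False, True], repeat=n_atoms):
        holds = lambda lit: bits[abs(lit) - 1] == (lit > 0)
        if not all(any(holds(l) for l in c) for c in hard):
            continue
        cost = sum(w for w, c in soft if not any(holds(l) for l in c))
        if best_cost is None or cost < best_cost:
            best_cost, best_model = cost, bits
    return best_cost, best_model


def decode(atom_ids, model):
    """Map each variable to the parent set whose family atom is true."""
    return {key[1]: key[2] for key, idx in atom_ids.items()
            if key[0] == "fam" and model[idx - 1]}


atom_ids, hard, soft = build_encoding(FAMILY_SCORES)
cost, model = brute_force(len(atom_ids), hard, soft)
parents = decode(atom_ids, model)
print(parents, -cost)
```

With these made-up scores the optimum is the chain A → B → C with total score -25.0. Using one atom per unordered pair keeps the order part of the encoding small, leaving transitivity as the only order constraint that needs explicit clauses. A realistic instance would be written out with `to_wcnf` and passed to a local search solver rather than enumerated.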
---
PDF link:
https://arxiv.org/pdf/1206.3244

