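Abstract (translated from the Chinese version):
Although many real-world stochastic planning problems are more naturally described by hybrid models that mix discrete and continuous variables, current state-of-the-art methods cannot adequately solve them. We present the first framework that exploits problem structure to model and solve hybrid problems efficiently. We formulate these problems as hybrid Markov decision processes (MDPs with continuous and discrete state and action variables), which we assume can be represented in factored form by a hybrid dynamic Bayesian network (hybrid DBN). This formulation also lets us apply our methods to collaborative multiagent settings. We present a new linear programming approximation method that exploits the structure of the hybrid MDP and lets us compute approximate value functions more efficiently. In particular, we describe a new factored discretization of continuous variables that avoids the exponential blow-up of traditional approaches. We give theoretical bounds on the quality of this approximation and on its scale-up potential. We support our theoretical arguments with experiments on a set of control problems with up to 28-dimensional continuous state space and 22-dimensional action space.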
---
English title:
Solving Factored MDPs with Continuous and Discrete Variables
---
Authors:
Carlos E. Guestrin, Milos Hauskrecht, Branislav Kveton
---
Year of latest submission:
2012
---
Classification:
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
Original English abstract:
Although many real-world stochastic planning problems are more naturally formulated by hybrid models with both discrete and continuous variables, current state-of-the-art methods cannot adequately address these problems. We present the first framework that can exploit problem structure for modeling and solving hybrid problems efficiently. We formulate these problems as hybrid Markov decision processes (MDPs with continuous and discrete state and action variables), which we assume can be represented in a factored way using a hybrid dynamic Bayesian network (hybrid DBN). This formulation also allows us to apply our methods to collaborative multiagent settings. We present a new linear program approximation method that exploits the structure of the hybrid MDP and lets us compute approximate value functions more efficiently. In particular, we describe a new factored discretization of continuous variables that avoids the exponential blow-up of traditional approaches. We provide theoretical bounds on the quality of such an approximation and on its scale-up potential. We support our theoretical arguments with experiments on a set of control problems with up to 28-dimensional continuous state space and 22-dimensional action space.
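The "exponential blow-up" mentioned in the abstract can be illustrated by simple counting. The sketch below is not the paper's algorithm: it only compares the number of grid points in a joint discretization of all continuous variables with the number needed when each small subset of variables is discretized on its own. The 10 grid points per variable and the chain-shaped pairwise scopes are assumptions chosen purely for illustration.

```python
from math import prod


def full_grid_size(points_per_var):
    """Grid points when all variables are discretized jointly."""
    return prod(points_per_var)


def factored_grid_size(points_per_var, scopes):
    """Total grid points when each scope (a small subset of variables)
    is discretized separately and the per-scope grids are summed."""
    return sum(prod(points_per_var[i] for i in scope) for scope in scopes)


n_vars = 28                      # matches the largest state dimension cited
points = [10] * n_vars           # assumed: 10 grid points per variable
# Assumed (hypothetical) structure: each scope covers two neighbouring variables.
scopes = [(i, i + 1) for i in range(n_vars - 1)]

full = full_grid_size(points)
factored = factored_grid_size(points, scopes)

print(f"joint grid:    {full:.3e} points")
print(f"factored grid: {factored} points")
```

Under these assumptions the joint grid grows exponentially with the number of variables, while the factored count grows linearly with the number of scopes and exponentially only in the size of the largest scope.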
---
PDF link:
https://arxiv.org/pdf/1207.4150

