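Abstract (translated):
Counterfactual Regret Minimization (CFR) is an efficient no-regret learning algorithm for decision problems modeled as extensive-form games. CFR's regret bounds rely on perfect recall: players always remember the information revealed to them and the order in which it was revealed. In games without perfect recall, CFR's guarantees do not apply. This paper gives the first regret bound for CFR applied to a general class of imperfect-recall games. It also shows that applying CFR to any abstraction in this class yields a regret bound for the full game, not only for the abstract game. The theory is verified experimentally in three domains (die-roll poker, phantom tic-tac-toe, and Bluff), where imperfect recall trades a small increase in regret for a significant reduction in memory.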
---
English title:
No-Regret Learning in Extensive-Form Games with Imperfect Recall
---
Authors:
Marc Lanctot, Richard Gibson, Neil Burch, Martin Zinkevich, and Michael Bowling
---
Year of latest submission:
2012
---
Classification:
Primary category: Computer Science
Secondary category: Computer Science and Game Theory
Category description: Covers all theoretical and applied aspects at the intersection of computer science and game theory, including work in mechanism design, learning in games (which may overlap with Learning), foundations of agent modeling in games (which may overlap with Multiagent systems), coordination, specification and formal methods for non-cooperative computational environments. The area also deals with applications of game theory to areas such as electronic commerce.
--
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
---
English abstract:
Counterfactual Regret Minimization (CFR) is an efficient no-regret learning algorithm for decision problems modeled as extensive games. CFR's regret bounds depend on the requirement of perfect recall: players always remember information that was revealed to them and the order in which it was revealed. In games without perfect recall, however, CFR's guarantees do not apply. In this paper, we present the first regret bound for CFR when applied to a general class of games with imperfect recall. In addition, we show that CFR applied to any abstraction belonging to our general class results in a regret bound not just for the abstract game, but for the full game as well. We verify our theory and show how imperfect recall can be used to trade a small increase in regret for a significant reduction in memory in three domains: die-roll poker, phantom tic-tac-toe, and Bluff.
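As a rough illustration of the "no-regret learning" idea the abstract refers to, the sketch below runs regret matching, the per-decision update that CFR is commonly described as building on, in self-play on rock-paper-scissors. It is not the paper's algorithm and does not model extensive-form games or imperfect recall. The toy game, the function names `regret_matching` and `self_play`, and the `init_regret` parameter are all assumptions made for this example.

```python
# Regret matching on a toy normal-form game (rock-paper-scissors).
# Illustrative sketch only: the game and all names here are assumptions,
# not taken from the paper.

def regret_matching(cum_regret):
    """Return a strategy proportional to positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    return [1.0 / len(cum_regret)] * len(cum_regret)

# PAYOFF[a][b]: payoff to a player choosing action a against action b.
# The game is symmetric, so the same table serves both players.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def self_play(iterations, init_regret=(1.0, 0.0, 0.0)):
    """Run both players with regret matching; return average strategies."""
    # Player 0 starts biased toward the first action so play is not
    # trivially uniform from the first iteration.
    regrets = [list(init_regret), [0.0, 0.0, 0.0]]
    strategy_sum = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for p in (0, 1):
            opp = strategies[1 - p]
            # Expected value of each action against the opponent's strategy.
            action_values = [sum(PAYOFF[a][b] * opp[b] for b in range(3))
                             for a in range(3)]
            expected = sum(strategies[p][a] * action_values[a]
                           for a in range(3))
            for a in range(3):
                # Regret: how much better action a would have done.
                regrets[p][a] += action_values[a] - expected
                strategy_sum[p][a] += strategies[p][a]
    return [[s / iterations for s in row] for row in strategy_sum]
```

Calling `self_play(10000)` returns the two players' average strategies, which should approach the uniform equilibrium of this game as the number of iterations grows. CFR applies an update of this kind at every information set of an extensive-form game, which is where the perfect-recall assumption discussed in the abstract enters.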
---
PDF link:
https://arxiv.org/pdf/1205.0622