Original poster: mingdashike22

[Computer Science] On the Possibility of Learning in Reactive Environments with Arbitrary Dependence

Posted on 2022-3-4 22:58:30, from mobile

Title:
"On the Possibility of Learning in Reactive Environments with Arbitrary Dependence"
---
Authors:
Daniil Ryabko and Marcus Hutter
---
Year of latest submission:
2008
---
Classification:

Primary category: Computer Science
Secondary category: Machine Learning
Category description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Primary category: Computer Science
Secondary category: Information Theory
Category description: Covers theoretical and experimental aspects of information theory and coding. Includes material in ACM Subject Class E.4 and intersects with H.1.1.
--
Primary category: Mathematics
Secondary category: Information Theory
Category description: math.IT is an alias for cs.IT. Covers theoretical and experimental aspects of information theory and coding.
---
Abstract:
We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e. environments more general than (PO)MDPs. The task for an agent is to attain the best possible asymptotic reward where the true generating environment is unknown but belongs to a known countable family of environments. We find some sufficient conditions on the class of environments under which an agent exists which attains the best asymptotic reward for any environment in the class. We analyze how tight these conditions are and how they relate to different probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.
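
One way to make the abstract's "best possible asymptotic reward" concrete is sketched below. The notation here ($\mathcal{C}$, $\nu$, $p$, $r_i^{p\nu}$, $V^*_\nu$) is an assumption introduced for illustration and may not match the paper's own definitions; the linked PDF has the precise statements.

```latex
% Illustrative notation only (assumed, not taken from the post):
%   \mathcal{C}  = the known countable class of environments
%   \nu          = the true (unknown) environment, \nu \in \mathcal{C}
%   p            = the agent's policy
%   r_i^{p\nu}   = reward received at step i when p interacts with \nu
\[
  \bar{r}^{\,p\nu}_m \;=\; \frac{1}{m}\sum_{i=1}^{m} r_i^{p\nu}
  \qquad \text{(average reward over the first $m$ steps)}
\]
% V^*_\nu = best asymptotic average reward attainable in \nu by any policy.
% A single policy p (chosen without knowing \nu) attains the best
% asymptotic reward on the class if, for every \nu \in \mathcal{C},
\[
  \liminf_{m\to\infty} \bar{r}^{\,p\nu}_m \;\ge\; V^*_\nu
  \quad \text{with $\nu$-probability } 1 .
\]
```

Read this way, the paper's question is which conditions on the class $\mathcal{C}$ guarantee that such a single policy exists.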
---
PDF link:
https://arxiv.org/pdf/0810.5636
