Title:
The QLBS Q-Learner Goes NuQLear: Fitted Q Iteration, Inverse RL, and Option Portfolios
---
Author:
Igor Halperin
---
Year of latest submission:
2018
---
Abstract:
The QLBS model is a discrete-time option hedging and pricing model that is based on Dynamic Programming (DP) and Reinforcement Learning (RL). It combines the famous Q-Learning method for RL with the Black-Scholes (-Merton) model's idea of reducing the problem of option pricing and hedging to the problem of optimal rebalancing of a dynamic replicating portfolio for the option, which is made of a stock and cash. Here we expand on several NuQLear (Numerical Q-Learning) topics with the QLBS model. First, we investigate the performance of Fitted Q Iteration for a RL (data-driven) solution to the model, and benchmark it versus a DP (model-based) solution, as well as versus the BSM model. Second, we develop an Inverse Reinforcement Learning (IRL) setting for the model, where we only observe prices and actions (re-hedges) taken by a trader, but not rewards. Third, we outline how the QLBS model can be used for pricing portfolios of options, rather than a single option in isolation, thus providing its own, data-driven and model independent solution to the (in)famous volatility smile problem of the Black-Scholes model.
---
Classification:
Top-level category: Quantitative Finance
Subcategory: Computational Finance
Category description: Computational methods, including Monte Carlo, PDE, lattice and other numerical methods with applications to financial modeling
--
Top-level category: Computer Science
Subcategory: Machine Learning
Category description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--