MARKET EFFICIENCY IN THE AGE OF BIG DATA
Authors:
Ian Martin
Stefan Nagel
Modern investors face a high-dimensional prediction problem: thousands of observable variables are potentially relevant for forecasting. We reassess the conventional wisdom on market efficiency in light of this fact. In our model economy, which resembles a typical machine learning setting, N assets have cash flows that are a linear function of J firm characteristics, but with uncertain coefficients. Risk-neutral Bayesian investors impose shrinkage (ridge regression) or sparsity (Lasso) when they estimate the J coefficients of the model and use them to price assets. When J is comparable in size to N, returns appear cross-sectionally predictable using firm characteristics to an econometrician who analyzes data from the economy ex post. A factor zoo emerges even without p-hacking and data-mining. Standard in-sample tests of market efficiency reject the no-predictability null with high probability, despite the fact that investors optimally use the information available to them in real time. In contrast, out-of-sample tests retain their economic meaning.
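The mechanism described in the abstract can be illustrated with a small simulation. What follows is a minimal sketch under simplifying assumptions, not the authors' model or code: cash-flow noise has unit variance, the prior on the J coefficients is N(0, theta/J), a "return" is taken to be the cash-flow surprise relative to the investors' forecast, and the function names (`ridge_posterior_mean`, `simulate`) and all parameter values are hypothetical. Investors forecast with the Bayesian posterior mean, which under these assumptions coincides with ridge regression. The econometrician then runs an in-sample cross-sectional regression of returns on characteristics, and separately uses those estimated coefficients to trade in the following period.

```python
import numpy as np


def ridge_posterior_mean(X, y_bar, t, theta):
    """Posterior mean of the coefficients g after t periods of cash flows.

    Assumes prior g ~ N(0, theta/J * I) and unit-variance noise, in which
    case the posterior mean is a ridge regression of the time-averaged cash
    flows y_bar on X with penalty J / (theta * t).
    """
    J = X.shape[1]
    penalty = (J / (theta * t)) * np.eye(J)
    return np.linalg.solve(X.T @ X + penalty, X.T @ y_bar)


def simulate(N=200, J=100, t_obs=1, theta=1.0, n_sims=500, seed=0):
    """Return (in-sample rejection rate, mean out-of-sample strategy return)."""
    rng = np.random.default_rng(seed)
    # 5% critical value of chi-square(J), approximated by Monte Carlo
    crit = np.quantile(rng.chisquare(J, size=200_000), 0.95)
    rejections = 0
    oos_returns = []
    for _ in range(n_sims):
        X = rng.standard_normal((N, J))                  # firm characteristics
        g = rng.standard_normal(J) * np.sqrt(theta / J)  # true coefficients, drawn from the prior
        # t_obs periods of history plus two further periods of cash flows
        Y = X @ g + rng.standard_normal((t_obs + 2, N))

        # Returns are cash-flow surprises relative to the investors' forecast
        g_hat_1 = ridge_posterior_mean(X, Y[:t_obs].mean(axis=0), t_obs, theta)
        r1 = Y[t_obs] - X @ g_hat_1
        g_hat_2 = ridge_posterior_mean(X, Y[:t_obs + 1].mean(axis=0), t_obs + 1, theta)
        r2 = Y[t_obs + 1] - X @ g_hat_2

        # In-sample test: OLS of r1 on X; Wald statistic is chi-square(J)
        # under the econometrician's null that returns are pure noise
        XtX = X.T @ X
        h = np.linalg.solve(XtX, X.T @ r1)
        wald = h @ XtX @ h
        rejections += int(wald > crit)

        # Out-of-sample test: trade next period's returns r2 using the
        # predictions X @ h formed from the earlier cross-section r1
        oos_returns.append((X @ h) @ r2 / N)
    return rejections / n_sims, float(np.mean(oos_returns))
```

Calling `simulate()` with the default (hypothetical) parameters returns the share of simulated economies in which the in-sample test rejects at the nominal 5% level, together with the average out-of-sample strategy return. In this sketch the in-sample test should reject far more often than 5%, because the investors' estimation error in the coefficients shows up ex post as predictability, while the out-of-sample average should be close to zero, since the forecasting coefficients depend only on data the investors had already conditioned on.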










