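Abstract (translation):
\textit{Binscatter}, or the binned scatter plot, is a very popular tool in applied microeconomics. It offers a flexible yet parsimonious way to visualize and summarize means, quantiles, and other nonparametric regression functions in large data sets. It is also often used to evaluate substantive hypotheses informally, such as linearity or monotonicity of the unknown function. This paper presents a foundational econometric analysis of binscatter, offering a range of theoretical and practical results that help in understanding current practice (that is, its validity or lack thereof) and in guiding future applications. In particular, we highlight important methodological problems with the covariate adjustment methods used in current practice, and provide a simple, valid approach. Our results include a principled choice of the number of bins, confidence intervals and bands, and hypothesis tests of parametric and shape restrictions on means, quantiles, and other functions of interest, along with other new methods, all applicable to canonical binscatter as well as to its nonlinear, higher-order polynomial, smoothness-restricted, and covariate-adjusted extensions. Companion general-purpose software packages for \texttt{Python}, \texttt{R}, and \texttt{Stata} are provided. From a technical perspective, we present novel theoretical results for possibly nonlinear semiparametric partitioning-based series estimation with random partitions, which are of independent interest.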
---
English title:
On Binscatter
---
Authors:
Matias D. Cattaneo, Richard K. Crump, Max H. Farrell, Yingjie Feng
---
Year of latest submission:
2021
---
Classification:
Primary category: Economics
Secondary category: Econometrics
Category description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--
Primary category: Statistics
Secondary category: Methodology
Category description: Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
--
Primary category: Statistics
Secondary category: Machine Learning
Category description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding
--
---
Abstract (original English):
\textit{Binscatter}, or a binned scatter plot, is a very popular tool in applied microeconomics. It provides a flexible, yet parsimonious way of visualizing and summarizing mean, quantile, and other nonparametric regression functions in large data sets. It is also often used for informal evaluation of substantive hypotheses such as linearity or monotonicity of the unknown function. This paper presents a foundational econometric analysis of binscatter, offering an array of theoretical and practical results that aid both understanding current practices (i.e., their validity or lack thereof) as well as guiding future applications. In particular, we highlight important methodological problems related to covariate adjustment methods used in current practice, and provide a simple, valid approach. Our results include a principled choice for the number of bins, confidence intervals and bands, hypothesis tests for parametric and shape restrictions for mean, quantile, and other functions of interest, among other new methods, all applicable to canonical binscatter as well as to nonlinear, higher-order polynomial, smoothness-restricted and covariate-adjusted extensions thereof. Companion general-purpose software packages for \texttt{Python}, \texttt{R}, and \texttt{Stata} are provided. From a technical perspective, we present novel theoretical results for possibly nonlinear semi-parametric partitioning-based series estimation with random partitions that are of independent interest.
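
As a rough illustration of what a canonical binned scatter plot computes, the sketch below sorts the data by x, splits it into bins holding roughly equal numbers of observations, and reports the sample mean of x and y within each bin. It is a minimal sketch under stated assumptions, not the paper's method or its companion software: the function name `binscatter_means`, the user-chosen `n_bins` parameter, and the simulated data are all hypothetical, and the sketch omits the data-driven bin selection, covariate adjustment, confidence bands, and hypothesis tests that the abstract describes.

```python
import math
import random


def binscatter_means(x, y, n_bins=10):
    """Return (mean_x, mean_y, count) for each of n_bins rank-based bins.

    Hypothetical helper for illustration only. Bins are formed by rank, so
    each holds roughly the same number of observations; tied x values may
    therefore be split across neighbouring bins.
    """
    pairs = sorted(zip(x, y))  # order observations by x
    n = len(pairs)
    points = []
    for j in range(n_bins):
        lo = (j * n) // n_bins          # first rank in bin j
        hi = ((j + 1) * n) // n_bins    # one past the last rank in bin j
        chunk = pairs[lo:hi]
        if not chunk:                   # skip empty bins when n < n_bins
            continue
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        points.append((mean_x, mean_y, len(chunk)))
    return points


# Simulated data (an assumption for the example): a nonlinear mean plus noise.
random.seed(0)
x = [random.random() for _ in range(2000)]
y = [math.sin(2 * math.pi * xi) + random.gauss(0.0, 0.5) for xi in x]

points = binscatter_means(x, y, n_bins=20)
for mean_x, mean_y, count in points:
    print(f"{mean_x:.3f}  {mean_y:+.3f}  (n={count})")
```

Plotting `mean_y` against `mean_x` gives the familiar dots of a binned scatter plot. Here the number of bins is fixed by hand, whereas the abstract describes a principled choice for it.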
---
PDF link:
https://arxiv.org/pdf/1902.09608