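Abstract (translated):
The growing digitization of political speech has opened the door to studying new dimensions of political behavior through text analysis. This work examines the value of word-level statistics from the US Congressional Record, which contains the full text of all speeches made in the US Congress, for studying the ideological positions and behavior of senators. Applying machine learning techniques, we use these data to automatically classify senators by party, obtaining 70-95% accuracy depending on the specific method used. We also show that using text to predict DW-NOMINATE scores, a common proxy for ideology, does not improve on these already-successful results. Classification deteriorates when applied to text from sessions of Congress four or more years removed from the training set, which suggests that voters need to dynamically update the heuristics they use to evaluate party on the basis of political speech. Text-based predictions are less accurate than those based on voting behavior, supporting the theory that roll-call votes represent a greater commitment on the part of politicians and therefore reflect their ideological preferences more accurately. Nevertheless, the overall success of the machine learning methods studied here shows that political speech is highly predictive of party affiliation. Beyond these findings, this work also introduces computational tools and methods relevant to working with political speech data.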
---
English title:
"Read My Lips": Using Automatic Text Analysis to Classify Politicians by Party and Ideology
---
Author:
Eitan Sapiro-Gheiler
---
Year of latest submission:
2018
---
Categories:
Primary category: Economics
Subcategory: General Economics
Description: General methodological, applied, and empirical contributions to economics.
--
Primary category: Computer Science
Subcategory: Computation and Language
Description: Covers natural language processing. Roughly includes material in ACM Subject Class I.2.7. Note that work on artificial languages (programming languages, logics, formal systems) that does not explicitly address natural-language issues broadly construed (natural-language processing, computational linguistics, speech, text retrieval, etc.) is not appropriate for this area.
--
Primary category: Quantitative Finance
Subcategory: Economics
Description: q-fin.EC is an alias for econ.GN. Economics, including micro and macro economics, international economics, theory of the firm, labor economics, and other economic topics outside finance.
--
---
English abstract:
The increasing digitization of political speech has opened the door to studying a new dimension of political behavior using text analysis. This work investigates the value of word-level statistical data from the US Congressional Record--which contains the full text of all speeches made in the US Congress--for studying the ideological positions and behavior of senators. Applying machine learning techniques, we use this data to automatically classify senators according to party, obtaining accuracy in the 70-95% range depending on the specific method used. We also show that using text to predict DW-NOMINATE scores, a common proxy for ideology, does not improve upon these already-successful results. This classification deteriorates when applied to text from sessions of Congress that are four or more years removed from the training set, pointing to a need on the part of voters to dynamically update the heuristics they use to evaluate party based on political speech. Text-based predictions are less accurate than those based on voting behavior, supporting the theory that roll-call votes represent greater commitment on the part of politicians and are thus a more accurate reflection of their ideological preferences. However, the overall success of the machine learning approaches studied here demonstrates that political speeches are highly predictive of partisan affiliation. In addition to these findings, this work also introduces the computational tools and methods relevant to the use of political speech data.
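To make the idea of classifying speakers by party from word-level statistics concrete, here is a minimal sketch. The abstract does not say which machine learning methods the paper uses, so this is not the author's method: it swaps in a plain multinomial naive Bayes classifier with add-one smoothing, written against the Python standard library only. The party labels `A` and `B`, the toy speeches, and the function names `train` and `predict` are all hypothetical and exist purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train(speeches):
    """Collect per-party word counts from (party_label, text) pairs."""
    word_counts = defaultdict(Counter)  # party -> word -> count
    doc_counts = Counter()              # party -> number of speeches
    vocab = set()
    for party, text in speeches:
        tokens = text.lower().split()
        word_counts[party].update(tokens)
        doc_counts[party] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the party with the highest naive Bayes log-score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_party, best_score = None, -math.inf
    for party in doc_counts:
        # Log prior: share of training speeches from this party.
        score = math.log(doc_counts[party] / total_docs)
        # Add-one (Laplace) smoothing over the shared vocabulary.
        denom = sum(word_counts[party].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[party][tok] + 1) / denom)
        if score > best_score:
            best_party, best_score = party, score
    return best_party


# Hypothetical toy data; real input would be Congressional Record speeches.
TOY_SPEECHES = [
    ("A", "tax cuts help small business growth"),
    ("A", "cut taxes and reduce regulation"),
    ("B", "expand health care coverage for workers"),
    ("B", "protect workers and fund public health"),
]

model = train(TOY_SPEECHES)
print(predict(model, "reduce taxes for business"))
print(predict(model, "fund health care for workers"))
```

A model like this depends entirely on the vocabulary seen in training, which is one intuitive reason a classifier could degrade on text from sessions several years removed from the training set, as the abstract reports.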
---
PDF link:
https://arxiv.org/pdf/1809.00741









