[Other] Using traditional and digital data sources together in economic research

阿袋 posted on 2018-1-25 21:44:42

Using traditional and digital data sources together in economic research

Edward Glaeser, Hyunjin Kim, Michael Luca, 17 January 2018

Economic and policy research can often suffer from a scarcity of up-to-date data sources. This column explores the potential for digital data to supplement official government statistics by providing more up-to-date snapshots of the economy. A comparison of data from Yelp with US County Business Patterns data reveals that the Yelp data provide a good indication of underlying economic trends. But although digital data from online platforms offer faster and geographically detailed images of the economy, they should be seen as a complement to, rather than a substitute for, official government statistics.




阿袋 posted on 2018-1-25 21:46:18

How is Boston’s economy doing? According to the latest data available as of the end of 2017, employment in Suffolk County (which contains Boston) rose from 572,000 to 595,000 between 2014 and 2015. But data from 2015 may seem awfully out of date for policymakers, investors, and voters in Boston’s 2017 mayoral election. This speaks to a broader problem that stymies research and policy alike – it is valuable to have a sense of how the economy is doing this year or even this month, but official government data sources are often released after notoriously long lags.

Traditional public data sources on local economies, including County Business Patterns and other Census products, are typically available after a multi-year lag, and can take even longer when looking for more granular data that come with restricted access. However, private data from platforms such as Yelp, Google, and LinkedIn are essentially available in real time. This raises the potential for digital data to supplement official government statistics by providing more up-to-date snapshots of the economy. In recent years, researchers have used data from online platforms to predict economic outcomes such as inflation, unemployment claims, housing prices, and entrepreneurship (Choi and Varian 2012, Cavallo 2012, Einav and Levin 2014, Wu and Brynjolfsson 2015, Guzman and Stern 2016, Glaeser et al. 2017), as well as to improve policy evaluations (Luca and Luca 2017).

But if online data sources are going to be used to quantify economic activity, it’s important to understand how they compare to the key statistical datasets that have historically been used to measure the economy. With this in mind, we set out to explore the potential for digital data to predict traditional measures of the economy that are widely used by policymakers and academics, and evaluate the conditions under which the data can accurately measure changes in the economy.

In a recent paper, we illustrate how data from Yelp can provide an up-to-date snapshot of the economy of a city or neighbourhood (Glaeser et al. 2017). Yelp’s data can help predict contemporaneous changes in ZIP code-level establishment growth, especially in higher income, higher density parts of America.

Yelp's coverage of consumer-facing retail establishments is much better than its coverage of other sectors. By 2015, Yelp had reviews on 1.4 million businesses, equal to 18% of the number of establishments listed in County Business Patterns. In the restaurant sector, Yelp covers 576,000 restaurants in 22,719 ZIP codes, while County Business Patterns has 542,000 restaurants in 24,790 ZIP codes [1]. Figure 1 shows a map of Yelp’s coverage across the US in 2015.
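The coverage figures above can be turned into simple ratios. The short sketch below only redoes the arithmetic on the rounded numbers quoted in the text; the variable names are illustrative, and the implied County Business Patterns total is a back-of-the-envelope figure derived from the 18% share, not a number reported in the column.

```python
# Figures as quoted in the text (rounded as reported there).
yelp_businesses = 1_400_000
yelp_share_of_cbp = 0.18          # Yelp businesses as a share of CBP establishments
yelp_restaurants, cbp_restaurants = 576_000, 542_000
yelp_zips, cbp_zips = 22_719, 24_790

# Back out the approximate CBP establishment count implied by the 18% share.
implied_cbp_establishments = yelp_businesses / yelp_share_of_cbp

# Ratios of Yelp to CBP in the restaurant sector.
restaurant_ratio = yelp_restaurants / cbp_restaurants
zip_ratio = yelp_zips / cbp_zips

print(f"Implied CBP establishments: {implied_cbp_establishments:,.0f}")
print(f"Yelp / CBP restaurants:     {restaurant_ratio:.3f}")
print(f"Yelp / CBP ZIP codes:       {zip_ratio:.3f}")
```

On these numbers, Yelp lists slightly more restaurants than County Business Patterns while reaching somewhat fewer ZIP codes, which is consistent with coverage being concentrated in particular areas.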



阿袋 posted on 2018-1-25 21:46:44

Figure 1 Yelp coverage of CBP restaurants by ZIP code in 2015

We test whether Yelp data predict establishment growth in County Business Patterns for the years prior to 2015 at the ZIP code level. There is strong persistence in establishment growth rates, and so we control for two years of lags in County Business Patterns establishment growth, which, along with year fixed effects, can explain 14.8% of the variation in establishment growth across US ZIP codes. Adding Yelp data from the year in question boosts the r-squared to 22.5%. Our point estimate suggests that one extra Yelp business is associated with 0.6 more businesses in County Business Patterns. In many cases, including today, we don’t even have the one-year lag of County Business Patterns growth, which means that the marginal contribution of Yelp data would be even larger.
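The comparison described above, a baseline regression with lags and year fixed effects against the same regression plus a contemporaneous Yelp variable, can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' specification or dataset: the sample size, noise level, year effects, and the coefficients used to generate the data (including the 0.6 on the Yelp variable, borrowed from the text purely as a target value) are all assumptions.

```python
import numpy as np

def ols(X, y):
    """Fit OLS by least squares; return (coefficients, R-squared).

    X must span a constant (here the full set of year dummies does).
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2

rng = np.random.default_rng(0)
n = 8000                                         # synthetic ZIP-year observations

# Synthetic regressors (NOT real data).
year = rng.integers(0, 4, size=n)                # hypothetical year index 0..3
year_dummies = np.eye(4)[year]                   # year fixed effects
lag1 = rng.normal(size=n)                        # CBP growth, one-year lag
lag2 = rng.normal(size=n)                        # CBP growth, two-year lag
yelp_growth = rng.normal(size=n)                 # contemporaneous Yelp growth

# Assumed data-generating process for the outcome.
year_effect = np.array([0.0, 0.2, -0.1, 0.3])[year]
cbp_growth = (0.3 * lag1 + 0.1 * lag2 + 0.6 * yelp_growth
              + year_effect + rng.normal(scale=2.0, size=n))

# Baseline: lags + year fixed effects. Full: baseline + Yelp.
X_base = np.column_stack([year_dummies, lag1, lag2])
X_full = np.column_stack([X_base, yelp_growth])

_, r2_base = ols(X_base, cbp_growth)
beta_full, r2_full = ols(X_full, cbp_growth)
yelp_coef = beta_full[-1]                        # last column is the Yelp variable

print(f"R-squared, baseline: {r2_base:.3f}")
print(f"R-squared, with Yelp: {r2_full:.3f}")
print(f"Estimated Yelp coefficient: {yelp_coef:.2f}")
```

The gap between the two R-squared values is the incremental explanatory power of the Yelp variable once lagged growth and year effects are already accounted for, which is the quantity the paragraph above reports.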

Yelp’s predictive power in the restaurant sector is even more impressive, because past restaurant growth doesn’t predict current restaurant growth. Year effects and lagged growth can explain less than 1% of the variation in restaurant growth rates across ZIP codes. Including growth in Yelp-reviewed restaurants boosts the r-squared to 11%. Using a richer set of Yelp variables increases the r-squared to almost 14%.

The fact that Yelp’s current data can predict current economic events suggests that patterns in Yelp data reflect patterns in the underlying economy, rather than simply patterns in the adoption of Yelp. At the same time, Yelp is more predictive in some places than in others, because this online source of information is not used uniformly across places. Richer, denser, and better-educated places have better Yelp coverage, presumably because they have a more internet-savvy population that likes to go out more. We find that one Yelp business is only associated with 0.2 extra County Business Patterns establishments in places that are poorly educated, low income, and less dense. The relatively low coefficient reflects Yelp’s weaker coverage in those areas. The coefficient rises to 0.5 in richer parts of the US, even if they are less dense or less educated. The coefficient increases to almost 0.75 when a ZIP code has money, education, and density. We expect that these spatial differences may decline if Yelp spreads further, but for now, ‘nowcasting’ with Yelp is safer in richer and better-educated cities, such as Boston. Yelp’s predictive power also differs by industry. It provides little ability to predict manufacturing growth, and does best in predicting retail establishment growth and the growth of business and professional services.  
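One way to see how a coefficient can differ across types of places is to estimate the same slope separately for each group. The sketch below does this on synthetic data only: the three group labels are hypothetical, and the slopes used to generate the data (0.2, 0.5, 0.75) are borrowed from the text as illustrative targets rather than re-estimated from any real source.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed "true" slopes per place type, taken from the figures quoted above.
true_slopes = {
    "low_income_low_density": 0.2,
    "higher_income": 0.5,
    "rich_educated_dense": 0.75,
}

def slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

estimates = {}
for group, b in true_slopes.items():
    # Synthetic changes in Yelp and CBP establishment counts for this group.
    yelp_change = rng.normal(size=3000)
    cbp_change = b * yelp_change + rng.normal(scale=1.0, size=3000)
    estimates[group] = slope(yelp_change, cbp_change)

for group, est in estimates.items():
    print(f"{group}: estimated slope {est:.2f}")
```

An equivalent design would pool all observations and interact the Yelp variable with group indicators; separate regressions are used here only because they are easier to read.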

Yelp provides timelier local economic data than County Business Patterns, but it also provides data at a more granular level than is available in the public-facing County Business Patterns data. Block-by-block, street-by-street, Yelp can provide policymakers with recent changes in economic geography. In principle, these data can be a useful input to real estate investors, builders, and even business owners rethinking their location. Furthermore, Yelp makes it possible to measure new outcomes that were never included in traditional data sources. To illustrate in the context of New York City, Figure 2 shows how Yelp data can be used to analyse the types of restaurants that open across neighbourhoods, looking at price levels of their menus.

Figure 2 Number of mid-to-expensive ($$+) Yelp restaurants opening per capita in 2015

Ultimately, digital data from online platforms can offer faster and geographically detailed images of the economy, but they are a complement to, rather than a substitute for, official government statistics. Our confidence in the value of Yelp data comes entirely from the fact that they are cross-validated with those statistics. The new big data frontier provides enormous opportunities, but it should never be an excuse for cutting the funding of traditional government data, which – when combined with new data sources – will provide a more complete picture of the economy.

References

Cavallo, A (2012), “Scraped Data and Sticky Prices”, MIT Sloan Working Paper.

Choi, H, and H Varian (2012), “Predicting the Present with Google Trends”, Economic Record 88: 2–9.

Einav, L, and J Levin (2014), “The Data Revolution and Economic Analysis”, Innovation Policy and the Economy 14.

Glaeser, E L, H Kim, and M Luca (2017), “Nowcasting the Local Economy: Using Yelp Data to Measure Economic Activity”, NBER Working Paper 24010.

Glaeser, E L, S D Kominers, M Luca, and N Naik (2018), “Big Data and Big Cities: The Promises and Limitations of Improved Measures of Urban Life”, Economic Inquiry 56(1): 114–137.

Guzman, J, and S Stern (2016), “Nowcasting and Placecasting Entrepreneurial Quality and Performance”, Working Paper.

Luca, D, and M Luca (2017), “Survival of the Fittest: The Impact of the Minimum Wage on Firm Exit”, Working Paper.

Wu, L, and E Brynjolfsson (2015), “The Future of Prediction: How Google Searches Foreshadow Housing Prices and Sales” in A Goldfarb, S M Greenstein, and C E Tucker (eds.), Economic Analysis of the Digital Economy, Chicago: University of Chicago Press.

Endnotes

[1] These Yelp numbers exclude any businesses in Yelp that are missing a ZIP code, price range, or any recommended reviews.

