Original poster: oliyiyi

How to Search for Census Data

Posted by oliyiyi on 2015-11-17 17:26:02

(This article was first published on AriLamstein.com » R, and kindly contributed to R-bloggers)

In my course Learn to Map Census Data in R I provide people with a handful of interesting demographics to analyze. This is convenient for teaching, but people often want to search for other demographic statistics. To address that, today I will work through an example of starting with a simple demographic question and using R to answer it.

Here is my question: I used to live in Japan, and to this day I still enjoy practicing Japanese with native speakers. If I wanted to move from San Francisco to a part of the country that has more Japanese people, where should I move?

Step 1: Find the Table for the Data

Data at the Census Bureau is stored in tables. One way to find the table for a particular metric is to use the function ?acs.lookup from the acs package. (Note that to run this code you will need to get and install a Census API key; I explain how to do that here.)


    > library(acs)
    > acs.lookup(keyword = "Japanese", endyear = 2013)
    An object of class "acs.lookup"
    endyear= 2013  ; span= 5

    results:
      variable.code table.number                                                                    table.name                                   variable.name
    1    B02006_009       B02006                                                Asian Alone By Selected Groups                                       Japanese
    2    B16001_069       B16001 Language Spoken at Home by Ability to Speak English for the Population 5+ Yrs                                      Japanese:
    3    B16001_070       B16001 Language Spoken at Home by Ability to Speak English for the Population 5+ Yrs            Japanese: Speak English 'very well'
    4    B16001_071       B16001 Language Spoken at Home by Ability to Speak English for the Population 5+ Yrs  Japanese: Speak English less than 'very well'



The search returns “Japanese” variables from two tables: the first (B02006) relates to race and the second (B16001) to language spoken at home. For simplicity, let’s focus on race (B02006). The “_009” at the end of B02006_009 indicates the column of the table; each column tabulates a different Asian group.
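To make the naming scheme concrete, here is a minimal sketch in base R that splits a variable code into its table number and column index. The variable names (`variable_code`, `table_number`, `column_idx`) are my own and are not part of the acs package.

```r
# Split an ACS variable code into its table number and column index.
# Names below are illustrative only; this uses base R, not the acs package.
variable_code <- "B02006_009"

parts <- strsplit(variable_code, "_", fixed = TRUE)[[1]]
table_number <- parts[1]             # the table, e.g. "B02006"
column_idx   <- as.integer(parts[2]) # the column within that table

print(table_number)
print(column_idx)
```

The same split works for the language variables, for example "B16001_069".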

Step 2: Get the Data

There are a few ways to get the data from that table into R. One way is to use the function ?acs.fetch in the acs package. If your end result is to map the data with the choroplethr package, however, you might find it easier to use the function ?get_acs_data in the choroplethr package:


    > library(choroplethr)
    > l = get_acs_data("B02006", "county", column_idx=9)

What’s returned is a list with 2 elements. The first element is a data frame with the (region, value) pairs. The second element is the title of the column:


    > str(l)
    List of 2
     $ df   :'data.frame': 3143 obs. of 2 variables:
      ..$ region: num [1:3143] 1001 1003 1005 1007 1009 ...
      ..$ value : num [1:3143] 10 25 0 0 0 0 0 103 2 19 ...
     $ title: chr "Asian Alone By Selected Groups: Japanese"
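If you want to experiment with this structure without an API key, the sketch below builds a small stand-in list by hand. It only mimics the shape shown by str() above (three rows copied from that output) and is not the real return value of get_acs_data.

```r
# A hand-built stand-in with the same shape as the list described above.
# The three rows are copied from the str() output; the rest is omitted.
l <- list(
  df = data.frame(region = c(1001, 1003, 1005),
                  value  = c(10, 25, 0)),
  title = "Asian Alone By Selected Groups: Japanese"
)

# Elements can be pulled out by position or by name; both give the same result.
df_by_position <- l[[1]]
df_by_name     <- l$df

print(names(l))
print(l$title)
```

Accessing by name tends to be easier to read later than accessing by position.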

Step 3: Analyze the Data

The first way to analyze the data is to simply look at the data frame:



    > df = l[[1]]
    > head(df)
      region value
    1   1001    10
    2   1003    25
    3   1005     0
    4   1007     0
    5   1009     0
    6   1011     0

People who have taken my course will recognize the regions as FIPS County Codes. We can use a boxplot to look at the distribution of values:

    boxplot(df$value)


I draw two conclusions from this chart: 1) the median is very low and 2) there are two very large outliers.
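The same two observations can be checked numerically rather than by eye. The sketch below is an illustration on a toy sample, not the full data: it uses the ten values shown by str() earlier plus the two largest counts from the ranking in this post. It relies on base R's median() and boxplot.stats(), which applies the usual 1.5 × IQR rule; on the full 3143-county data that rule may well flag many more points than the two that stand out visually.

```r
# Toy sample only: ten values from the str() output plus the two largest counts.
value <- c(10, 25, 0, 0, 0, 0, 0, 103, 2, 19, 150984, 103180)

# The median summarizes the "typical" county in this sample.
med <- median(value)

# boxplot.stats() returns the points a boxplot would draw as outliers
# (beyond 1.5 times the interquartile range from the hinges).
outliers <- boxplot.stats(value)$out

print(med)
print(outliers)
```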

To find out the names of the outliers we need to convert the FIPS Codes to English. We can do that by merging df with the data frame ?county.regions.

    > data(county.regions)
    > head(county.regions)
       region county.fips.character county.name state.name state.fips.character state.abb
    1    1001                 01001     autauga    alabama                   01        AL
    36   1003                 01003     baldwin    alabama                   01        AL
    55   1005                 01005     barbour    alabama                   01        AL
    15   1007                 01007        bibb    alabama                   01        AL
    2    1009                 01009      blount    alabama                   01        AL
    16   1011                 01011     bullock    alabama                   01        AL
    > df2 = merge(df, county.regions)
    > df2 = df2[order(-df2$value), ]
    > head(df2)
         region  value county.fips.character county.name state.name state.fips.character state.abb
    548   15003 150984                 15003    honolulu     hawaii                   15        HI
    205    6037 103180                 06037 los angeles california                   06        CA
    216    6059  33211                 06059      orange california                   06        CA
    229    6085  28144                 06085 santa clara california                   06        CA
    2971  53033  21493                 53033        king washington                   53        WA
    223    6073  18592                 06073   san diego california                   06        CA
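For readers who want to see the merge-and-sort pattern in isolation, here is a small self-contained sketch. Both data frames are hand-built from three rows that appear in this post; `counts`, `lookup`, `ranked` and the `fips` column are names I made up for illustration, and `lookup` is only a stand-in for county.regions.

```r
# Toy inputs: three (region, value) pairs and a matching lookup table,
# both copied from rows shown in this post.
counts <- data.frame(region = c(1001, 15003, 6037),
                     value  = c(10, 150984, 103180))
lookup <- data.frame(region      = c(1001, 6037, 15003),
                     county.name = c("autauga", "los angeles", "honolulu"),
                     state.abb   = c("AL", "CA", "HI"),
                     stringsAsFactors = FALSE)

# merge() joins on the shared column name ("region") by default.
ranked <- merge(counts, lookup)

# Sort from largest to smallest value.
ranked <- ranked[order(-ranked$value), ]

# Numeric FIPS codes drop the leading zero; sprintf() restores the
# five-character form used in county.fips.character.
ranked$fips <- sprintf("%05d", as.integer(ranked$region))

print(ranked)
```

Naming the join column explicitly with `by = "region"` is a reasonable habit when the two data frames might share other column names.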

So the outliers are Honolulu County and Los Angeles County. San Francisco isn’t even in the top 6. So if I ever decide to give up my career in technology for a career focused on Japanese, I should move to Honolulu!

It’s also easy to create a choropleth map of the values. This allows us to see the geographic distribution of the values.

    library(choroplethrMaps)
    county_choropleth(df, title = "2012 County Estimates:\nNumber of Japanese per County")


According to this map, by living on the west coast I am already in a part of the country with a high concentration of Japanese people.

Conclusion

If you wind up using this blog post to do an analysis of your own, or have difficulty adapting this code to your own purposes, please leave a comment below. I’m always interested in hearing what my readers are working on.

A final note to my Japanese friends: どう思いますか?アメリカで一番興味がある場所はホノルルとロサンゼルスですか?口コミしてください!





The post How to Search for Census Data from R appeared first on AriLamstein.com.



