Original poster: 夸克之一

[International Statistical Yearbook] Summary of links to world microdata databases (Markus Eberhardt)

11
夸克之一 posted on 2013-1-31 08:58:42
Household- and Individual-level/cohort data

Bob Baulch at the Chronic Poverty Research Centre at the University of Manchester has compiled an annotated listing of Household Panel Data Sets in Developing and Transition Countries, featuring among many others the data used for his own work in Pakistan, Vietnam and Bangladesh. The listing is by country and includes information on the waves/years, sample size and major references. [via DEVECONDATA by Masa Kudamatsu]

The International Household Survey Network provides access to over 3,400 household-level datasets. This includes data on everything from agriculture to child labour to LSMS, income, expenditure... In most cases the link takes you not straight to the data, but to the website of the project or organisation, so you may have to search around for a while.

The Bureau for Research and Economic Analysis of Development (BREAD) provides links to a large number of household-level datasets, including among others Family Life Surveys, University of North Carolina Surveys, University of Washington CSDE Vietnam Research Projects, Rural Economic and Demographic Survey (REDS), India Agriculture and Climate Data Set, Indian National Sample Survey Organization, Learning and Education Achievement in Punjab Schools, Colombian Familias en Accion, World Bank Living Standards Measurement Study.

The LSE's development department STICERD (The Suntory and Toyota International Centres for Economics and Related Disciplines) has a "virtual center" for fieldwork in Development Economics. This not only includes datasets and related materials (questionnaires etc.) but also resources related to methodology, including 'The Basics of Developing Questionnaires'.

The Rural Income Generating Activities (RIGA) project has created an internationally comparable database of household income sources from existing household living standards surveys for low and middle-income countries. Most of the surveys used by the RIGA project were developed by national statistical offices in conjunction with the World Bank as part of its Living Standards Measurement Study. The database is maintained by the FAO. At present the database incorporates 27 surveys covering 16 countries in Africa, Asia, Eastern Europe and Latin America. In addition RIGA provides a link to research papers that have used the data [thanks to Alberto Zezza at the FAO for letting me know].

Since 1984, the MEASURE DHS (Demographic and Health Surveys) project has provided technical assistance to more than 200 surveys in 75 countries, advancing global understanding of health and population trends in developing countries. DHS are funded by USAID with contributions from other donors. Data are currently collected under the umbrella of the Measure project which is administered by Macro International. Data have been collected in four waves: DHS-I (1986-90), DHS-II (1991-1992), DHS-III (1993-1997), Measure (1998-present).

As part of a project analyzing poverty and social assistance in the transition economies, a team at the World Bank under the guidance of Branko Milanovic has created HEIDE (Household Expenditure and Income Data for Transitional Economies), a very large integrated household and individual-level dataset for nine Eastern European economies in 1993. The (Stata) data covers expenditure, income, assets, household descriptives, individual characteristics and amounts to a total of around 3 million observations. There are files describing variables, data cleaning etc. and a link to a working paper about the project. [This link features on Stefania Lovo's website].

Britain's ippr in partnership with the Global Development Network (GDN) provides data from a major project on migration and development, aimed to assess migration’s impacts, collect evidence on those impacts, help to build research capacity on migration and development issues in developing countries and examine fresh policy options for improving migration’s contribution to development. Apart from rich qualitative data the researchers collected new nationally-representative household surveys in Colombia, Fiji, Georgia, Ghana, Jamaica, Macedonia and Vietnam. The final implemented survey questionnaires are also provided alongside the datasets, which are provided in Stata format. [This project was featured in a recent tweet by CGD's Michael Clemens @m_clem]

The World Bank's Living Standards Measurement Study (LSMS) offers publications, tools and most importantly access to household-level surveys it has been collecting since 1985.

The World Bank also has a dedicated African Household Survey Databank.

The Mexican Family Life Survey (MxFLS) is a multi-thematic and longitudinal database which collects, with a single scientific tool, a wide range of information on socioeconomic indicators, demographics and health indicators on the Mexican population. MxFLS is the first nationally representative Mexican survey with a longitudinal design, tracking the Mexican population for long periods of time regardless of migration decisions, with the objective of studying the dynamics of economy, demographics, epidemiology, and population migration throughout a panel study spanning at least 10 years. The data can be downloaded in Stata format.

The Washington-based Education Policy and Data Center (EPDC) "provides global education data, tools for data visualization, and policy-oriented analysis aimed at improving schools and learning in developing countries." They say they have "the world’s largest international education database with over 3.8 million data points from 200 countries. The data comes from national and international websites including household survey datasets as well as studies and reports." This is not just macro data, but also household surveys and census data; another very useful thing they do is to provide Stata do-files to construct indicators from the household data.



12
夸克之一 posted on 2013-1-31 08:59:11
The National Statistical Office of Bolivia provides access to a number of demographic and health surveys, as well as income expenditure surveys for the 1989-2009 period. The website is in Spanish and registration (free) is required. [Thanks to Gustavo Canavire-Bacarreza, graduate student at Georgia State in Atlanta, for the link]

UNICEF assists countries in collecting and analyzing data in order to fill data gaps for monitoring the situation of children and women through its international household survey initiative, the Multiple Indicator Cluster Surveys (MICS). The first round of MICS was conducted around 1995 in more than 60 countries; the second round was conducted in 2000 (around 65 surveys); the third round (50 countries) in 2005-06; the fourth round is scheduled for 2009-2011 and survey results are expected to be available from 2010 on. Data coverage: in MICS3, as in the previous rounds, three model questionnaires were developed: a household questionnaire, a questionnaire for women aged 15-49, and a questionnaire for children under the age of 5 (addressed to the mother or primary caretaker of the child). [via Sebastian Bauhoff @Harvard]

Conducted by the World Bank in January/February 2006 (covering 2005 but with some recall data for 2002) the Indonesian Rural Investment Climate Survey (RICS) is an in-depth, quantitative survey of 2549 non-farm enterprises, 2782 households and 149 communities in 6 rural Kabupaten. The RIC Survey data provides the first representative snapshot of the investment climate in six different types of rural Kabupaten, allowing policymakers to identify and address the key constraints to investment and growth. Data is provided in SPSS and Stata format, together with full documentation. [Via Masa Kudamatsu at DEVECONDATA]

The International Food Policy Research Institute (IFPRI) offers a wide range of household and community-level surveys on its data website. Chief among these is the set of Ethiopian Rural Household Surveys (ERHS), collected in 6 waves between 1989 and 2004, which is provided with all additional information, questionnaires etc. Note that despite the Amazon-style lingo ('Basket', 'Proceed to checkout') all you need to do is register on the site: then you can access/download all of the datasets featured. The datasets can also be accessed from the IFPRI Dataverse entry.

Chris Udry at Yale's Economic Growth Center (EGC) provides access to household survey data. The introduction to the surveys states that "The surveys would begin with a (clustered) random sample of approximately 5,000 households in 200 communities in rural and urban areas of each country. Every three years following the initial survey, a (stratified) random sample of each individual in the original 5,000 households would be followed for re-interviews." Other than the above document there is not much obvious documentation, but there is data for Ghana and Nigeria, some of it in Stata format (with do-files).

The Russia Longitudinal Monitoring Survey (RLMS) is a series of nationally representative surveys designed to monitor the effects of Russian reforms on the health and economic welfare of households and individuals in the Russian Federation. These effects are measured by a variety of means: detailed monitoring of individuals' health status and dietary intake, precise measurement of household-level expenditures and service utilization, and collection of relevant community-level data, including region-specific prices and community infrastructure data. Data have been collected 19 times since 1992. Of these, 15 represent the RLMS Phase II, which has been run jointly by the Carolina Population Center at the University of North Carolina at Chapel Hill, headed by Barry M. Popkin, and the Demoscope team in Russia, headed by Polina Kozyreva and Mikhail Kosolapov. You need to register to get access to the data and describe your research project. In return the website is probably one of the best I've come across for information about the data and what has been done with it. [This link features on Stefania Lovo's website].

The Townsend Thai Project (initiated and headed by Robert Townsend at MIT) data include both annual and monthly panels, in addition to the collection of environmental data. Originally the Townsend Thai survey focused on villages in four provinces, two in the Northeast and two in the Central region. The baseline survey was conducted in 1997. To date, the Townsend Thai project continues to resurvey the annual and monthly panels. In 2006, the annual surveys were extended to include urban areas in the same four provinces. In 2003, an annual survey of villages in the South was added and in 2004, two provinces in the North were included in the annual survey. The project emerged as a means to understand the broader economic and social context in which policies are enacted and research is conducted. Its goal is to build a bridge between policy and research by providing rich data from which academics and policy-makers alike can better understand household activities and behavior, as well as their relationship to the broader regional and national economy.

Sebastian Bauhoff at Harvard offers some links to household-level datasets for China at various US universities, including primarily data on health and population.

If you are interested in calorie consumption, you need to convert the amounts of food consumed (as collected in household surveys) into calories yourself. Annex 1 of the FAO (2001)'s Food Balance Sheets: A Handbook provides the conversion factors (how many kilocalories 100 grams of food contain) for a wide variety of foods for international use. Note that this data is contained in a pdf, not in Excel or Stata. For India consult Gopalan, Sastri, and Balasubramanian's book entitled Nutritive Value of Indian Foods (Hyderabad: National Institute of Nutrition, 1971) [thanks to Masa at DEVECONDATA, from which both of these links are lifted].
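
The conversion described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the FAO's own procedure: the factor values, the food names and the function names are hypothetical placeholders, and real factors would have to be typed in by hand from Annex 1 of the handbook pdf.

```python
# Hypothetical conversion factors in kcal per 100 g of food.
# These numbers are placeholders for illustration only; they are NOT
# taken from the FAO handbook and must be replaced with the real values.
KCAL_PER_100G = {
    "rice": 360.0,
    "lentils": 340.0,
}


def household_kcal(consumption_grams, factors=KCAL_PER_100G):
    """Total kilocalories for a dict of {food: grams consumed}."""
    total = 0.0
    for food, grams in consumption_grams.items():
        if food not in factors:
            # Fail loudly rather than silently dropping a food item.
            raise KeyError(f"no conversion factor for {food!r}")
        total += grams / 100.0 * factors[food]
    return total


def kcal_per_person_per_day(consumption_grams, household_size, recall_days,
                            factors=KCAL_PER_100G):
    """Average daily kilocalories per household member over the recall period."""
    if household_size <= 0 or recall_days <= 0:
        raise ValueError("household_size and recall_days must be positive")
    total = household_kcal(consumption_grams, factors)
    return total / (household_size * recall_days)


if __name__ == "__main__":
    # A made-up household: 7 kg of rice and 1 kg of lentils over a 7-day recall.
    week = {"rice": 7000.0, "lentils": 1000.0}
    print(kcal_per_person_per_day(week, household_size=4, recall_days=7))
```

Dividing by household size treats all members alike; many studies use adult-equivalent scales instead, which would replace `household_size` with a weighted count.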




13
夸克之一 posted on 2013-1-31 08:59:34
Nancy Qian at Yale has links to a number of Chinese household surveys on her website, including the China Health and Nutrition Survey (CHNS) at University of North Carolina Population Center as well as the familiar CHIP data (China Household Income Project) available through ICPSR.

Stefan Dercon at Oxford University provides links to a number of datasets he has helped collect, including a Rural Household Survey for Ethiopia (panel), the Kagera Health and Development Survey (Tanzania, panel) and ICRISAT data, as well as Young Lives (see separate entry below). Entirely unrelated, Stefan also provides this gem.

RAND has a number of Family Life Surveys on their website, including surveys for Malaysia, Indonesia, Guatemala and a region in Bangladesh called Matlab. The website gives a lot of information about the data available.

The Office of Population Research at Princeton University provides access to data from the Mexican Migration Project, the Latin American Migration Project and the World Fertility Surveys (WFS) which were conducted in 41 countries during the 1970s and early 1980s. This is a very good site to find out about data on fertility including the Chinese In-Depth Fertility Surveys.

The Young Lives project at the University of Oxford combines quantitative and qualitative data on childhood poverty in four developing countries. The study is being conducted in Ethiopia, India (in the state of Andhra Pradesh), Peru and Vietnam. The study aims to follow 2,000 children (aged approximately 1 year in 2002) and their households, from both urban and rural communities, in each of the four countries (8,000 children in total) for a period of 15 years. Quantitative waves are in 2002, 2006, 2009, 2012 and 2015, qualitative waves in 2007, 2009, 2012 and 2015. They've also created the 'Virtual Village', which is quite an effort to visualise data in a new format.

The Malawi Diffusion and Ideational Change Project (MDICP) is a collaboration by people at UPenn and two medical colleges in Malawi. The focus of the study is on the roles of social interactions in (1) the acceptance (or rejection) of modern contraceptive methods and of smaller ideal family size; and (2) the diffusion of knowledge of AIDS symptoms and transmission mechanisms and the evaluation of acceptable strategies of protection against AIDS. The website provides a great deal of information about this and a sister project in Kenya, including papers, qualitative surveys and the quantitative data. [featured by Masa on Devecondata]

14
夸克之一 posted on 2013-1-31 09:01:55
Disaggregated Conflict Data

ACLED (Armed Conflict Location and Events Dataset), compiled by the Centre for the Study of Civil War (CSCW) at the Peace Research Institute Oslo (PRIO), "is designed for disaggregated conflict analysis and crisis mapping. This dataset codes the location of all reported conflict events in 50 countries in the developing world. Data are currently being coded from 1997 to early 2010 and the project continues to backdate conflict information for African states to the year of independence. These data contain information on the date and location of conflict events, the type of event, the rebel and other groups involved, and changes in territorial control. Specifics on battles, killings, riots, and recruitment activities by rebels, governments, militias, armed groups, protesters and civilians are collected. Events are derived from a variety of sources, mainly concentrating on reports from war zones, humanitarian agencies, and research publications. These data can be used in any GIS, any mapping program, or statistical package." The website also provides links to existing research using this data. [Thanks to Anke Hoeffler at CSAE]

The Households in Conflict Network, funded by The Leverhulme Trust and supported by the Institute of Development Studies at Sussex, the German Institute for Economic Research (DIW) in Berlin and the University of Antwerp, has a Resource & Data website where they provide Philip Verwimp's dataset on victims of genocide in Kibuye, Rwanda (Stata file). This aside the site contains a lot of information on this research topic.

15
夸克之一 posted on 2013-1-31 09:02:30
Other surveys

The International Labour Organisation's (ILO) International Programme on the Elimination of Child Labour (IPEC) collects data on the extent, characteristics and determinants of child labour. The micro datasets (mostly cross-sections) are predominantly for African and Latin American countries (data for a total of 30 countries). Their website further contains additional documentation such as the questionnaires, publications and reports compiled from the data.

The Learning and Educational Achievement in Punjab Schools Survey (LEAPS) project is run by "the World Bank, Pomona College and Harvard University in collaboration with the Government of Punjab and highly trained local counterparts". "The LEAPS Survey consists of data from 823 schools in 112 villages in 3 districts of Punjab. [...] To measure learning outcomes, the LEAPS project administered detailed exams on English, Math, and Urdu to students in Grade III, then followed those same children and tested them again in Grade IV, Grade V, and Grade VI. Teachers were also tested and given extensive surveys so that child-learning outcomes could be linked to teacher qualifications, and parents were surveyed to provide information on educational contributions made at home."

Fellow CSAE member Andy Zeitlin provides data and background material on a project which investigates the impact of strengthening information flows on learning outcomes in rural, government primary schools in Uganda. "The baseline survey includes data collected in 100 schools, in 4 districts. This field exercise included collection of a school-level survey instrument, standardized testing of pupils in P3 and P6, and individual questionnaires administered to a sample of head teachers, teachers, School Management Committee members, and parents. Data from the baseline survey are available in Stata format, together with supporting documentation." You should also check out the papers Andy has written with my colleague Abigail Barr using the data, which are available in the 'Research' section of his website.




16
夸克之一 posted on 2013-1-31 09:10:31
Randomized and other Experiments

The Abdul Latif Jameel Poverty Action Lab (J-PAL) at MIT, home to the powerful new social science tool of randomized experiments, has links to the data used in some of the Randomistas' work. There are the datasets for the textbooks, remedial education, teacher absenteeism, women as policymakers, healthcare and microfinance experiments.

Data from the path-breaking conditional cash transfer randomised experiment Progresa (now renamed Oportunidades) in Mexico can be found here.

John List at University of Chicago has created a website where he lists "publications and discussion papers in experimental economics that make use of the 'field' in some manner". The information includes a link to the paper, year of publication and sometimes JEL codes. Papers are classified into three categories: "1. Artefactual field experiments, which are the same as conventional lab experiments but with a non-standard subject pool (i.e., non-students). Running Peruvian borrowers through lab games (Karlan, 2005 AER) would be an example of an artefactual field experiment. 2. Framed field experiments, which are identical to artefactual field experiments but with field context in either the commodity, task, or information set that the subjects use. An example would be work that elicits valuations for public goods that occur naturally in the environment of the subjects (see some of Bohm's work). 3. Natural field experiments, which are identical to framed field experiments except that the subjects do not know that they are participants in an experiment. An example could be found among the recent surge in fundraising experiments (see, e.g., List and Lucking-Reiley, 2002, JPE)."
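
The three categories quoted above are nested: each one adds a single feature to the previous one. A minimal Python sketch of that nesting might look as follows. The function name and the three boolean flags are hypothetical labels introduced here for illustration; they do not come from List's website.

```python
def classify_field_experiment(nonstandard_subjects, field_context,
                              subjects_unaware):
    """Classify an experiment following the nested definitions quoted above.

    nonstandard_subjects: subject pool is not the usual student pool
    field_context: field context in the commodity, task or information set
    subjects_unaware: subjects do not know they are in an experiment

    Simplifying assumption: each category presupposes the previous one,
    so the flags are checked in order and later flags are ignored once
    an earlier one is False.
    """
    if not nonstandard_subjects:
        return "conventional lab experiment"
    if not field_context:
        return "artefactual field experiment"
    if not subjects_unaware:
        return "framed field experiment"
    return "natural field experiment"


if __name__ == "__main__":
    # Lab games run with non-student borrowers, no field context.
    print(classify_field_experiment(True, False, False))
    # Fundraising study where donors do not know they are being studied.
    print(classify_field_experiment(True, True, True))
```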



17
夸克之一 posted on 2013-1-31 09:13:11
Miscellaneous

Joshua Angrist at MIT has made all of the datasets used in his papers available on [url=http://econ-www.mit.edu/faculty/angrist/data1/data]his website[/url].

Bob Allen's website at Nuffield has links to historical wage and price data for a number of countries, cities and occupations.

The Russia Longitudinal Monitoring Survey (RLMS) is a series of nationally representative surveys designed to monitor the effects of Russian reforms on the health and economic welfare of households and individuals in the Russian Federation. These effects are measured by a variety of means: detailed monitoring of individuals' health status and dietary intake; precise measurement of household-level expenditures and service utilization; and collection of relevant community-level data, including region-specific prices and community infrastructure data. Data have been collected sixteen times since 1992. The project is based at the University of North Carolina at Chapel Hill and directed by Barry Popkin.

18
wujun0329 posted on 2013-1-31 09:16:43
Thanks to the original poster for sharing.

19
hanxianfeng posted on 2013-1-31 10:55:26

20
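jiangbogz posted on 2013-1-31 10:59:51
Microdata links, bookmarking these first. Thanks, moderator.
Watch the flowers bloom and fall in the courtyard;
gaze at the clouds gather and scatter in the sky.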
