This report summarizes the matching of housing units in survey, commercial, and administrative records to the Census Bureau Master Address File (MAF). We document overall MAF match rates in each data set and evaluate differences in match rates across a variety of housing characteristics. Results show that over 90 percent of records in survey data from the American Housing Survey (AHS) match to the MAF. Commercial data from CoreLogic match at much lower rates, in part due to missing address information and poor match rates for multi-unit buildings. MAF match rates for administrative records from the Department of Housing and Urban Development are also high, opening the possibility of using this information in surveys such as the AHS.
-
Comparing the 2019 American Housing Survey to Contemporary Sources of Property Tax Records: Implications for Survey Efficiency and Quality
June 2022
Working Paper Number:
CES-22-22
Given rising nonresponse rates and concerns about respondent burden, government statistical agencies have been exploring ways to supplement household survey data collection with administrative records and other sources of third-party data. This paper evaluates the potential of property tax assessment records to improve housing surveys by comparing these records to responses from the 2019 American Housing Survey. Leveraging the U.S. Census Bureau's linkage infrastructure, we compute the fraction of AHS housing units that could be matched to a unique property parcel (coverage rate), as well as the extent to which survey and property tax data contain the same information (agreement rate). We analyze heterogeneity in coverage and agreement across states, housing characteristics, and 11 AHS items of interest to housing researchers. Our results suggest that partial replacement of AHS data with property data, targeted toward certain survey items or single-family detached homes, could reduce respondent burden without altering data quality. Further research into partial-replacement designs is needed and should proceed on an item-by-item basis. Our work can guide this research as well as those who wish to conduct independent research with property tax records that is representative of the U.S. housing stock.
-
Matching Addresses between Household Surveys and Commercial Data
July 2015
Working Paper Number:
carra-2015-04
Matching third-party data sources to household surveys can benefit household surveys in a number of ways, but the utility of these new data sources depends critically on our ability to link units between data sets. To understand this better, this report discusses modifications to the existing match process that could improve our matches. While many changes to the matching procedure produce marginal improvements in match rates, substantial increases in match rates can only be achieved by relaxing the definition of a successful match. In the end, the results show that the most important factor determining the success of matching procedures is the quality and composition of the data sets being matched.
-
Using Linked Survey and Administrative Data to Better Measure Income: Implications for Poverty, Program Effectiveness and Holes in the Safety Net
October 2015
Working Paper Number:
CES-15-35
We examine the consequences of underreporting of transfer programs in household survey data for several prototypical analyses of low-income populations. We focus on the Current Population Survey (CPS), the source of official poverty and inequality statistics, but provide evidence that our qualitative conclusions are likely to apply to other surveys. We link administrative data for food stamps, TANF, General Assistance, and subsidized housing from New York State to the CPS at the individual level. Program receipt in the CPS is missed for over one-third of housing assistance recipients, 40 percent of food stamp recipients, and 60 percent of TANF and General Assistance recipients. Dollars of benefits are also undercounted for reporting recipients, particularly for TANF, General Assistance, and housing assistance. We find that the survey data sharply understate the income of poor households, as conjectured in past work by one of the authors. Underreporting in the survey data also greatly understates the effects of anti-poverty programs and changes our understanding of program targeting, often making it seem that welfare programs are less targeted to both the very poorest and middle-income households than they are. Using the combined data rather than survey data alone, the poverty-reducing effect of all programs together is nearly doubled, while the effect of housing assistance is tripled. We also re-examine the coverage of the safety net, specifically the share of people without work or program receipt. Using the administrative measures of program receipt rather than the survey ones often reduces the share of single mothers falling through the safety net by one-half or more.
-
Correctional Facility and Inmate Locations: Urban and Rural Status Patterns
July 2017
Working Paper Number:
carra-2017-08
As the incarcerated population grew from the 1980s through the late 2000s, so too did the number of correctional facilities. An increasing number of these facilities have been constructed in rural areas. While research has shown there has been growth in prisons and prisoners in rural areas, there are no recent national-level statistics regarding the urban-rural status of correctional facilities and inmates, the urban-rural status of inmates prior to prison, or an accounting of how many inmates from urban or rural areas are incarcerated in urban and rural facilities. Using data from the 2010 decennial census and the Bureau of Justice Statistics' 2004 Survey of Prison Inmates, we describe these patterns. We find that a disproportionate share of prisons and inmates are located in rural areas, while a disproportionate share of inmates are from urban areas. Our research could inform discussions about the potential consequences of Census Bureau residence criteria for inmates.
-
Errors in Survey Reporting and Imputation and Their Effects on Estimates of Food Stamp Program Participation
April 2011
Working Paper Number:
CES-11-14
Benefit receipt in major household surveys is often underreported. This misreporting leads to biased estimates of the economic circumstances of disadvantaged populations, program takeup, the distributional effects of government programs, and other program effects. We use administrative data on Food Stamp Program (FSP) participation matched to American Community Survey (ACS) and Current Population Survey (CPS) household data. We show that nearly thirty-five percent of true recipient households do not report receipt in the ACS and fifty percent do not report receipt in the CPS. Misreporting, both false negatives and false positives, varies with individual characteristics, leading to complicated biases in FSP analyses. We then directly examine the determinants of program receipt using our combined administrative and survey data. The combined data allow us to examine accurate participation using individual characteristics missing in administrative data. Our results differ from conventional estimates using only survey data, as such estimates understate participation by single parents, non-whites, low-income households, and other groups. To evaluate the use of Census Bureau imputed ACS and CPS data, we also examine whether our estimates using survey data alone are closer to those using the accurate combined data when imputed survey observations are excluded. Interestingly, excluding the imputed observations leads to worse ACS estimates, but has less effect on the CPS estimates.
-
2010 American Community Survey Match Study
July 2014
Working Paper Number:
carra-2014-03
Using administrative records data from federal government agencies and commercial sources, the 2010 ACS Match Study measures administrative records coverage of 2010 ACS addresses, persons, and persons at addresses at different levels of geography as well as by demographic characteristics and response mode. The 2010 ACS Match Study represents a continuation of the research undertaken in the 2010 Census Match Study, the first national-level evaluation of administrative records data coverage. Preliminary results indicate that administrative records provide substantial coverage for addresses and persons in the 2010 ACS (92.7 and 92.1 percent, respectively), and less extensive, though still substantial, coverage for person-address pairs (74.3 percent). In addition, some variation in address, person, and/or person-address coverage is found across demographic and response mode groups. This research informs future uses of administrative records in survey and decennial census operations to address the increasing costs of data collection and declining response rates.
-
Evaluation of Commercial School and Teacher Lists to Enhance Survey Frames
July 2014
Working Paper Number:
carra-2014-07
This report summarizes the potential of teacher lists obtained from commercial vendors to enhance sampling frames for the National Teacher and Principal Survey (NTPS). We investigate three separate vendor lists and compare coverage rates across a range of school and teacher characteristics. Across all vendors, coverage rates are higher for regular, non-charter schools. Vendor A stands out as having higher coverage rates than the other two, and we recommend further evaluating Vendor A's teacher lists during the upcoming 2014-2015 NTPS Field Test.
-
Foreign-Born and Native-Born Migration in the U.S.: Evidence from IRS Administrative and Census Survey Records
July 2018
Working Paper Number:
carra-2018-07
This paper details efforts to link administrative records from the Internal Revenue Service (IRS) to American Community Survey (ACS) and 2010 Census microdata for the study of migration among foreign-born and native-born populations in the United States. Specifically, we (1) document our linkage strategy and methodology for inferring migration in IRS records; (2) model selection into and survival across IRS records to determine suitability for research applications; and (3) gauge the efficacy of the IRS records by demonstrating how they can be used to validate and potentially improve migration responses for native-born and foreign-born respondents in ACS microdata. Our results show little evidence of selection or survival bias in the IRS records, suggesting broad generalizability to the nation as a whole. Moreover, we find that the combined IRS 1040, 1099, and W2 records may provide important information on populations, such as the foreign-born, that may be difficult to reach with traditional Census Bureau surveys. Finally, while preliminary, the results of our comparison of IRS and ACS migration responses show that IRS records may be useful in improving ACS migration measurement for respondents whose migration response is proxy, allocated, or imputed. Taking these results together, we discuss the potential application of our longitudinal IRS dataset to innovations in migration research on both the native-born and foreign-born populations of the United States.
-
The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey
February 2025
Working Paper Number:
CES-25-13
The National Household Food Acquisition and Purchase Survey (FoodAPS), sponsored by the United States Department of Agriculture's (USDA) Economic Research Service (ERS) and Food and Nutrition Service (FNS), examines the food purchasing behavior of various subgroups of the U.S. population. These subgroups include participants in the Supplemental Nutrition Assistance Program (SNAP) and the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), as well as households that are eligible for but do not participate in these programs. Participants in these social protection programs constitute small proportions of the U.S. population; obtaining an adequate number of such participants in a survey would be challenging absent stratified sampling to target SNAP- and WIC-participating households. This document describes how the U.S. Census Bureau (which is planning to conduct future versions of the FoodAPS survey on behalf of USDA) created sampling strata to flag the FoodAPS targeted subpopulations using machine learning applications in linked survey and administrative data. We describe the data, modeling techniques, and how well the sampling flags target low-income households and households receiving WIC and SNAP benefits. We additionally situate these efforts in the nascent literature on the use of big data and machine learning for the improvement of survey efficiency.
-
Exploring Administrative Records Use for Race and Hispanic Origin Item Non-Response
December 2014
Working Paper Number:
carra-2014-16
Race and Hispanic origin data are required to produce official statistics in the United States. Data collected through the American Community Survey and decennial census address missing data through traditional imputation methods, often relying on information from neighbors. These methods work well if neighbors share similar characteristics; however, the shape and patterns of neighborhoods in the United States are changing. Administrative records may provide more accurate data than traditional imputation methods for missing race and Hispanic origin responses. This paper first describes the characteristics of persons with missing demographic data, then assesses the coverage of administrative records data for respondents who do not answer race and Hispanic origin questions in Census data. The paper also discusses the distributional impact of using administrative records race and Hispanic origin data to complete missing responses in a decennial census or survey context.