-
The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey
February 2025
Working Paper Number:
CES-25-13
The National Household Food Acquisition and Purchase Survey (FoodAPS), sponsored by the United States Department of Agriculture's (USDA) Economic Research Service (ERS) and Food and Nutrition Service (FNS), examines the food purchasing behavior of various subgroups of the U.S. population. These subgroups include participants in the Supplemental Nutrition Assistance Program (SNAP) and the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), as well as households who are eligible for but don't participate in these programs. Participants in these social protection programs constitute small proportions of the U.S. population; obtaining an adequate number of such participants in a survey would be challenging absent stratified sampling to target SNAP and WIC participating households. This document describes how the U.S. Census Bureau (which is planning to conduct future versions of the FoodAPS survey on behalf of USDA) created sampling strata to flag the FoodAPS targeted subpopulations using machine learning applications in linked survey and administrative data. We describe the data, modeling techniques, and how well the sampling flags target low-income households and households receiving WIC and SNAP benefits. We additionally situate these efforts in the nascent literature on the use of big data and machine learning for the improvement of survey efficiency.
-
Measuring Income of the Aged in Household Surveys: Evidence from Linked Administrative Records
June 2024
Working Paper Number:
CES-24-32
Research has shown that household survey estimates of retirement income (defined benefit pensions and defined contribution account withdrawals) suffer from substantial underreporting, which biases downward measures of financial well-being among the aged. Using data from both the redesigned 2016 Current Population Survey Annual Social and Economic Supplement (CPS ASEC) and the Health and Retirement Study (HRS), each matched with administrative records, we examine to what extent underreporting of retirement income affects key statistics such as reliance on Social Security benefits and poverty among the aged. We find that underreporting of retirement income is still prevalent in the CPS ASEC. While the HRS does a better job than the CPS ASEC in terms of capturing retirement income, it still falls considerably short compared to administrative records. Consequently, the relative importance of Social Security income remains overstated in household surveys: 53 percent of elderly beneficiaries in the CPS ASEC and 49 percent in the HRS rely on Social Security for the majority of their incomes compared to 42 percent in the linked administrative data. The poverty rate for those aged 65 and over is also overstated: 8.8 percent in the CPS ASEC and 7.4 percent in the HRS compared to 6.4 percent in the linked administrative data. Our results illustrate the effects of using alternative data sources in producing key statistics from the Social Security Administration's Income of the Aged publication.
-
Citizenship Question Effects on Household Survey Response
June 2024
Working Paper Number:
CES-24-31
Several small-sample studies have predicted that a citizenship question in the 2020 Census would cause a large drop in self-response rates. In contrast, minimal effects were found in Poehler et al.'s (2020) analysis of the 2019 Census Test randomized controlled trial (RCT). We reconcile these findings by analyzing associations between characteristics about the addresses in the 2019 Census Test and their response behavior by linking to independently constructed administrative data. We find significant heterogeneity in sensitivity to the citizenship question among households containing Hispanics, naturalized citizens, and noncitizens. Response drops the most for households containing noncitizens ineligible for a Social Security number (SSN). It falls more for households with Latin American-born immigrants than those with immigrants from other countries. Response drops less for households with U.S.-born Hispanics than households with noncitizens from Latin America. Reductions in responsiveness occur not only through lower unit self-response rates, but also by increased household roster omissions and internet break-offs. The inclusion of a citizenship question increases the undercount of households with noncitizens. Households with noncitizens also have much higher citizenship question item nonresponse rates than those only containing citizens. The use of tract-level characteristics and significant heterogeneity among Hispanics, the foreign-born, and noncitizens help explain why the effects found by Poehler et al. were so small. Linking administrative microdata with the RCT data expands what we can learn from the RCT.
-
The Long-Term Effects of Income for At-Risk Infants: Evidence from Supplemental Security Income
March 2024
Working Paper Number:
CES-24-10
This paper examines whether a generous cash intervention early in life can "undo" some of the long-term disadvantage associated with poor health at birth. We use new linkages between several large-scale administrative datasets to examine the short-, medium-, and long-term effects of providing low-income families with low birthweight infants support through the Supplemental Security Income (SSI) program. This program uses a birthweight cutoff at 1200 grams to determine eligibility. We find that families of infants born just below this cutoff experience a large increase in cash benefits totaling about 27% of family income in the first three years of the infant's life. These cash benefits persist at lower amounts through age 10. Eligible infants also experience a small but statistically significant increase in Medicaid enrollment during childhood. We examine whether this support affects health care use and mortality in infancy, educational performance in high school, post-secondary school attendance and college degree attainment, and earnings, public assistance use, and mortality in young adulthood for all infants born in California to low-income families whose birthweight puts them near the cutoff. We also examine whether these payments had spillover effects onto the older siblings of these infants who may have also benefited from the increase in family resources. Despite the comprehensive nature of this early life intervention, we detect no improvements in any of the study outcomes, nor do we find improvements among the older siblings of these infants. These null effects persist across several subgroups and alternative model specifications, and, for some outcomes, our estimates are precise enough to rule out published estimates of the effect of early life cash transfers in other settings.
-
Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report
April 2023
Authors:
J. David Brown,
Danielle H. Sandler,
Lawrence Warren,
Moises Yi,
Misty L. Heggeness,
Joseph L. Schafer,
Matthew Spence,
Marta Murray-Close,
Carl Lieberman,
Genevieve Denoeux,
Lauren Medina
Working Paper Number:
CES-23-21
This report develops a method using administrative records (AR) to fill in responses for nonresponding American Community Survey (ACS) housing units rather than adjusting survey weights to account for selection of a subset of nonresponding housing units for follow-up interviews and for nonresponse bias. The method also inserts AR and modeling in place of edits and imputations for ACS survey citizenship item nonresponses. We produce Citizen Voting-Age Population (CVAP) tabulations using this enhanced CVAP method and compare them to published estimates. The enhanced CVAP method produces a 0.74 percentage point lower citizen share, and it is 3.05 percentage points lower for voting-age Hispanics. The latter result can be partly explained by omissions of voting-age Hispanic noncitizens with unknown legal status from ACS household responses. Weight adjustments may be less effective at addressing nonresponse bias under those conditions.
-
Age, Sex, and Racial/Ethnic Disparities and Temporal-Spatial Variation in Excess All-Cause Mortality During the COVID-19 Pandemic: Evidence from Linked Administrative and Census Bureau Data
May 2022
Working Paper Number:
CES-22-18
Research on the impact of the COVID-19 pandemic in the United States has highlighted substantial racial/ethnic disparities in excess mortality, but reports often differ in the details with respect to the size of these disparities. We suggest that these inconsistencies stem from differences in the temporal scope and measurement of race/ethnicity in existing data. We address these issues using death records for 2010 through 2021 from the Social Security Administration, covering the universe of individuals ever issued a Social Security Number, linked to race/ethnicity responses from the decennial census and American Community Survey. We use these data to (1) estimate excess all-cause mortality at the national level and for age-, sex-, and race/ethnicity-specific subgroups, (2) examine racial/ethnic variation in excess mortality over the course of the pandemic, and (3) explore whether and how racial/ethnic mortality disparities vary across states.
-
Disclosure Limitation and Confidentiality Protection in Linked Data
January 2018
Working Paper Number:
CES-18-07
Confidentiality protection for linked administrative data is a combination of access modalities and statistical disclosure limitation. We review traditional statistical disclosure limitation methods and newer methods based on synthetic data, input noise infusion, and formal privacy. We discuss how these methods are integrated with access modalities by providing three detailed examples. The first example is the linkages in the Health and Retirement Study to Social Security Administration data. The second example is the linkage of the Survey of Income and Program Participation to administrative data from the Internal Revenue Service and the Social Security Administration. The third example is the Longitudinal Employer-Household Dynamics data, which links state unemployment insurance records for workers and firms to a wide variety of censuses and surveys at the U.S. Census Bureau. For each example, we discuss access modalities, disclosure limitation methods, the effectiveness of those methods, and the resulting analytical validity. The final sections discuss recent advances in access modalities for linked administrative data.
-
Labor Market Effects of the Affordable Care Act: Evidence from a Tax Notch
July 2017
Working Paper Number:
carra-2017-07
States that declined to raise their Medicaid income eligibility cutoffs to 138 percent of the federal poverty level (FPL) under the Affordable Care Act (ACA) created a "coverage gap" between their existing, often much lower Medicaid eligibility cutoffs and the FPL, the lowest level of income at which the ACA provides refundable, advanceable "premium tax credits" to subsidize the purchase of private insurance. Lacking access to any form of subsidized health insurance, residents of those states with income in that range face a strong incentive, in the form of a large, discrete increase in post-tax income (i.e., an upward notch) at the FPL, to increase their earnings and obtain the premium tax credit. We investigate the extent to which they respond to that incentive. Using the universe of tax returns, we document excess mass, or bunching, in the income distribution surrounding this notch. Consistent with Saez (2010), we find that bunching occurs only among filers with self-employment income. Specifically, filers without children and married filers with three or fewer children exhibit significant bunching. Analysis of tax data linked to labor supply measures from the American Community Survey, however, suggests that this bunching likely reflects a change in reported income rather than a change in true labor supply. We find no evidence that wage and salary workers adjust their labor supply in response to increased availability of directly purchased health insurance.
-
A Comparison of Training Modules for Administrative Records Use in Nonresponse Followup Operations: The 2010 Census and the American Community Survey
January 2017
Working Paper Number:
CES-17-47
While modeling work in preparation for the 2020 Census has shown that administrative records can be predictive of Nonresponse Followup (NRFU) enumeration outcomes, there is scope to examine the robustness of the models by using more recent training data. The models deployed for workload removal from the 2015 and 2016 Census Tests were based on associations of the 2010 Census with administrative records. Training the same models with more recent data from the American Community Survey (ACS) can identify any changes in parameter associations over time that might reduce the accuracy of model predictions. Furthermore, more recent training data would allow for the incorporation of new administrative record sources not available in 2010. However, differences in ACS methodology and the smaller sample size may limit its applicability. This paper replicates earlier results and examines model predictions based on the ACS in comparison with NRFU outcomes. The evaluation consists of a comparison of predicted counts and household compositions with actual 2015 NRFU outcomes. The main findings are an overall validation of the methodology using independent data.
-
Medicare Coverage and Reporting
December 2016
Working Paper Number:
carra-2016-12
Medicare coverage of the older population in the United States is widely recognized as being nearly universal. Recent statistics from the Current Population Survey Annual Social and Economic Supplement (CPS ASEC) indicate that 93 percent of individuals aged 65 and older were covered by Medicare in 2013. Those without Medicare include those who are not eligible for the public health program, though the CPS ASEC estimate may also be affected by misreporting. Using linked data from the CPS ASEC and Medicare Enrollment Database (i.e., the Medicare administrative data), we estimate the extent to which individuals misreport their Medicare coverage. We focus on those who report having Medicare but are not enrolled (false positives) and those who do not report having Medicare but are enrolled (false negatives). We use regression analyses to evaluate factors associated with both types of misreporting, including socioeconomic, demographic, and household characteristics. We then provide estimates of the implied Medicare-covered, insured, and uninsured older population, taking into account misreporting in the CPS ASEC. We find an undercount in the CPS ASEC estimates of the Medicare-covered population of 4.5 percent. This misreporting is not random: characteristics associated with misreporting include citizenship status, year of entry, labor force participation, Medicare coverage of others in the household, disability status, and imputation of Medicare responses. When we adjust the CPS ASEC estimates to account for misreporting, Medicare coverage of the population aged 65 and older increases from 93.4 percent to 95.6 percent while the uninsured rate decreases from 1.4 percent to 1.3 percent.