CREAT - Census Bureau

The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey

February 2025

Written by: Jonathan Eggleston, Mark Klee, Linden McBride

Working Paper Number:

CES-25-13

Abstract

The National Household Food Acquisition and Purchase Survey (FoodAPS), sponsored by the United States Department of Agriculture's (USDA) Economic Research Service (ERS) and Food and Nutrition Service (FNS), examines the food purchasing behavior of various subgroups of the U.S. population. These subgroups include participants in the Supplemental Nutrition Assistance Program (SNAP) and the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), as well as households who are eligible for but don't participate in these programs. Participants in these social protection programs constitute small proportions of the U.S. population; obtaining an adequate number of such participants in a survey would be challenging absent stratified sampling to target SNAP and WIC participating households. This document describes how the U.S. Census Bureau (which is planning to conduct future versions of the FoodAPS survey on behalf of USDA) created sampling strata to flag the FoodAPS targeted subpopulations using machine learning applications in linked survey and administrative data. We describe the data, modeling techniques, and how well the sampling flags target low-income households and households receiving WIC and SNAP benefits. We additionally situate these efforts in the nascent literature on the use of big data and machine learning for the improvement of survey efficiency.

Document Tags and Keywords

Keywords:

data census, census data, survey, respondent, agriculture, subsidy, population, poverty, ssa, sampling, sample, household surveys, survey households, medicaid, census survey, datasets, eligibility, eligible, prevalence, assessed, enrolled, population survey

Tags:

Internal Revenue Service, Social Security Administration, Center for Economic Studies, Current Population Survey, Department of Agriculture, Housing and Urban Development, Survey of Income and Program Participation, Social Security, Department of Housing and Urban Development, American Community Survey, Protected Identification Key, Medicaid Services, Centers for Medicare, Master Address File, Census Bureau Disclosure Review Board, 2010 Census, ASEC, Person Validation System, Federal Poverty Level, Supplemental Nutrition Assistance Program, MAFID, Census Numident, Census Bureau Person Identification Validation System

Similar Working Papers

The 10 most similar working papers to the working paper 'The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey' are listed below in order of similarity.

Working Paper

Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report

April 2023

Authors: J. David Brown, Danielle H. Sandler, Lawrence Warren, Moises Yi, Misty L. Heggeness, Joseph L. Schafer, Matthew Spence, Marta Murray-Close, Carl Lieberman, Genevieve Denoeux, Lauren Medina

Working Paper Number:

CES-23-21

This report develops a method using administrative records (AR) to fill in responses for nonresponding American Community Survey (ACS) housing units rather than adjusting survey weights to account for selection of a subset of nonresponding housing units for follow-up interviews and for nonresponse bias. The method also inserts AR and modeling in place of edits and imputations for ACS survey citizenship item nonresponses. We produce Citizen Voting-Age Population (CVAP) tabulations using this enhanced CVAP method and compare them to published estimates. The enhanced CVAP method produces a 0.74 percentage point lower citizen share, and it is 3.05 percentage points lower for voting-age Hispanics. The latter result can be partly explained by omissions of voting-age Hispanic noncitizens with unknown legal status from ACS household responses. Weight adjustments may be less effective at addressing nonresponse bias under those conditions.
View Full Paper PDF
Working Paper

Incorporating Administrative Data in Survey Weights for the 2018-2022 Survey of Income and Program Participation

October 2024

Authors: Jonathan Eggleston, Julia Yang

Working Paper Number:

CES-24-58

Response rates to the Survey of Income and Program Participation (SIPP) have declined over time, raising the potential for nonresponse bias in survey estimates. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we modify various parts of the SIPP weighting algorithm to incorporate such data. We create these new weights for the 2018 through 2022 SIPP panels and examine how the new weights affect survey estimates. Our results show that before weighting adjustments, SIPP respondents in these panels have higher socioeconomic status than the general population. Existing weighting procedures reduce many of these differences. Comparing SIPP estimates between the production weights and the administrative data-based weights yields changes that are not uniform across the joint income and program participation distribution. Unlike other Census Bureau household surveys, there is no large increase in nonresponse bias in SIPP due to the COVID-19 Pandemic. In summary, the magnitude and sign of nonresponse bias in SIPP is complicated, and the existing weighting procedures may change the sign of nonresponse bias for households with certain incomes and program benefit statuses.
View Full Paper PDF
Working Paper

Potential Bias When Using Administrative Data to Measure the Family Income of School-Aged Children

January 2025

Authors: Renuka Bhaskar, Leah R. Clark

Working Paper Number:

CES-25-03

Researchers and practitioners increasingly rely on administrative data sources to measure family income. However, administrative data sources are often incomplete in their coverage of the population, giving rise to potential bias in family income measures, particularly if coverage deficiencies are not well understood. We focus on the school-aged child population, due to its particular import to research and policy, and because of the unique challenges of linking children to family income information. We find that two of the most significant administrative sources of family income information that permit linking of children and parents'IRS Form 1040 and SNAP participation records'usefully complement each other, potentially reducing coverage bias when used together. In a case study considering how best to measure economic disadvantage rates in the public school student population, we demonstrate the sensitivity of family income statistics to assumptions about individuals who do not appear in administrative data sources.
View Full Paper PDF
Working Paper

Determination of the 2020 U.S. Citizen Voting Age Population (CVAP) Using Administrative Records and Statistical Methodology Technical Report

October 2020

Authors: John M. Abowd, J. David Brown, Lawrence Warren, Moises Yi, Misty L. Heggeness, William R. Bell, Michael B. Hawes, Andrew Keller, Vincent T. Mule Jr., Joseph L. Schafer, Matthew Spence

Working Paper Number:

CES-20-33

This report documents the efforts of the Census Bureau's Citizen Voting-Age Population (CVAP) Internal Expert Panel (IEP) and Technical Working Group (TWG) toward the use of multiple data sources to produce block-level statistics on the citizen voting-age population for use in enforcing the Voting Rights Act. It describes the administrative, survey, and census data sources used, and the four approaches developed for combining these data to produce CVAP estimates. It also discusses other aspects of the estimation process, including how records were linked across the multiple data sources, and the measures taken to protect the confidentiality of the data.
View Full Paper PDF
Working Paper

Gradient Boosting to Address Statistical Problems Arising from Non-Linkage of Census Bureau Datasets

June 2024

Authors: Narayan Sastry, Todd Gardner, Matthew Cefalu, John Sullivan, Elizabeth Fussell

Working Paper Number:

CES-24-27

This article introduces the twangRDC package, which contains functions to address non-linkage in US Census Bureau datasets. The Census Bureau's Person Identification Validation System facilitates data linkage by assigning unique person identifiers to federal, third party, decennial census, and survey data. Not all records in these datasets can be linked to the reference file and as such not all records will be assigned an identifier. This article is a tutorial for using the twangRDC to generate nonresponse weights to account for non-linkage of person records across US Census Bureau datasets.
View Full Paper PDF
Working Paper

Where Are Your Parents? Exploring Potential Bias in Administrative Records on Children

March 2024

Authors: Jennifer Bernard, Kelsey Drotning, Katie R. Genadek

Working Paper Number:

CES-24-18

This paper examines potential bias in the Census Household Composition Key's (CHCK) probabilistic parent-child linkages. By linking CHCK data to the American Community Survey (ACS), we reveal disparities in parent-child linkages among specific demographic groups and find that characteristics of children that can and cannot be linked to the CHCK vary considerably from the larger population. In particular, we find that children from low-income, less educated households and of Hispanic origin are less likely to be linked to a mother or a father in the CHCK. We also highlight some data considerations when using the CHCK.
View Full Paper PDF
Working Paper

Producing U.S. Population Statistics Using Multiple Administrative Sources

November 2023

Authors: J. David Brown, Marta Murray-Close

Working Paper Number:

CES-23-58

We identify several challenges encountered when constructing U.S. administrative record-based (AR-based) population estimates for 2020. Though the AR estimates are higher than the 2020 Census at the national level, they are over 15 percent lower in 5 percent of counties, suggesting that locational accuracy can be improved. Other challenges include how to achieve comprehensive coverage, maintain consistent coverage across time, filter out nonresidents and people not alive on the reference date, uncover missing links across person and address records, and predict demographic characteristics when multiple ones are reported or when they are missing. We discuss several ways of addressing these issues, e.g., building in redundancy with more sources, linking children to their parents' addresses, and conducting additional record linkage for people without Social Security Numbers and for addresses not initially linked to the Census Bureau's Master Address File. We discuss modeling to predict lower levels of geography for people lacking those geocodes, the probability that a person is a U.S. resident on the reference date, the probability that an address is the person's residence on the reference date, and the probability a person is in each demographic characteristic category. Regression results illustrate how many of these challenges and solutions affect the AR county population estimates.
View Full Paper PDF
Working Paper

Capturing More Than Poverty: School Free and Reduced-Price Lunch Data and Household Income

December 2017

Authors: Quentin Brummet, Sonya R. Porter, Thurston Domina, Nikolas Pharris-Ciurej, Andrew Penner, Emily Penner, Tanya Sanabria

Working Paper Number:

carra-2017-09

Educational researchers often use National School Lunch Program (NSLP) data as a proxy for student poverty. Under NSLP policy, students whose household income is less than 130 percent of the poverty line qualify for free lunch and students whose household income is between 130 percent and 185 percent of the poverty line qualify for reduced-price lunch. Linking school administrative records for all 8th graders in a California public school district to household-level IRS income tax data, we examine how well NSLP data capture student disadvantage. We find both that there is substantial disadvantage in household income not captured by NSLP category data, and that NSLP categories capture disadvantage on test scores above and beyond household income.
View Full Paper PDF
Working Paper

School Discipline and Racial Disparities in Early Adulthood

June 2021

Authors: Sonya R. Porter, Nikolas Pharris-Ciurej, Andrew Penner, Emily Penner, Miles Davison, Evan Rose, Yotam Shem-Tov, Paul Yoo

Working Paper Number:

CES-21-14

Despite interest in the role of school discipline in the creation of racial inequality, previous research has been unable to identify how students who receive suspensions in school differ from unsuspended classmates on key young adult outcomes. We utilize novel data to document the links between high school discipline and important young adult outcomes related to criminal justice contact, social safety net program participation, post-secondary education, and the labor market. We show that the link between school discipline and young adult outcomes tends to be stronger for Black students than for White students, and that inequality in exposure to school discipline accounts for approximately 30 percent of the Black-White disparities in young adult criminal justice outcomes and SNAP receipt.
View Full Paper PDF
Working Paper

Incorporating Administrative Data in Survey Weights for the Basic Monthly Current Population Survey

January 2024

Authors: John Voorheis, Jonathan Eggleston, Carl Lieberman, Yarissa Gonzalez, Tim Trudell

Working Paper Number:

CES-24-02

Response rates to the Current Population Survey (CPS) have declined over time, raising the potential for nonresponse bias in key population statistics. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we take two approaches. First, we use administrative data to build a non-parametric nonresponse adjustment step while leaving the calibration to population estimates unchanged. Second, we use administratively linked data in the calibration process, matching income data from the Internal Return Service and state agencies, demographic data from the Social Security Administration and the decennial census, and industry data from the Census Bureau's Business Register to both responding and nonresponding households. We use the matched data in the household nonresponse adjustment of the CPS weighting algorithm, which changes the weights of respondents to account for differential nonresponse rates among subpopulations. After running the experimental weighting algorithm, we compare estimates of the unemployment rate and labor force participation rate between the experimental weights and the production weights. Before March 2020, estimates of the labor force participation rates using the experimental weights are 0.2 percentage points higher than the original estimates, with minimal effect on unemployment rate. After March 2020, the new labor force participation rates are similar, but the unemployment rate is about 0.2 percentage points higher in some months during the height of COVID-related interviewing restrictions. These results are suggestive that if there is any nonresponse bias present in the CPS, the magnitude is comparable to the typical margin of error of the unemployment rate estimate. Additionally, the results are overall similar across demographic groups and states, as well as using alternative weighting methodology. Finally, we discuss how our estimates compare to those from earlier papers that calculate estimates of bias in key CPS labor force statistics. This paper is for research purposes only. No changes to production are being implemented at this time.
View Full Paper PDF

The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey

February 2025

Working Paper Number:

CES-25-13

Abstract

Document Tags and Keywords

The 10 most similar working papers to the working paper 'The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey' are listed below in order of similarity.

April 2023

Working Paper Number:

CES-23-21

October 2024

Working Paper Number:

CES-24-58

January 2025

Working Paper Number:

CES-25-03

October 2020

Working Paper Number:

CES-20-33

June 2024

Working Paper Number:

CES-24-27

March 2024

Working Paper Number:

CES-24-18

November 2023

Working Paper Number:

CES-23-58

December 2017

Working Paper Number:

carra-2017-09

June 2021

Working Paper Number:

CES-21-14

January 2024

Working Paper Number:

CES-24-02