CREAT - Census Bureau

Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report

April 2023

Written by: J. David Brown, Danielle H. Sandler, Lawrence Warren, Moises Yi, Misty L. Heggeness, Joseph L. Schafer, Matthew Spence, Marta Murray-Close, Carl Lieberman, Genevieve Denoeux, Lauren Medina

Working Paper Number:

CES-23-21

Abstract

This report develops a method using administrative records (AR) to fill in responses for nonresponding American Community Survey (ACS) housing units rather than adjusting survey weights to account for selection of a subset of nonresponding housing units for follow-up interviews and for nonresponse bias. The method also inserts AR and modeling in place of edits and imputations for ACS survey citizenship item nonresponses. We produce Citizen Voting-Age Population (CVAP) tabulations using this enhanced CVAP method and compare them to published estimates. The enhanced CVAP method produces a 0.74 percentage point lower citizen share, and it is 3.05 percentage points lower for voting-age Hispanics. The latter result can be partly explained by omissions of voting-age Hispanic noncitizens with unknown legal status from ACS household responses. Weight adjustments may be less effective at addressing nonresponse bias under those conditions.

Document Tags and Keywords

Keywords:

census data, survey, respondent, ethnicity, hispanic, mexican, immigrant, imputation, immigration, citizen, use census, resident, residence, reside, census household, eligibility, eligible, citizenship, census responses, 1040

Tags:

Internal Revenue Service, Social Security Administration, Ordinary Least Squares, Administrative Records, Office of Management and Budget, Current Population Survey, Housing and Urban Development, Department of Justice, Survey of Income and Program Participation, Social Security, Postal Service, Department of Homeland Security, Department of Housing and Urban Development, American Community Survey, Social Security Number, Cornell Institute for Social and Economic Research, Master Beneficiary Record, Disability Insurance, Protected Identification Key, American Housing Survey, Computer Assisted Personal Interview, Medicaid Services, Centers for Medicare, Temporary Assistance for Needy Families, NUMIDENT, Master Address File, Census Bureau Master Address File, Census Bureau Disclosure Review Board, 2010 Census, Disclosure Review Board, Indian Health Service, Person Validation System, Indian Housing Information Center, Supplemental Nutrition Assistance Program, Person Identification Validation System, Individual Taxpayer Identification Numbers, MAFID, Census Edited File, Social Science Research Institute, Personally Identifiable Information, Some Other Race, Data Management System, Census Household Composition Key, Citizenship and Immigration Services

Similar Working Papers

The 10 most similar working papers to the working paper 'Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report' are listed below in order of similarity.

Working Paper
🔥

Determination of the 2020 U.S. Citizen Voting Age Population (CVAP) Using Administrative Records and Statistical Methodology Technical Report

October 2020

Authors: John M. Abowd, J. David Brown, Lawrence Warren, Moises Yi, Misty L. Heggeness, William R. Bell, Michael B. Hawes, Andrew Keller, Vincent T. Mule Jr., Joseph L. Schafer, Matthew Spence

Working Paper Number:

CES-20-33

This report documents the efforts of the Census Bureau's Citizen Voting-Age Population (CVAP) Internal Expert Panel (IEP) and Technical Working Group (TWG) toward the use of multiple data sources to produce block-level statistics on the citizen voting-age population for use in enforcing the Voting Rights Act. It describes the administrative, survey, and census data sources used, and the four approaches developed for combining these data to produce CVAP estimates. It also discusses other aspects of the estimation process, including how records were linked across the multiple data sources, and the measures taken to protect the confidentiality of the data.
View Full Paper PDF
Working Paper
🔥

Understanding the Quality of Alternative Citizenship Data Sources for the 2020 Census

August 2018

Authors: J. David Brown, Lawrence Warren, Moises Yi, Misty L. Heggeness, Suzanne M. Dorinski

Working Paper Number:

CES-18-38R

This paper examines the quality of citizenship data in self-reported survey responses compared to administrative records and evaluates options for constructing an accurate count of resident U.S. citizens. Person-level discrepancies between survey-collected citizenship data and administrative records are more pervasive than previously reported in studies comparing survey and administrative data aggregates. Our results imply that survey-sourced citizenship data produce significantly lower estimates of the noncitizen share of the population than would be produced from currently available administrative records; both the survey-sourced and administrative data have shortcomings that could contribute to this difference. Our evidence is consistent with noncitizen respondents misreporting their own citizenship status and failing to report that of other household members. At the same time, currently available administrative records may miss some naturalizations and capture others with a delay. The evidence in this paper also suggests that adding a citizenship question to the 2020 Census would lead to lower self-response rates in households potentially containing noncitizens, resulting in higher fieldwork costs and a lower-quality population count.
View Full Paper PDF
Working Paper
🔥

Producing U.S. Population Statistics Using Multiple Administrative Sources

November 2023

Authors: J. David Brown, Marta Murray-Close

Working Paper Number:

CES-23-58

We identify several challenges encountered when constructing U.S. administrative record-based (AR-based) population estimates for 2020. Though the AR estimates are higher than the 2020 Census at the national level, they are over 15 percent lower in 5 percent of counties, suggesting that locational accuracy can be improved. Other challenges include how to achieve comprehensive coverage, maintain consistent coverage across time, filter out nonresidents and people not alive on the reference date, uncover missing links across person and address records, and predict demographic characteristics when multiple ones are reported or when they are missing. We discuss several ways of addressing these issues, e.g., building in redundancy with more sources, linking children to their parents' addresses, and conducting additional record linkage for people without Social Security Numbers and for addresses not initially linked to the Census Bureau's Master Address File. We discuss modeling to predict lower levels of geography for people lacking those geocodes, the probability that a person is a U.S. resident on the reference date, the probability that an address is the person's residence on the reference date, and the probability a person is in each demographic characteristic category. Regression results illustrate how many of these challenges and solutions affect the AR county population estimates.
View Full Paper PDF
Working Paper
🔥

Predicting the Effect of Adding a Citizenship Question to the 2020 Census

June 2019

Authors: J. David Brown, Lawrence Warren, Moises Yi, Misty L. Heggeness, Suzanne M. Dorinski

Working Paper Number:

CES-19-18

The addition of a citizenship question to the 2020 census could affect the self-response rate, a key driver of the cost and quality of a census. We find that citizenship question response patterns in the American Community Survey (ACS) suggest that it is a sensitive question when asked about administrative record noncitizens but not when asked about administrative record citizens. ACS respondents who were administrative record noncitizens in 2017 frequently choose to skip the question or answer that the person is a citizen. We predict the effect on self-response to the entire survey by comparing mail response rates in the 2010 ACS, which included a citizenship question, with those of the 2010 census, which did not have a citizenship question, among households in both surveys. We compare the actual ACS-census difference in response rates for households that may contain noncitizens (more sensitive to the question) with the difference for households containing only U.S. citizens. We estimate that the addition of a citizenship question will have an 8.0 percentage point larger effect on self-response rates in households that may have noncitizens relative to those with only U.S. citizens. Assuming that the citizenship question does not affect unit self-response in all-citizen households and applying the 8.0 percentage point drop to the 28.1 % of housing units potentially having at least one noncitizen would predict an overall 2.2 percentage point drop in self-response in the 2020 census, increasing costs and reducing the quality of the population count.
View Full Paper PDF
Working Paper
🔥

Noncitizen Coverage and Its Effects on U.S. Population Statistics

August 2023

Authors: J. David Brown, Misty L. Heggeness, Marta Murray-Close

Working Paper Number:

CES-23-42

We produce population estimates with the same reference date, April 1, 2020, as the 2020 Census of Population and Housing by combining 31 types of administrative record (AR) and third-party sources, including several new to the Census Bureau with a focus on noncitizens. Our AR census national population estimate is higher than other Census Bureau official estimates: 1.8% greater than the 2020 Demographic Analysis high estimate, 3.0% more than the 2020 Census count, and 3.6% higher than the vintage-2020 Population Estimates Program estimate. Our analysis suggests that inclusion of more noncitizens, especially those with unknown legal status, explains the higher AR census estimate. About 19.8% of AR census noncitizens have addresses that cannot be linked to an address in the 2020 Census collection universe, compared to 5.7% of citizens, raising the possibility that the 2020 Census did not collect data for a significant fraction of noncitizens residing in the United States under the residency criteria used for the census. We show differences in estimates by age, sex, Hispanic origin, geography, and socioeconomic characteristics symptomatic of the differences in noncitizen coverage.
View Full Paper PDF
Working Paper
🔥

The Nature of the Bias When Studying Only Linkable Person Records: Evidence from the American Community Survey

April 2014

Authors: Adela Luque, J. David Brown, Brittany Bond, Amy B. O'Hara, Amy OHara

Working Paper Number:

carra-2014-08

Record linkage across survey and administrative records sources can greatly enrich data and improve their quality. The linkage can reduce respondent burden and nonresponse follow-up costs. This is particularly important in an era of declining survey response rates and tight budgets. Record linkage also creates statistical bias, however. The U.S. Census Bureau links person records through its Person Identification Validation System (PVS), assigning each record a Protected Identification Key (PIK). It is not possible to reliably assign a PIK to every record, either due to insufficient identifying information or because the information does not uniquely match any of the administrative records used in the person validation process. Non-random ability to assign a PIK can potentially inject bias into statistics using linked data. This paper studies the nature of this bias using the 2009 and 2010 American Community Survey (ACS). The ACS is well-suited for this analysis, as it contains a rich set of person characteristics that can describe the bias. We estimate probit models for whether a record is assigned a PIK. The results suggest that young children, minorities, residents of group quarters, immigrants, recent movers, low-income individuals, and non-employed individuals are less likely to receive a PIK using 2009 ACS. Changes to the PVS process in 2010 significantly addressed the young children deficit, attenuated the other biases, and increased the validated records share from 88.1 to 92.6 percent (person-weighted).
View Full Paper PDF
Working Paper

Coverage and Agreement of Administrative Records and 2010 American Community Survey Demographic Data

November 2014

Authors: Adela Luque, Renuka Bhaskar, Sonya Rastogi, James M. Noon

Working Paper Number:

carra-2014-14

The U.S. Census Bureau is researching possible uses of administrative records in decennial census and survey operations. The 2010 Census Match Study and American Community Survey (ACS) Match Study represent recent efforts by the Census Bureau to evaluate the extent to which administrative records provide data on persons and addresses in the 2010 Census and 2010 ACS. The 2010 Census Match Study also examines demographic response data collected in administrative records. Building on this analysis, we match data from the 2010 ACS to federal administrative records and third party data as well as to previous census data and examine administrative records coverage and agreement of ACS age, sex, race, and Hispanic origin responses. We find high levels of coverage and agreement for sex and age responses and variable coverage and agreement across race and Hispanic origin groups. These results are similar to findings from the 2010 Census Match Study.
View Full Paper PDF
Working Paper

Full Report of the Comparisons of Administrative Record Rosters to Census Self-Responses and NRFU Household Member Responses

March 2023

Authors: Cristina Tello-Trillo, Andrew Keller, Mary H. Mulry, Thomas Mule

Working Paper Number:

CES-23-08

One of the U.S. Census Bureau's innovations in the 2020 U.S. Census was the use of administrative records (AR) to create household rosters for enumerating some addresses when a self response was not available but high-quality ARs were. The goal was to reduce the cost of fieldwork during the Nonresponse Followup operation (NRFU). The original plan had NRFU beginning in mid-May and continuing through late July 2020. However, the COVID-19 pandemic forced the delay of NRFU and caused the Internal Revenue Service to postpone the income tax filing deadline, resulting in an interruption in the delivery of ARs to the U.S. Census Bureau. The delays were not anticipated when U.S. Census Bureau staff conducted the research on AR enumeration with the 2010 Census data in preparation for the 2020 Census or during the fine tuning of plans for using ARs during the 2018 End-to-End Census Test. These circumstances raised questions about whether the quality of the AR household rosters was high enough for use in enumeration. To aid in investigating the concern about the quality of the AR rosters, our analyses compared AR rosters to self-response rosters and NRFU household member responses at addresses where both ARs and a self-response were available.
View Full Paper PDF
Working Paper

The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey

February 2025

Authors: Jonathan Eggleston, Mark Klee, Linden McBride

Working Paper Number:

CES-25-13

The National Household Food Acquisition and Purchase Survey (FoodAPS), sponsored by the United States Department of Agriculture's (USDA) Economic Research Service (ERS) and Food and Nutrition Service (FNS), examines the food purchasing behavior of various subgroups of the U.S. population. These subgroups include participants in the Supplemental Nutrition Assistance Program (SNAP) and the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), as well as households who are eligible for but don't participate in these programs. Participants in these social protection programs constitute small proportions of the U.S. population; obtaining an adequate number of such participants in a survey would be challenging absent stratified sampling to target SNAP and WIC participating households. This document describes how the U.S. Census Bureau (which is planning to conduct future versions of the FoodAPS survey on behalf of USDA) created sampling strata to flag the FoodAPS targeted subpopulations using machine learning applications in linked survey and administrative data. We describe the data, modeling techniques, and how well the sampling flags target low-income households and households receiving WIC and SNAP benefits. We additionally situate these efforts in the nascent literature on the use of big data and machine learning for the improvement of survey efficiency.
View Full Paper PDF
Working Paper

Incorporating Administrative Data in Survey Weights for the 2018-2022 Survey of Income and Program Participation

October 2024

Authors: Jonathan Eggleston, Julia Yang

Working Paper Number:

CES-24-58

Response rates to the Survey of Income and Program Participation (SIPP) have declined over time, raising the potential for nonresponse bias in survey estimates. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we modify various parts of the SIPP weighting algorithm to incorporate such data. We create these new weights for the 2018 through 2022 SIPP panels and examine how the new weights affect survey estimates. Our results show that before weighting adjustments, SIPP respondents in these panels have higher socioeconomic status than the general population. Existing weighting procedures reduce many of these differences. Comparing SIPP estimates between the production weights and the administrative data-based weights yields changes that are not uniform across the joint income and program participation distribution. Unlike other Census Bureau household surveys, there is no large increase in nonresponse bias in SIPP due to the COVID-19 Pandemic. In summary, the magnitude and sign of nonresponse bias in SIPP is complicated, and the existing weighting procedures may change the sign of nonresponse bias for households with certain incomes and program benefit statuses.
View Full Paper PDF

Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report

April 2023

Working Paper Number:

CES-23-21

Abstract

Document Tags and Keywords

The 10 most similar working papers to the working paper 'Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report' are listed below in order of similarity.

October 2020

Working Paper Number:

CES-20-33

August 2018

Working Paper Number:

CES-18-38R

November 2023

Working Paper Number:

CES-23-58

June 2019

Working Paper Number:

CES-19-18

August 2023

Working Paper Number:

CES-23-42

April 2014

Working Paper Number:

carra-2014-08

November 2014

Working Paper Number:

carra-2014-14

March 2023

Working Paper Number:

CES-23-08

February 2025

Working Paper Number:

CES-25-13

October 2024

Working Paper Number:

CES-24-58