-
Estimating the Graduate Coverage of Post-Secondary Employment Outcomes
September 2025
Working Paper Number:
CES-25-61
This paper proposes a new methodology for estimating the coverage rate of the Post-Secondary Employment Outcomes data product (PSEO), both as a share of new graduates and as a share of total working-age degree holders in the United States. This paper also assesses how representative PSEO is of the broader population of college graduates across an array of institutional and individual characteristics.
-
A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census
August 2025
Authors:
Lars Vilhuber,
John M. Abowd,
Ethan Lewis,
Nathan Goldschlag,
Michael B. Hawes,
Robert Ashmead,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Tamara Adams,
David Darais,
Sourya Dey,
Simson L. Garfinkel,
Scott Moore,
Ramy N. Tadros
Working Paper Number:
CES-25-57
For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level, individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy.
Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act.
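The idea that published tables can pin down the underlying records may be easier to see on a toy block. The sketch below is not the paper's method: it brute-forces a hypothetical three-person block with made-up categories (`SEXES`, `AGES`, `RACES`) and three invented two-way tables, and simply lists every set of records that would reproduce those tables. When only one set survives, the tables alone reveal the microdata.

```python
from collections import Counter
from itertools import combinations_with_replacement, product

# Hypothetical category sets, chosen only for illustration.
SEXES = ("F", "M")
AGES = ("0-17", "18-64", "65+")
RACES = ("A", "B")


def tabulate(records):
    """Build three two-way tables for one block from (sex, age, race) records."""
    return {
        "sex_by_age": Counter((s, a) for s, a, r in records),
        "sex_by_race": Counter((s, r) for s, a, r in records),
        "age_by_race": Counter((a, r) for s, a, r in records),
    }


def reconstruct(tables):
    """Return every multiset of records that reproduces the published tables."""
    n = sum(tables["sex_by_age"].values())  # block population
    universe = list(product(SEXES, AGES, RACES))
    return [
        candidate
        for candidate in combinations_with_replacement(universe, n)
        if tabulate(candidate) == tables
    ]


# Confidential records for a made-up block of three people.
confidential = [("F", "18-64", "A"), ("M", "65+", "B"), ("M", "0-17", "A")]
published = tabulate(confidential)      # what the agency would release
solutions = reconstruct(published)      # what an attacker could enumerate
exactly_recovered = (
    len(solutions) == 1 and sorted(solutions[0]) == sorted(confidential)
)
```

In this toy block each age group holds one person, so the age-by-race table fixes every person's race and the enumeration leaves a single candidate. Real blocks and table sets are far larger, and exhaustive enumeration would not scale to them; the sketch only illustrates why aggregation by itself does not guarantee protection.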
-
Education and Mortality: Evidence for the Silent Generation from Linked Census and Administrative Data
August 2025
Working Paper Number:
CES-25-56
We quantify the effect of education on mortality using a linkage of the full count 1940, 2000, and 2010 US census files and the Numident death records file. Our sample is composed of children aged 0-18 in 1940, observed living with at least one parent, for whom we can construct a rich set of parental and neighborhood characteristics. We estimate effects of educational attainment in 1940 on survival to 2000, as well as the effects of completed education, observed in 2000, on 10-year survival to 2010. The educational gradients in longevity that we estimate are robust to the inclusion of detailed individual, parental, household, neighborhood and county covariates. Given our full population census sample, we also explore rich patterns of heterogeneity and examine the effect of mediators of the education-mortality relationship. The mediators we consider in this study explain more than half of the relationship between education and mortality. We further show that the mechanisms underlying the education-mortality gradient might be different at different margins of educational attainment.
-
Locating Hispanic Americans, 1900-2020
July 2025
Working Paper Number:
CES-25-50
This study examines Hispanic Americans' residential settlement patterns nationwide in the last 120 years. Drawing on newly available neighborhood data for the whole country as early as 1900, it documents the direction and timing of changes in two aspects of their location. First, it charts Hispanics' transition from a predominantly rural population to majority metropolitan by 1930 and also their growing presence in all regions of the U.S. while still maintaining a predominance in the West and Texas. Second, it provides the first evidence of the long-term trajectory of their segregation from whites in the metropolitan areas where they were settling. As shown by studies of more recent decades, Hispanics were never as segregated as African Americans. Nonetheless, similar to African Americans, their segregation from whites increased to high levels through the middle of the century, followed by slow decline. For both groups metropolitan segregation was driven mainly by segregation among central city neighborhoods prior to the 1940s. But new forms of segregation (a growing city/suburb divide and increasing segregation among suburban places) have become the largest contributors to segregation today.
-
Earnings Measurement Error, Nonresponse and Administrative Mismatch in the CPS
July 2025
Working Paper Number:
CES-25-48
Using the Current Population Survey Annual Social and Economic Supplement matched to Social Security Administration Detailed Earnings Records, we link observations across consecutive years to investigate a relationship between item nonresponse and measurement error in the earnings questions. Linking individuals across consecutive years allows us to observe switching from response to nonresponse and vice versa. We estimate OLS, IV, and finite mixture models that allow for various assumptions separately for men and women. We find that those who respond in both years of the survey exhibit less measurement error than those who respond in one year. Our findings suggest a trade-off between survey response and data quality that should be considered by survey designers, data collectors, and data users.
-
The Rural/Urban Volunteering Divide
June 2025
Working Paper Number:
CES-25-42
Are rural residents more likely to volunteer than those living in urban places? Although early sociological theory posited that rural residents were more likely to experience social bonds connecting them to their community, increasing their odds of volunteer engagement, empirical support is limited. Drawing upon the full population of rural and urban respondents to the United States Census Bureau's Current Population Survey (CPS) Volunteering Supplement (2002-2015), we found that rural respondents are more likely to report volunteering compared to urban respondents, although these differences are decreasing over time. Moreover, we found that propensities for rural and urban volunteerism vary based on differences in both individual and place-based characteristics; further, the size of these effects differs across rural and urban places. These findings have important implications for theory and empirical analysis.
-
The Decline of Volunteering in the United States: Is it the Economy?
June 2025
Working Paper Number:
CES-25-41
This article investigates the complex interactions between local and national economic contexts and volunteering behavior. We examine three dimensions of local economic context: economic disadvantage (e.g., the percentage of families living in poverty), income inequality, and economic growth (e.g., the change in median household income), as well as the impact of a national/global economic jolt, the Great Recession. Analysis of data from the Current Population Survey's (CPS) Volunteering Supplement (2002-2015) reveals that individuals who live in places characterized by economic disadvantage and economic inequality are less likely to volunteer than individuals in more advantaged, equitable communities. The recession had a dampening effect on volunteering overall, but it had the largest dampening effect on individual volunteering in communities with above average rates of income equality and higher rates of economic growth. While individuals living in rural communities were more likely to volunteer than their urban counterparts before the recession, rural/urban differences disappear after the recession.
-
Geographic Immobility in the United States: Assessing the Prevalence and Characteristics of Those Who Never Migrate Across State Lines Using Linked Federal Tax Microdata
March 2025
Working Paper Number:
CES-25-19
This paper explores the prevalence and characteristics of those who never migrate at the state scale in the U.S. Studying people who never migrate requires regular and frequent observation of their residential location for a lifetime, or at least for many years. A novel U.S. population-sized longitudinal dataset that links individual level Internal Revenue Service (IRS) and Social Security Administration (SSA) administrative records supplies this information annually, along with information on income and socio-demographic characteristics. We use these administrative microdata to follow a cohort aged between 15 and 50 in 2001 from 2001 to 2016, differentiating those who lived in the same state every year during this period (i.e., never made an interstate move) from those who lived in more than one state (i.e., made at least one interstate move). We find those who never made an interstate move comprised 75 percent of the total population of this age cohort. This percentage varies by year of age but never falls below 62 percent even for those who were teenagers or young adults in 2001. There are also variations in these percentages by sex, race, nativity, and income, with the latter having the largest effects. We also find substantial variation in these percentages across states. Our findings suggest a need for more research on geographically immobile populations in the U.S.
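The never-mover definition above (same state in every observed year) can be sketched in a few lines. This is a minimal illustration on invented data, not the authors' code: the `panel` structure, the person identifiers, and the assumption that every person is observed in every year are all hypothetical, and the real administrative files would need rules for missing years.

```python
def never_moved(states_by_year):
    """True if the person is recorded in one and the same state every year."""
    return len(set(states_by_year.values())) == 1


def share_never_moved(panel):
    """Fraction of people in the panel who never made an interstate move."""
    flags = [never_moved(history) for history in panel.values()]
    return sum(flags) / len(flags)


# Hypothetical person-level histories: year -> state of residence.
panel = {
    "p1": {2001: "OH", 2002: "OH", 2003: "OH"},
    "p2": {2001: "TX", 2002: "CA", 2003: "CA"},
    "p3": {2001: "NY", 2002: "NY", 2003: "NY"},
    "p4": {2001: "FL", 2002: "GA", 2003: "FL"},  # left and returned: still a mover
}

share = share_never_moved(panel)
```

Note that a person who leaves a state and later returns (like `p4`) counts as a mover under this definition, because the test is on every annual observation rather than on the first and last year only.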
-
The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey
February 2025
Working Paper Number:
CES-25-13
The National Household Food Acquisition and Purchase Survey (FoodAPS), sponsored by the United States Department of Agriculture's (USDA) Economic Research Service (ERS) and Food and Nutrition Service (FNS), examines the food purchasing behavior of various subgroups of the U.S. population. These subgroups include participants in the Supplemental Nutrition Assistance Program (SNAP) and the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), as well as households who are eligible for but don't participate in these programs. Participants in these social protection programs constitute small proportions of the U.S. population; obtaining an adequate number of such participants in a survey would be challenging absent stratified sampling to target SNAP and WIC participating households. This document describes how the U.S. Census Bureau (which is planning to conduct future versions of the FoodAPS survey on behalf of USDA) created sampling strata to flag the FoodAPS targeted subpopulations using machine learning applications in linked survey and administrative data. We describe the data, modeling techniques, and how well the sampling flags target low-income households and households receiving WIC and SNAP benefits. We additionally situate these efforts in the nascent literature on the use of big data and machine learning for the improvement of survey efficiency.
-
Geographic Disparities in Alzheimer's Disease and Related Dementia Mortality in the US: Comparing Impacts of Place of Birth and Place of Residence
January 2025
Working Paper Number:
CES-25-11
Objective: Building on the hypothesis that early-life exposures might influence the onset of Alzheimer's Disease and Related Dementia (ADRD), this study delves into geographic variations in ADRD mortality in the US. By considering both state of residence and state of birth, we aim to discern the comparative significance of these geospatial factors.
Methods: We conducted a secondary data analysis of the National Longitudinal Mortality Study (NLMS), which has 3.5 million records from 1973-2011 and over 0.5 million deaths. We focused on individuals born in or before 1930, tracked in NLMS cohorts from 1979-2000. Employing multi-level logistic regression, with individuals nested within states of residence and/or states of birth, we assessed the role of geographical factors in ADRD mortality variation.
Results: We found that both state of birth and state of residence account for a modest portion of ADRD mortality variation. Specifically, state of residence explains 1.19% of the total variation in ADRD mortality, whereas state of birth explains only 0.6%. When combined, both state of residence and state of birth account for only 1.05% of the variation, suggesting state of residence could matter more in ADRD mortality outcomes.
Conclusion: Findings of this study suggest that state of residence explains more variation in ADRD mortality than state of birth. These results indicate that factors in later life may present more impactful intervention points for curbing ADRD mortality. While early-life environmental exposures remain relevant, their role as primary determinants of ADRD in later life appears to be less pronounced in this study.
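For readers unfamiliar with how a multi-level logistic regression yields a "share of variation explained" by a grouping such as state, one common convention is the latent-variable variance partition coefficient, which treats the individual-level residual variance as fixed at π²/3. The abstract does not say which convention the study used, so the sketch below is an assumption for illustration only; the function names are hypothetical.

```python
import math

# Under the latent-variable formulation of a logistic model, the
# individual-level residual variance is fixed at pi^2 / 3.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def variance_share(random_intercept_variance):
    """Share of total latent variance attributed to the grouping level."""
    return random_intercept_variance / (
        random_intercept_variance + LOGISTIC_RESIDUAL_VARIANCE
    )


def implied_variance(share):
    """Invert variance_share: the group-level variance that yields this share."""
    return share * LOGISTIC_RESIDUAL_VARIANCE / (1 - share)


# Group-level variance that would correspond to a 1.19% share
# under this (assumed) convention.
sigma2_residence = implied_variance(0.0119)
round_trip = variance_share(sigma2_residence)
```

Under this convention a share near 1% corresponds to a small state-level random-intercept variance on the logit scale, which is consistent with the abstract's description of the geographic contribution as modest.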