-
The 2010 Census Confidentiality Protections Failed, Here's How and Why
December 2023
Authors:
Lars Vilhuber,
John M. Abowd,
Ethan Lewis,
Nathan Goldschlag,
Robert Ashmead,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Tamara Adams,
David Darais,
Sourya Dey,
Simson L. Garfinkel,
Scott Moore,
Ramy N. Tadros
Working Paper Number:
CES-23-63
Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act.
-
Poach or Promote? Job Sorting and Gender Earnings Inequality across U.S. Industries
April 2023
Working Paper Number:
CES-23-23
I outline the sociological theory that would predict that external labor markets (those in which more positions are filled with new hires rather than through firm-internal promotions) heighten gender-based discrimination and contribute to earnings inequality. I test this theory by treating industries as miniature labor markets within the US with varying levels of gender inequality and different hiring practices. Using high-quality administrative data from 1985 to 2013, including detailed work histories from this period, I compare the earnings of otherwise similar men and women across industries with different levels of reliance on external markets at different times. I find that men experience greater unexplained earnings relative to women (unexplained in that it is not accounted for by work history or observable demographic characteristics) when a greater share of earnings increase events occur outside the firm.
-
Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report
April 2023
Authors:
J. David Brown,
Danielle H. Sandler,
Lawrence Warren,
Moises Yi,
Misty L. Heggeness,
Joseph L. Schafer,
Matthew Spence,
Marta Murray-Close,
Carl Lieberman,
Genevieve Denoeux,
Lauren Medina
Working Paper Number:
CES-23-21
This report develops a method using administrative records (AR) to fill in responses for nonresponding American Community Survey (ACS) housing units rather than adjusting survey weights to account for selection of a subset of nonresponding housing units for follow-up interviews and for nonresponse bias. The method also inserts AR and modeling in place of edits and imputations for ACS survey citizenship item nonresponses. We produce Citizen Voting-Age Population (CVAP) tabulations using this enhanced CVAP method and compare them to published estimates. The enhanced CVAP method produces a 0.74 percentage point lower citizen share, and it is 3.05 percentage points lower for voting-age Hispanics. The latter result can be partly explained by omissions of voting-age Hispanic noncitizens with unknown legal status from ACS household responses. Weight adjustments may be less effective at addressing nonresponse bias under those conditions.
-
Full Report of the Comparisons of Administrative Record Rosters to Census Self-Responses and NRFU Household Member Responses
March 2023
Working Paper Number:
CES-23-08
One of the U.S. Census Bureau's innovations in the 2020 U.S. Census was the use of administrative records (AR) to create household rosters for enumerating some addresses when a self-response was not available but high-quality ARs were. The goal was to reduce the cost of fieldwork during the Nonresponse Followup operation (NRFU). The original plan had NRFU beginning in mid-May and continuing through late July 2020. However, the COVID-19 pandemic forced the delay of NRFU and caused the Internal Revenue Service to postpone the income tax filing deadline, resulting in an interruption in the delivery of ARs to the U.S. Census Bureau. The delays were not anticipated when U.S. Census Bureau staff conducted the research on AR enumeration with the 2010 Census data in preparation for the 2020 Census or during the fine-tuning of plans for using ARs during the 2018 End-to-End Census Test. These circumstances raised questions about whether the quality of the AR household rosters was high enough for use in enumeration. To aid in investigating the concern about the quality of the AR rosters, our analyses compared AR rosters to self-response rosters and NRFU household member responses at addresses where both ARs and a self-response were available.
-
Business Dynamics Statistics for Single-Unit Firms
December 2022
Working Paper Number:
CES-22-57
The Business Dynamics Statistics of Single Unit Firms (BDS-SU) is an experimental data product that provides information on employment and payroll dynamics for each quarter of the year at businesses that operate in one physical location. This paper describes the creation of the data tables and the value they add to the existing Business Dynamics Statistics (BDS) product. We then present some analysis of the published statistics to provide context for the numbers and demonstrate how they can be used to understand both national and local business conditions, with a particular focus on 2020 and the recession induced by the COVID-19 pandemic. We next examine how firms fared in this recession compared to the Great Recession that began in the fourth quarter of 2007. We also consider the heterogeneous impact of the pandemic on various industries and areas of the country, showing which types of businesses in which locations were particularly hard hit. We examine business exit rates in some detail and consider why different metro areas experienced the pandemic in different ways. We also consider entry rates and look for evidence of a surge in new businesses as seen in other data sources. We finish by providing a preview of ongoing research to match the BDS to worker demographics and show statistics on the relationship between the characteristics of the firm's workers and outcomes such as firm exit and net job creation.
-
Age, Sex, and Racial/Ethnic Disparities and Temporal-Spatial Variation in Excess All-Cause Mortality During the COVID-19 Pandemic: Evidence from Linked Administrative and Census Bureau Data
May 2022
Working Paper Number:
CES-22-18
Research on the impact of the COVID-19 pandemic in the United States has highlighted substantial racial/ethnic disparities in excess mortality, but reports often differ in the details with respect to the size of these disparities. We suggest that these inconsistencies stem from differences in the temporal scope and measurement of race/ethnicity in existing data. We address these issues using death records for 2010 through 2021 from the Social Security Administration, covering the universe of individuals ever issued a Social Security Number, linked to race/ethnicity responses from the decennial census and American Community Survey. We use these data to (1) estimate excess all-cause mortality at the national level and for age-, sex-, and race/ethnicity-specific subgroups, (2) examine racial/ethnic variation in excess mortality over the course of the pandemic, and (3) explore whether and how racial/ethnic mortality disparities vary across states.
-
The Color of Money: Federal vs. Industry Funding of University Research
September 2021
Working Paper Number:
CES-21-26
U.S. universities, which are important producers of new knowledge, have experienced a shift in research funding away from federal and towards private industry sources. This paper compares the effects of federal and private university research funding, using data from 22 universities that include individual-level payments for everyone employed on all grants for each university year and that are linked to patent and Census data, including IRS W-2 records. We instrument for an individual's source of funding with government-wide R&D expenditure shocks within a narrow field of study. We find that a higher share of federal funding causes fewer but more general patents, more high-tech entrepreneurship, a higher likelihood of remaining employed in academia, and a lower likelihood of joining an incumbent firm. Increasing the private share of funding has opposite effects for most outcomes. It appears that private funding leads to greater appropriation of intellectual property by incumbent firms.
-
Business Applications as a Leading Economic Indicator?
May 2021
Working Paper Number:
CES-21-09R
How are applications to start new businesses related to aggregate economic activity? This paper explores the properties of three monthly business application series from the U.S. Census Bureau's Business Formation Statistics as economic indicators: all business applications, business applications that are relatively likely to turn into new employer businesses ('likely employers'), and the residual series, that is, business applications that have a relatively low rate of becoming employers ('likely non-employers'). Growth in applications for likely employers significantly leads total nonfarm employment growth and has a strong positive correlation with it. Furthermore, growth in applications for likely employers leads growth in most of the monthly Principal Federal Economic Indicators (PFEIs). Motivated by our findings, we estimate a dynamic factor model (DFM) to forecast nonfarm employment growth over a 12-month period using the PFEIs and the likely employers series. The latter improves the model's forecast, especially in the years following the turning points of the Great Recession and the COVID-19 pandemic. Overall, applications for likely employers are a strong leading indicator of monthly PFEIs and aggregate economic activity, whereas applications for likely non-employers provide early information about changes in increasingly prevalent self-employment activity in the U.S. economy.
-
Changes in Metropolitan Area Definition, 1910-2010
February 2021
Working Paper Number:
CES-21-04
The Census Bureau was established as a permanent agency in 1902, as industrialization and urbanization were bringing about rapid changes in American society. The years following the establishment of a permanent Census Bureau saw the first attempts at devising statistical geography for tabulating statistics for large cities and their environs. These efforts faced several challenges owing to the variation in settlement patterns, political organization, and rates of growth across the United States. The 1910 census proved to be a watershed, as the Census Bureau offered a definition of urban places, established the first census tract boundaries for tabulating data within cities, and introduced the first standardized metropolitan area definition. It was not until the middle of the twentieth century, however, that the Census Bureau in association with other statistical agencies had established a flexible standard metropolitan definition and a more consistent means of tabulating urban data. Since 1950, the rules for determining the cores and extent of metropolitan areas have been largely regarded as comparable. In the decades that followed, however, a number of rule changes were put into place that accounted for metropolitan complexity in differing ways, and these have been the cause of some confusion. Changes put into effect with the 2000 census represent a consensus of sorts for how to handle these issues.
-
Measuring the Impact of COVID-19 on Businesses and People: Lessons from the Census Bureau's Experience
January 2021
Working Paper Number:
CES-21-02
We provide an overview of Census Bureau activities to enhance the consistency, timeliness, and relevance of our data products in response to the COVID-19 pandemic. We highlight new data products designed to provide timely and granular information on the pandemic's impact: the Small Business Pulse Survey, weekly Business Formation Statistics, the Household Pulse Survey, and Community Resilience Estimates. We describe pandemic-related content introduced to existing surveys such as the Annual Business Survey and the Current Population Survey. We discuss adaptations to ensure the continuity and consistency of existing data products such as principal economic indicators and the American Community Survey.