By exploiting establishment-level data for U.S. manufacturing, this paper sheds new light on the source of the changes in the structure of production, wages, and employment that have occurred over the last several decades. Based on recent theoretical work by Caselli (1999) and Kremer and Maskin (1996), we focus on empirically investigating the following two hypotheses. The first hypothesis is that skill-biased technical change works through the economy via changes in the dispersion of wages and productivity across establishments. The second is that the increased dispersion in wages and productivity across establishments is linked to differential rates of technological adoption across establishments. We find empirical support for these hypotheses. Our main findings are that (1) the between-plant component of wage dispersion is an important and growing part of total wage dispersion, (2) much of the between-plant increase in dispersion is within industries, (3) the between-plant measures of wage and productivity dispersion have increased substantially over the last few decades, (4) industries with large changes in between-plant wage dispersion also exhibit large changes in between-plant productivity dispersion, (5) a substantial fraction of the rising dispersion in wages and productivity is accounted for by increasing wage and productivity differentials across high and low computer investment per worker plants and high and low capital intensity plants, and (6) changes in dispersion accounted for by such observable characteristics yield predicted industry-level changes in wage and productivity dispersion that are highly correlated.
-
A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census
August 2025
Authors:
Lars Vilhuber,
John M. Abowd,
Ethan Lewis,
Nathan Goldschlag,
Michael B. Hawes,
Robert Ashmead,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Tamara Adams,
David Darais,
Sourya Dey,
Simson L. Garfinkel,
Scott Moore,
Ramy N. Tadros
Working Paper Number:
CES-25-57
For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level: individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy.
Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act.
-
Separate but Not Equal: The Uneven Cost of Residential Segregation for Network-Based Hiring
October 2024
Working Paper Number:
CES-24-56
This paper studies how residential segregation by race and by education affects job search via neighbor networks. Using confidential microdata from the US Census Bureau, I measure segregation for each characteristic at both the individual level and the neighborhood level. My findings are manifold. At the individual level, future coworkership with new neighbors on the same block is less likely among segregated individuals than among integrated workers, irrespective of race and level of schooling. The impacts are most adverse for the most socioeconomically disadvantaged demographics: Blacks and those without a high school education. At the block level, however, higher segregation along either dimension raises the likelihood of any future coworkership on the block for all racial or educational groups. My identification strategy, capitalizing on data granularity, allows a causal interpretation of these results. Together, they point to the coexistence of homophily and in-group competition for job opportunities in linking residential segregation to neighbor-based informal hiring. My subtle findings have important implications for policy-making.
-
What Caused Racial Disparities in Particulate Exposure to Fall? New Evidence from the Clean Air Act and Satellite-Based Measures of Air Quality
January 2020
Working Paper Number:
CES-20-02
Racial differences in exposure to ambient air pollution have declined significantly in the United States over the past 20 years. This project links restricted-access Census Bureau microdata to newly available, spatially continuous high resolution measures of ambient particulate pollution (PM2.5) to examine the underlying causes and consequences of differences in black-white pollution exposures. We begin by decomposing differences in pollution exposure into components explained by observable population characteristics (e.g., income) versus those that remain unexplained. We then use quantile regression methods to show that a significant portion of the 'unexplained' convergence in black-white pollution exposure can be attributed to differential impacts of the Clean Air Act (CAA) in non-Hispanic African American and non-Hispanic white communities. Areas with larger black populations saw greater CAA-related declines in PM2.5 exposure. We show that the CAA has been the single largest contributor to racial convergence in PM2.5 pollution exposure in the U.S. since 2000, accounting for over 60 percent of the reduction.
-
Earnings Inequality and Mobility Trends in the United States: Nationally Representative Estimates from Longitudinally Linked Employer-Employee Data
January 2017
Working Paper Number:
CES-17-24
Using earnings data from the U.S. Census Bureau, this paper analyzes the role of the employer in explaining the rise in earnings inequality in the United States. We first establish a consistent frame of analysis appropriate for administrative data used to study earnings inequality. We show that the trends in earnings inequality in the administrative data from the Longitudinal Employer-Household Dynamics Program are inconsistent with other data sources when we do not correct for the presence of misused SSNs. After this correction to the worker frame, we analyze how the earnings distribution has changed in the last decade. We present a decomposition of the year-to-year changes in the earnings distribution from 2004-2013. Even when simplifying these flows to movements between the bottom 20%, the middle 60% and the top 20% of the earnings distribution, about 20.5 million workers undergo a transition each year. Another 19.9 million move between employment and nonemployment. To understand the role of the firm in these transitions, we estimate a model for log earnings with additive fixed worker and firm effects using all jobs held by eligible workers from 2004-2013. We construct a composite log earnings firm component across all jobs for a worker in a given year and a non-firm component. We also construct a skill-type index. We show that, while the differences between working at a low- or middle-paying firm are relatively small, the gains from working at a top-paying firm are large. Specifically, the benefits of working for a high-paying firm are not only realized today, through higher earnings paid to the worker, but also persist through an increase in the probability of upward mobility. High-paying firms facilitate moving workers to the top of the earnings distribution and keeping them there.
-
Two Perspectives on Commuting: A Comparison of Home to Work Flows Across Job-Linked Survey and Administrative Files
January 2017
Working Paper Number:
CES-17-34
Commuting flows and workplace employment data have a wide constituency of users including urban and regional planners, social science and transportation researchers, and businesses. The U.S. Census Bureau releases two national data products that give the magnitude and characteristics of home to work flows. The American Community Survey (ACS) tabulates households' responses on employment, workplace, and commuting behavior. The Longitudinal Employer-Household Dynamics (LEHD) program tabulates administrative records on jobs in the LEHD Origin-Destination Employment Statistics (LODES). Design differences across the datasets lead to divergence in a comparable statistic: county-to-county aggregate commute flows. To understand differences in the public use data, this study compares ACS and LEHD source files, using identifying information and probabilistic matching to join person and job records. In our assessment, we compare commuting statistics for job frames linked on person, employment status, employer, and workplace, and we identify person and job characteristics as well as design features of the data frames that explain aggregate differences. We find a lower rate of within-county commuting and farther commutes in LODES. We attribute these greater distances to differences in workplace reporting and to uncertainty of establishment assignments in LEHD for workers at multi-unit employers. Minor contributing factors include differences in residence location and ACS workplace edits. The results of this analysis and the data infrastructure developed will support further work to understand and enhance commuting statistics in both datasets.
-
Fighting Fire with Fire(fighting Foam): The Long Run Effects of PFAS Use at U.S. Military Installations
December 2024
Working Paper Number:
CES-24-72
Tens of millions of people in the U.S. may be exposed to drinking water contaminated with per- and poly-fluoroalkyl chemicals (PFAS). We provide the first estimates of long-run economic costs from a major, early PFAS source: fire-fighting foam. We combine the timing of its adoption with variation in the presence of fire training areas at U.S. military installations in the 1970s to estimate exposure effects for millions of individuals using natality records and restricted administrative data. We document diminished birthweights, college attendance, and earnings, illustrating a pollution externality from military training and unregulated chemicals as a determinant of economic opportunity.
-
Earnings Inequality and Immobility for Hispanics and Asians: An Examination of Variation Across Subgroups
September 2021
Working Paper Number:
CES-21-30
Our analysis provides the first disaggregated examination of earnings inequality and immobility within the Hispanic ethnic group and the Asian race group in the U.S. over the period of 2005-2015. Our analysis differentiates between long-term immigrant and native-born Hispanics and Asians relative to recent immigrants to the U.S. (post 2005) and new labor market entrants. Our results show that for the Asian and Hispanic population aged 18-45, earnings inequality is constant or slightly decreasing for the long-term immigrant and native-born populations. However, including new labor market entrants and recent immigrants to the U.S. contributes significantly to the earnings inequality for these groups at both the aggregate and disaggregated race or ethnic group levels. These findings have important implications for the measurement of inequality for racial and ethnic groups that have higher proportions of new immigrants and new labor market entrants in the U.S.
-
Immigration and Local Business Dynamics: Evidence from U.S. Firms
August 2021
Working Paper Number:
CES-21-18
This paper finds that establishment entry and exit, particularly the prevention of establishment exit, drive immigrant absorption and immigrant-induced productivity increases in U.S. local industries. Using a comprehensive collection of confidential survey and administrative data from the Census Bureau, it shows that inflows of immigrant workers lead to more establishment entry and less establishment exit in local industries. These relationships are responsible for nearly all of long-run immigrant-induced job creation, with 78 percent accounted for by exit prevention alone, leaving a minimal role for continuing establishment expansion. Furthermore, exit prevention is not uniform: immigrant inflows increase the probability of exit by establishments from low productivity firms and decrease the probability of exit by establishments from high productivity firms. As a result, the increase in establishment count is concentrated at the top of the productivity distribution. A general equilibrium model proposes a mechanism that ties immigrant workers to high productivity firms and shows how accounting for changes to the firm productivity distribution can yield substantially larger estimates of immigrant-generated economic surplus than canonical models of labor demand.
-
The Distributional Effects of an Investment-Based Social Security System
April 2002
Working Paper Number:
CES-02-08
In this paper we study the distributional impact of a change from the existing pay-as-you-go Social Security system to one that combines both pay-as-you-go and investment-based elements. Such a transition can avert the large tax increases that would otherwise be necessary to maintain the level of benefits promised under current law as life expectancy increases. According to the Social Security actuaries (Board of Trustees, 1999), retaining the existing pay-as-you-go system would eventually require raising the current 12.4 percent Social Security payroll tax rate to about 19 percent to maintain the current benefit rules or cutting benefits by more than one-third in order to avoid a tax increase. In contrast, previous research showed that adding an investment-based component with savings equal to two percent of covered earnings to the existing 12.4 percent pay-as-you-go system would be sufficient to maintain the benefits promised under current rules without any increase in tax rates (Feldstein and Samwick 1997, 1998a, 1998b).
-
A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census: Full Technical Report
December 2023
Authors:
Lars Vilhuber,
John M. Abowd,
Ethan Lewis,
Nathan Goldschlag,
Robert Ashmead,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Tamara Adams,
David Darais,
Sourya Dey,
Simson L. Garfinkel,
Scott Moore,
Ramy N. Tadros
Working Paper Number:
CES-23-63R
For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level: individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy.
Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act. You are reading the full technical report. For the summary paper see https://doi.org/10.1162/99608f92.4a1ebf70.