This paper introduces the Gridded Environmental Impacts Frame (Gridded EIF), a novel privacy-protected dataset derived from the U.S. Census Bureau's confidential Environmental Impacts Frame (EIF) microdata infrastructure. The EIF combines comprehensive administrative records and survey data on the U.S. population with high-resolution geospatial information on environmental hazards. While access to the EIF is restricted due to the confidential nature of the underlying data, the Gridded EIF offers a broader research community the opportunity to glean insights from the data while preserving confidentiality. We describe the data and privacy protection process, and offer guidance on appropriate usage, presenting practical applications.
-
Building the Prototype Census Environmental Impacts Frame
April 2023
Working Paper Number:
CES-23-20
The natural environment is central to all aspects of life, but efforts to quantify its influence have been hindered by data availability and measurement constraints. To mitigate some of these challenges, we introduce a new prototype of a microdata infrastructure: the Census Environmental Impacts Frame (EIF). The EIF provides detailed individual-level information on demographics, economic characteristics, and address-level histories (linked to spatially and temporally resolved estimates of environmental conditions for each individual) for almost every resident in the United States over the past two decades. This linked microdata infrastructure provides a unique platform for advancing our understanding about the distribution of environmental amenities and hazards, when, how, and why exposures have evolved over time, and the consequences of environmental inequality and changing environmental conditions. We describe the construction of the EIF, explore issues of coverage and data quality, document patterns and trends in individual exposure to two correlated but distinct air pollutants as an application of the EIF, and discuss implications and opportunities for future research.
-
The Census Historical Environmental Impacts Frame
October 2024
Working Paper Number:
CES-24-66
The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
-
Income, Wealth, and Environmental Inequality in the United States
October 2024
Working Paper Number:
CES-24-57
This paper explores the relationships between air pollution, income, wealth, and race by combining administrative data from U.S. tax returns between 1979 and 2016, various measures of air pollution, and sociodemographic information from linked survey and administrative data. In the first year of our data, the relationship between income and ambient pollution levels nationally is approximately zero for both non-Hispanic White and Black individuals. However, at every single percentile of the national income distribution, Black individuals are exposed to, on average, higher levels of pollution than White individuals. By 2016, the relationship between income and air pollution had steepened, primarily for Black individuals, driven by changes in where rich and poor Black individuals live. We utilize quasi-random shocks to income to examine the causal effect of changes in income and wealth on pollution exposure over a five-year horizon, finding that these income-pollution elasticities map closely to the values implied by our descriptive patterns. We calculate that Black-White differences in income can explain ~10 percent of the observed gap in air pollution levels in 2016.
-
Longitudinal Environmental Inequality and Environmental Gentrification: Who Gains From Cleaner Air?
May 2017
Working Paper Number:
carra-2017-04
A vast empirical literature has convincingly shown that there is pervasive cross-sectional inequality in exposure to environmental hazards. However, less is known about how these inequalities have been evolving over time. I fill this gap by creating a new dataset, which combines satellite data on ground-level concentrations of fine particulate matter with linked administrative and survey data. This linked dataset allows me to measure individual pollution exposure for over 100 million individuals in each year between 2000 and 2014, a period that has seen substantial improvements in average air quality. This rich dataset can then be used to analyze longitudinal dimensions of environmental inequality by examining the distribution of changes in individual pollution exposure that underlie these aggregate improvements. I confirm previous findings that cross-sectional environmental inequality has been on the decline, but I argue that this may miss longitudinal patterns in exposure that are consistent with environmental gentrification. I find that advantaged individuals at the beginning of the sample experience larger pollution exposure reductions than do initially disadvantaged individuals.
-
Mobility, Opportunity, and Volatility Statistics (MOVS): Infrastructure Files and Public Use Data
April 2024
Working Paper Number:
CES-24-23
Federal statistical agencies and policymakers have identified a need for integrated systems of household and personal income statistics. This interest marks a recognition that aggregated measures of income, such as GDP or average income growth, tell an incomplete story that may conceal large gaps in well-being between different types of individuals and families. Until recently, longitudinal income data that are rich enough to calculate detailed income statistics and include demographic characteristics, such as race and ethnicity, have not been available. The Mobility, Opportunity, and Volatility Statistics project (MOVS) fills this gap in comprehensive income statistics. Using linked demographic and tax records on the population of U.S. working-age adults, the MOVS project defines households and calculates household income, applying an equivalence scale to create a personal income concept, and then traces the progress of individuals' incomes over time. We then output a set of intermediate statistics by race-ethnicity group, sex, year, base-year state of residence, and base-year income decile. We select the intermediate statistics most useful in developing more complex intragenerational income mobility measures, such as transition matrices, income growth curves, and variance-based volatility statistics. We provide these intermediate statistics as part of a publicly released data tool with downloadable flat files and accompanying documentation. This paper describes the data build process and the output files, including a brief analysis highlighting the structure and content of our main statistics.
-
Peer Income Exposure Across the Income Distribution
February 2025
Working Paper Number:
CES-25-16
Children from families across the income distribution attend public schools, making schools and classrooms potential sites for interaction between more- and less-affluent children. However, limited information exists regarding the extent of economic integration in these contexts. We merge educational administrative data from Oregon with measures of family income derived from IRS records to document student exposure to economically diverse school and classroom peers. Our findings indicate that affluent children in public schools are relatively isolated from their less affluent peers, while low- and middle-income students experience relatively even peer income distributions. Students from families in the top percentile of the income distribution attend schools where 20 percent of their peers, on average, come from the top five income percentiles. A large majority of the differences in peer exposure that we observe arise from the sorting of students across schools; sorting across classrooms within schools plays a substantially smaller role.
-
Race and Mobility in U.S. Marriage Markets: Quantifying the Role of Segregation
December 2022
Working Paper Number:
CES-22-59R
We examine racial disparities in upward intergenerational mobility of family income by linking American Community Survey respondents born in 1978-87 to their parents' tax records. This linkage facilitates better measurement of marriage-market processes than tax records alone. Relative to White individuals, we document lower upward mobility of partner income for Black, Hispanic, and Asian individuals. These disparities offset Asian women and men's advantages in personal income mobility, overturn Black women's small advantage, and compound Black men's disadvantage. We develop a novel nonparametric decomposition which reveals that these disparities are driven primarily by racial differences in marriage-market opportunities, but also by different partnering rates conditional on opportunities. We then apply a selection-correction methodology to estimate causal effects of childhood exposure to racial segregation. Our design approximates a shift in the current generation's segregation exposure, holding historical exposures constant. This channel generates substantial Black-White intergenerational mobility gaps across all income measures, and we show that these effects cumulate over a multigenerational horizon.
-
Structural versus Ethnic Dimensions of Housing Segregation
March 2016
Working Paper Number:
CES-16-22
Racial residential segregation is still very high in many American cities. Some portion of segregation is attributable to socioeconomic differences across racial lines; some portion is caused by purely racial factors, such as preferences about the racial composition of one's neighborhood or discrimination in the housing market. Social scientists have had great difficulty disaggregating segregation into a portion that can be explained by interracial differences in socioeconomic characteristics (what we call structural factors) versus a portion attributable to racial and ethnic factors. What would such a measure look like? In this paper, we draw on a new source of data to develop an innovative structural segregation measure that shows the amount of segregation that would remain if we could assign households to housing units based only on non-racial socioeconomic characteristics. This inquiry provides vital building blocks for the broader enterprise of understanding and remedying housing segregation.
-
An In-Depth Examination of Requirements for Disclosure Risk Assessment
October 2023
Authors:
Ron Jarmin,
John M. Abowd,
Ian M. Schmutte,
Jerome P. Reiter,
Nathan Goldschlag,
Victoria A. Velkoff,
Michael B. Hawes,
Robert Ashmead,
Ryan Cumings-Menon,
Sallie Ann Keller,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Pavel Zhuravlev
Working Paper Number:
CES-23-49
The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should be used to compare methodologies to identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near term, the counterfactual approach appears best suited for privacy-utility analysis.
-
A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census
August 2025
Authors:
Lars Vilhuber,
John M. Abowd,
Ethan Lewis,
Nathan Goldschlag,
Michael B. Hawes,
Robert Ashmead,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Tamara Adams,
David Darais,
Sourya Dey,
Simson L. Garfinkel,
Scott Moore,
Ramy N. Tadros
Working Paper Number:
CES-25-57
For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level: individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy.
Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act.