CREAT - Census Bureau

The Privacy-Protected Gridded Environmental Impacts Frame

December 2024

Written by: John Voorheis, Eva Lyubich, Jonathan Colmer, Kendall Houghton, Mary Munro, Cameron Scalera, Jennifer Withrow

Working Paper Number:

CES-24-74

Abstract

This paper introduces the Gridded Environmental Impacts Frame (Gridded EIF), a novel privacy-protected dataset derived from the U.S. Census Bureau's confidential Environmental Impacts Frame (EIF) microdata infrastructure. The EIF combines comprehensive administrative records and survey data on the U.S. population with high-resolution geospatial information on environmental hazards. While access to the EIF is restricted due to the confidential nature of the underlying data, the Gridded EIF offers a broader research community the opportunity to glean insights from the data while preserving confidentiality. We describe the data and privacy protection process, and offer guidance on appropriate usage, presenting practical applications.

Document Tags and Keywords

Keywords:

data, microdata, disclosure, agency, confidentiality, privacy, environmental, record, security, census bureau, geographic, datasets, public, publicly, 1040

Tags:

Internal Revenue Service, National Science Foundation, American Community Survey, Russell Sage Foundation, Sloan Foundation, Census Bureau Disclosure Review Board, Disclosure Review Board, Adjusted Gross Income, Data Management System

Similar Working Papers

The 10 most similar working papers to the working paper 'The Privacy-Protected Gridded Environmental Impacts Frame' are listed below in order of similarity.

Working Paper

Building the Prototype Census Environmental Impacts Frame

April 2023

Authors: John Voorheis, Eva Lyubich, Jonathan Colmer, Kendall Houghton, Mary Munro, Cameron Scalera, Jennifer Withrow

Working Paper Number:

CES-23-20

The natural environment is central to all aspects of life, but efforts to quantify its influence have been hindered by data availability and measurement constraints. To mitigate some of these challenges, we introduce a new prototype of a microdata infrastructure: the Census Environmental Impacts Frame (EIF). The EIF provides detailed individual-level information on demographics, economic characteristics, and address-level histories, linked to spatially and temporally resolved estimates of environmental conditions for each individual, for almost every resident in the United States over the past two decades. This linked microdata infrastructure provides a unique platform for advancing our understanding about the distribution of environmental amenities and hazards, when, how, and why exposures have evolved over time, and the consequences of environmental inequality and changing environmental conditions. We describe the construction of the EIF, explore issues of coverage and data quality, document patterns and trends in individual exposure to two correlated but distinct air pollutants as an application of the EIF, and discuss implications and opportunities for future research.
Working Paper

Income, Wealth, and Environmental Inequality in the United States

October 2024

Authors: Reed Walker, John Voorheis, Jonathan Colmer, Suvy Qin

Working Paper Number:

CES-24-57

This paper explores the relationships between air pollution, income, wealth, and race by combining administrative data from U.S. tax returns between 1979–2016, various measures of air pollution, and sociodemographic information from linked survey and administrative data. In the first year of our data, the relationship between income and ambient pollution levels nationally is approximately zero for both non-Hispanic White and Black individuals. However, at every single percentile of the national income distribution, Black individuals are exposed to, on average, higher levels of pollution than White individuals. By 2016, the relationship between income and air pollution had steepened, primarily for Black individuals, driven by changes in where rich and poor Black individuals live. We utilize quasi-random shocks to income to examine the causal effect of changes in income and wealth on pollution exposure over a five-year horizon, finding that these income–pollution elasticities map closely to the values implied by our descriptive patterns. We calculate that Black-White differences in income can explain ~10 percent of the observed gap in air pollution levels in 2016.
Working Paper

The Census Historical Environmental Impacts Frame

October 2024

Authors: John Voorheis, Eva Lyubich, Kendall Houghton, Mary Munro, Jennifer Withrow, Suvy Qin

Working Paper Number:

CES-24-66

The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
Working Paper

Longitudinal Environmental Inequality and Environmental Gentrification: Who Gains From Cleaner Air?

May 2017

Authors: John Voorheis

Working Paper Number:

carra-2017-04

A vast empirical literature has convincingly shown that there is pervasive cross-sectional inequality in exposure to environmental hazards. However, less is known about how these inequalities have been evolving over time. I fill this gap by creating a new dataset, which combines satellite data on ground-level concentrations of fine particulate matter with linked administrative and survey data. This linked dataset allows me to measure individual pollution exposure for over 100 million individuals in each year between 2000 and 2014, a period that has seen substantial improvements in average air quality. This rich dataset can then be used to analyze longitudinal dimensions of environmental inequality by examining the distribution of changes in individual pollution exposure that underlie these aggregate improvements. I confirm previous findings that cross-sectional environmental inequality has been on the decline, but I argue that this may miss longitudinal patterns in exposure that are consistent with environmental gentrification. I find that advantaged individuals at the beginning of the sample experience larger pollution exposure reductions than do initially disadvantaged individuals.
Working Paper

A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census

August 2025

Authors: Lars Vilhuber, John M. Abowd, Ethan Lewis, Nathan Goldschlag, Michael B. Hawes, Robert Ashmead, Daniel Kifer, Philip Leclerc, Rolando A. Rodríguez, Tamara Adams, David Darais, Sourya Dey, Simson L. Garfinkel, Scott Moore, Ramy N. Tadros

Working Paper Number:

CES-25-57

For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level: individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy.
Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act.
Working Paper

What Caused Racial Disparities in Particulate Exposure to Fall? New Evidence from the Clean Air Act and Satellite-Based Measures of Air Quality

January 2020

Authors: Janet Currie, Reed Walker, John Voorheis

Working Paper Number:

CES-20-02

Racial differences in exposure to ambient air pollution have declined significantly in the United States over the past 20 years. This project links restricted-access Census Bureau microdata to newly available, spatially continuous, high-resolution measures of ambient particulate pollution (PM2.5) to examine the underlying causes and consequences of differences in black-white pollution exposures. We begin by decomposing differences in pollution exposure into components explained by observable population characteristics (e.g., income) versus those that remain unexplained. We then use quantile regression methods to show that a significant portion of the 'unexplained' convergence in black-white pollution exposure can be attributed to differential impacts of the Clean Air Act (CAA) in non-Hispanic African American and non-Hispanic white communities. Areas with larger black populations saw greater CAA-related declines in PM2.5 exposure. We show that the CAA has been the single largest contributor to racial convergence in PM2.5 pollution exposure in the U.S. since 2000, accounting for over 60 percent of the reduction.
Working Paper

Validating Abstract Representations of Spatial Population Data while considering Disclosure Avoidance

February 2020

Authors: James Gaboardi

Working Paper Number:

CES-20-05

This paper furthers a research agenda for modeling populations along spatial networks and expands upon an empirical analysis to a full U.S. county (Gaboardi, 2019, Ch. 1,2). Specific foci are the necessity of, and methods for, validating and benchmarking spatial data when conducting social science research with aggregated and ambiguous population representations. In order to promote the validation of publicly-available data, access to highly-restricted census microdata was requested, and granted, in order to determine the levels of accuracy and error associated with a network-based population modeling framework. Primary findings reinforce the utility of a novel network allocation method, populated polygons to networks (pp2n), in terms of accuracy, computational complexity, and real runtime (Gaboardi, 2019, Ch. 2). Also, a pseudo-benchmark dataset's performance against the true census microdata shows promise in modeling populations along networks.
Working Paper

Mobility, Opportunity, and Volatility Statistics (MOVS): Infrastructure Files and Public Use Data

April 2024

Authors: Maggie R. Jones, Sonya R. Porter, John Voorheis, Nikolas Pharris-Ciurej, Adam Bee, Jonathan Rothbaum, Kendall Houghton, Amanda Eng

Working Paper Number:

CES-24-23

Federal statistical agencies and policymakers have identified a need for integrated systems of household and personal income statistics. This interest marks a recognition that aggregated measures of income, such as GDP or average income growth, tell an incomplete story that may conceal large gaps in well-being between different types of individuals and families. Until recently, longitudinal income data that are rich enough to calculate detailed income statistics and include demographic characteristics, such as race and ethnicity, have not been available. The Mobility, Opportunity, and Volatility Statistics project (MOVS) fills this gap in comprehensive income statistics. Using linked demographic and tax records on the population of U.S. working-age adults, the MOVS project defines households and calculates household income, applying an equivalence scale to create a personal income concept, and then traces the progress of individuals' incomes over time. We then output a set of intermediate statistics by race-ethnicity group, sex, year, base-year state of residence, and base-year income decile. We select the intermediate statistics most useful in developing more complex intragenerational income mobility measures, such as transition matrices, income growth curves, and variance-based volatility statistics. We provide these intermediate statistics as part of a publicly released data tool with downloadable flat files and accompanying documentation. This paper describes the data build process and the output files, including a brief analysis highlighting the structure and content of our main statistics.
Working Paper

Consistent Cell Means for Topcoded Incomes in the Public Use March CPS (1976-2007)

March 2008

Authors: Richard Burkhauser, Shuaizhang Feng, Laura Zayatz, Jeff Larrimore

Working Paper Number:

CES-08-06

Using the internal March CPS, we create and in this paper distribute to the larger research community a cell mean series that provides the mean of all income values above the topcode for any income source of any individual in the public use March CPS that has been topcoded since 1976. We also describe our construction of this series. When we use this series together with the public use March CPS, we closely match the yearly mean income levels and income inequalities of the U.S. population found using the internal March CPS data.
Working Paper

The Changing Nature of Pollution, Income, and Environmental Inequality in the United States

January 2024

Authors: Reed Walker, John Voorheis, Jonathan Colmer, Suvy Qin

Working Paper Number:

CES-24-04

This paper uses administrative tax records linked to Census demographic data and high-resolution measures of fine small particulate (PM2.5) exposure to study the evolution of the Black-White pollution exposure gap over the past 40 years. In doing so, we focus on the various ways in which income may have contributed to these changes using a statistical decomposition. We decompose the overall change in the Black-White PM2.5 exposure gap into (1) components that stem from rank-preserving compression in the overall pollution distribution and (2) changes that stem from a reordering of Black and White households within the pollution distribution. We find a significant narrowing of the Black-White PM2.5 exposure gap over this time period that is overwhelmingly driven by rank-preserving changes rather than positional changes. However, the relative positions of Black and White households at the upper end of the pollution distribution have meaningfully shifted in the most recent years.