-
Peer Income Exposure Across the Income Distribution
February 2025
Working Paper Number:
CES-25-16
Children from families across the income distribution attend public schools, making schools and classrooms potential sites for interaction between more- and less-affluent children. However, limited information exists regarding the extent of economic integration in these contexts. We merge educational administrative data from Oregon with measures of family income derived from IRS records to document student exposure to economically diverse school and classroom peers. Our findings indicate that affluent children in public schools are relatively isolated from their less affluent peers, while low- and middle-income students experience relatively even peer income distributions. Students from families in the top percentile of the income distribution attend schools where 20 percent of their peers, on average, come from the top five income percentiles. A large majority of the differences in peer exposure that we observe arise from the sorting of students across schools; sorting across classrooms within schools plays a substantially smaller role.
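As a rough illustration of the exposure measure this abstract describes, the sketch below computes, for each student, the share of schoolmates whose family income falls in the top five percentiles, excluding the student's own record. The function names, the leave-one-out convention, the percentile coding (1 to 100, top five meaning 96 and above), and the toy records are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict

def top_share_exposure(students, top_cutoff=96):
    """Leave-one-out share of schoolmates at or above top_cutoff.

    students: list of (school_id, income_percentile) tuples (hypothetical layout).
    Returns one share per student, or None when a student has no schoolmates.
    """
    size = defaultdict(int)
    top = defaultdict(int)
    for school, pct in students:
        size[school] += 1
        if pct >= top_cutoff:
            top[school] += 1

    exposures = []
    for school, pct in students:
        peers = size[school] - 1
        if peers == 0:
            exposures.append(None)
            continue
        own = 1 if pct >= top_cutoff else 0  # drop the student's own record
        exposures.append((top[school] - own) / peers)
    return exposures

def mean_exposure_for(students, exposures, lo, hi):
    """Average exposure among students whose percentile lies in [lo, hi]."""
    vals = [e for (_, p), e in zip(students, exposures)
            if lo <= p <= hi and e is not None]
    return sum(vals) / len(vals) if vals else None

# Toy records, invented for this example only.
toy = [("A", 99), ("A", 97), ("A", 96), ("A", 50), ("A", 40),
       ("B", 30), ("B", 20), ("B", 60), ("B", 98)]
toy_exposures = top_share_exposure(toy)
```

Averaging these shares within an income group, for example with `mean_exposure_for(toy, toy_exposures, 96, 100)`, gives a group-level exposure figure of the kind the abstract reports; the same calculation could be run with classroom identifiers in place of school identifiers.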
-
Potential Bias When Using Administrative Data to Measure the Family Income of School-Aged Children
January 2025
Working Paper Number:
CES-25-03
Researchers and practitioners increasingly rely on administrative data sources to measure family income. However, administrative data sources are often incomplete in their coverage of the population, giving rise to potential bias in family income measures, particularly if coverage deficiencies are not well understood. We focus on the school-aged child population, due to its particular import to research and policy, and because of the unique challenges of linking children to family income information. We find that two of the most significant administrative sources of family income information that permit linking of children and parents (IRS Form 1040 and SNAP participation records) usefully complement each other, potentially reducing coverage bias when used together. In a case study considering how best to measure economic disadvantage rates in the public school student population, we demonstrate the sensitivity of family income statistics to assumptions about individuals who do not appear in administrative data sources.
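The sensitivity described in this abstract can be illustrated with simple bounds. The sketch below shows how a measured disadvantage rate moves under three assumptions about students who cannot be matched to any administrative record: none are disadvantaged, they resemble matched students, or all are disadvantaged. The function name, the three scenarios, and the counts in the usage line are hypothetical and are not drawn from the paper.

```python
def disadvantage_rate_bounds(n_matched_disadvantaged, n_matched_total, n_unmatched):
    """Return (lower, matched_only, upper) disadvantage rates.

    lower:        assumes no unmatched student is disadvantaged
    matched_only: assumes unmatched students resemble matched students
    upper:        assumes every unmatched student is disadvantaged
    """
    n_all = n_matched_total + n_unmatched
    lower = n_matched_disadvantaged / n_all
    matched_only = n_matched_disadvantaged / n_matched_total
    upper = (n_matched_disadvantaged + n_unmatched) / n_all
    return lower, matched_only, upper

# Invented counts: 800 matched students, 300 of them disadvantaged, 200 unmatched.
example_bounds = disadvantage_rate_bounds(300, 800, 200)
```

The gap between the lower and upper values widens as the unmatched share grows, which is one way to see why coverage matters for these statistics.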
-
The Census Historical Environmental Impacts Frame
October 2024
Working Paper Number:
CES-24-66
The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
-
Comparison of Child Reporting in the American Community Survey and Federal Income Tax Returns Based on California Birth Records
September 2024
Working Paper Number:
CES-24-55
This paper takes advantage of administrative records from California, a state with a large child population and a significant historical undercount of children in Census Bureau data, dependent information in the Internal Revenue Service (IRS) Form 1040 records, and the American Community Survey to characterize undercounted children and compare child reporting. While IRS Form 1040 records offer potential utility for adjusting child undercounting in Census Bureau surveys, this analysis finds overlapping reporting issues among various demographic and economic groups. Specifically, older children, those of Non-Hispanic Black mothers and Hispanic mothers, children or parents with lower English proficiency, children whose mothers did not complete high school, and families with lower income-to-poverty ratios were less frequently reported in IRS 1040 records than other groups. Therefore, using IRS 1040 dependent records may have limitations for accurately representing populations with characteristics associated with the undercount of children in surveys.
-
Household Wealth and Entrepreneurial Career Choices: Evidence from Climate Disasters
July 2024
Working Paper Number:
CES-24-39
This study investigates how household wealth affects the human capital of startups, based on U.S. Census individual-level employment data, deed records, and geographic information system (GIS) data. Using floods as a wealth shock, a regression discontinuity analysis shows inundated residents are 7% less likely to work in startups relative to their neighbors outside the flood boundary, within a 0.1-mile-wide band. The effect is more pronounced for homeowners, consistent with the wealth effect. The career distortion leads to a significant long-run income loss, highlighting the importance of self-insurance for human capital allocation.
-
Measuring Income of the Aged in Household Surveys: Evidence from Linked Administrative Records
June 2024
Working Paper Number:
CES-24-32
Research has shown that household survey estimates of retirement income (defined benefit pensions and defined contribution account withdrawals) suffer from substantial underreporting, which biases downward measures of financial well-being among the aged. Using data from both the redesigned 2016 Current Population Survey Annual Social and Economic Supplement (CPS ASEC) and the Health and Retirement Study (HRS), each matched with administrative records, we examine to what extent underreporting of retirement income affects key statistics such as reliance on Social Security benefits and poverty among the aged. We find that underreporting of retirement income is still prevalent in the CPS ASEC. While the HRS does a better job than the CPS ASEC in terms of capturing retirement income, it still falls considerably short compared to administrative records. Consequently, the relative importance of Social Security income remains overstated in household surveys: 53 percent of elderly beneficiaries in the CPS ASEC and 49 percent in the HRS rely on Social Security for the majority of their incomes, compared to 42 percent in the linked administrative data. The poverty rate for those aged 65 and over is also overstated: 8.8 percent in the CPS ASEC and 7.4 percent in the HRS, compared to 6.4 percent in the linked administrative data. Our results illustrate the effects of using alternative data sources in producing key statistics from the Social Security Administration's Income of the Aged publication.
-
Revisiting Methods to Assign Responses when Race and Hispanic Origin Reporting are Discrepant Across Administrative Records and Third Party Sources
May 2024
Working Paper Number:
CES-24-26
The Best Race and Ethnicity Administrative Records Composite file ("Best Race file") is a composite file that combines Census, federal, and Third Party Data (TPD) sources and applies business rules to assign race and ethnicity values to person records. The first version of the Best Race administrative records composite was constructed in 2015 and subsequently updated each year to include more recent vintages, when available, of the data sources originally included in the composite file. Where updates were available for data sources, the most recent information for persons was retained, and the business rules were reapplied to assign a single race and single Hispanic origin value to each person record. The majority of person records on the Best Race file have consistent race and ethnicity information across data sources. Where there are discrepancies in responses across data sources, we apply a series of business rules to assign a single race and ethnicity to each record. To improve the quality of the Best Race administrative records composite, we have begun revising the business rules, which were developed several years ago. This paper discusses the original business rules as well as the implemented changes and their impact on the composite file.
-
Mobility, Opportunity, and Volatility Statistics (MOVS): Infrastructure Files and Public Use Data
April 2024
Working Paper Number:
CES-24-23
Federal statistical agencies and policymakers have identified a need for integrated systems of household and personal income statistics. This interest marks a recognition that aggregated measures of income, such as GDP or average income growth, tell an incomplete story that may conceal large gaps in well-being between different types of individuals and families. Until recently, longitudinal income data that are rich enough to calculate detailed income statistics and include demographic characteristics, such as race and ethnicity, have not been available. The Mobility, Opportunity, and Volatility Statistics project (MOVS) fills this gap in comprehensive income statistics. Using linked demographic and tax records on the population of U.S. working-age adults, the MOVS project defines households and calculates household income, applying an equivalence scale to create a personal income concept, and then traces the progress of individuals' incomes over time. We then output a set of intermediate statistics by race-ethnicity group, sex, year, base-year state of residence, and base-year income decile. We select the intermediate statistics most useful in developing more complex intragenerational income mobility measures, such as transition matrices, income growth curves, and variance-based volatility statistics. We provide these intermediate statistics as part of a publicly released data tool with downloadable flat files and accompanying documentation. This paper describes the data build process and the output files, including a brief analysis highlighting the structure and content of our main statistics.
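To make two of the steps named in this abstract concrete, the sketch below equivalizes household income and builds a transition matrix between income groups. It is illustrative only: the abstract does not say which equivalence scale MOVS applies, so the square-root scale used here is an assumption, and the function names and toy inputs are hypothetical.

```python
import math

def equivalized_income(household_income, household_size):
    """Personal income concept under an ASSUMED square-root equivalence scale."""
    return household_income / math.sqrt(household_size)

def transition_matrix(base_groups, later_groups, n_groups):
    """Row-normalized transition shares between income groups.

    base_groups[i] and later_groups[i] are the 0-based income group
    (for example, decile minus one) of person i in the base year and a later year.
    Entry [b][l] is the share of people starting in group b who end in group l.
    """
    counts = [[0] * n_groups for _ in range(n_groups)]
    for b, l in zip(base_groups, later_groups):
        counts[b][l] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Toy inputs, invented for illustration: four people, two income groups.
toy_matrix = transition_matrix([0, 0, 1, 1], [0, 1, 1, 1], n_groups=2)
```

In this toy case half of the people who start in the lower group move up, and everyone who starts in the upper group stays there.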
-
The Long-Term Effects of Income for At-Risk Infants: Evidence from Supplemental Security Income
March 2024
Working Paper Number:
CES-24-10
This paper examines whether a generous cash intervention early in life can "undo" some of the long-term disadvantage associated with poor health at birth. We use new linkages between several large-scale administrative datasets to examine the short-, medium-, and long-term effects of providing low-income families with low birthweight infants support through the Supplemental Security Income (SSI) program. This program uses a birthweight cutoff at 1200 grams to determine eligibility. We find that families of infants born just below this cutoff experience a large increase in cash benefits totaling about 27% of family income in the first three years of the infant's life. These cash benefits persist at lower amounts through age 10. Eligible infants also experience a small but statistically significant increase in Medicaid enrollment during childhood. We examine whether this support affects health care use and mortality in infancy, educational performance in high school, post-secondary school attendance and college degree attainment, and earnings, public assistance use, and mortality in young adulthood for all infants born in California to low-income families whose birthweight puts them near the cutoff. We also examine whether these payments had spillover effects onto the older siblings of these infants who may have also benefited from the increase in family resources. Despite the comprehensive nature of this early life intervention, we detect no improvements in any of the study outcomes, nor do we find improvements among the older siblings of these infants. These null effects persist across several subgroups and alternative model specifications, and, for some outcomes, our estimates are precise enough to rule out published estimates of the effect of early life cash transfers in other settings.
-
A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census: Full Technical Report
December 2023
Authors: Lars Vilhuber, John M. Abowd, Ethan Lewis, Nathan Goldschlag, Robert Ashmead, Daniel Kifer, Philip Leclerc, Rolando A. Rodríguez, Tamara Adams, David Darais, Sourya Dey, Simson L. Garfinkel, Scott Moore, Ramy N. Tadros
Working Paper Number:
CES-23-63R
For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level: individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy.
Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act. You are reading the full technical report. For the summary paper see https://doi.org/10.1162/99608f92.4a1ebf70.