This paper details efforts to link administrative records from the Internal Revenue Service (IRS) to American Community Survey (ACS) and 2010 Census microdata for the study of migration in the United States. Specifically, we (1) document our linkage strategy and methodology for inferring migration in IRS records; (2) model selection into and survival across IRS records to determine suitability for research applications; and (3) gauge the efficacy of the IRS records by demonstrating how they can be used to validate and potentially improve migration responses in ACS microdata. Our results show little evidence of selection or survival bias in the IRS records, suggesting broad generalizability to the nation as a whole. Moreover, we find that the combined IRS 1040, 1099, and W2 records may provide important information on populations that are hard to reach with traditional Census surveys. Finally, while preliminary, the results of our comparison of IRS and ACS migration responses show that IRS records may be useful in improving ACS migration measurement for respondents whose migration response is proxy, allocated, or imputed. Taking these results together, we discuss the potential applications of our longitudinal IRS dataset to innovations in migration research in the United States.
Foreign-Born and Native-Born Migration in the U.S.: Evidence from IRS Administrative and Census Survey Records
July 2018
Working Paper Number:
This paper details efforts to link administrative records from the Internal Revenue Service (IRS) to American Community Survey (ACS) and 2010 Census microdata for the study of migration among foreign-born and native-born populations in the United States. Specifically, we (1) document our linkage strategy and methodology for inferring migration in IRS records; (2) model selection into and survival across IRS records to determine suitability for research applications; and (3) gauge the efficacy of the IRS records by demonstrating how they can be used to validate and potentially improve migration responses for native-born and foreign-born respondents in ACS microdata. Our results show little evidence of selection or survival bias in the IRS records, suggesting broad generalizability to the nation as a whole. Moreover, we find that the combined IRS 1040, 1099, and W2 records may provide important information on populations, such as the foreign-born, that may be difficult to reach with traditional Census Bureau surveys. Finally, while preliminary, the results of our comparison of IRS and ACS migration responses show that IRS records may be useful in improving ACS migration measurement for respondents whose migration response is proxy, allocated, or imputed. Taking these results together, we discuss the potential application of our longitudinal IRS dataset to innovations in migration research on both the native-born and foreign-born populations of the United States.
Internal Migration in the U.S. During the COVID-19 Pandemic
September 2024
Working Paper Number:
Survey and administrative internal migration data disagree on whether the COVID-19 pandemic increased or decreased mobility in the U.S. Moreover, though scholars have theorized and documented migration in response to environmental hazards and economic shocks, the novel conditions posed by a global pandemic make it difficult to hypothesize whether and how American migration might change as a result. We link individual-level data from the United States Postal Service's National Change of Address (NCOA) registry to American Community Survey (ACS) and Current Population Survey (CPS-ASEC) responses and other administrative records to document changes in the level, geography, and composition of migrant flows between 2019 and 2021. We find a 2% increase in address changes between 2019 and 2020, representing an additional 603,000 moves, driven primarily by young adults, earners at the extremes of the income distribution, and individuals (as opposed to families) moving over longer distances. Though the number of address changes returned to pre-pandemic levels in 2021, the pandemic-era geographic and compositional shifts in favor of longer distance moves away from the Pacific and Mid-Atlantic regions toward the South and in favor of younger, individual movers persisted. We also show that at least part of the disconnect between survey, media, and administrative/third-party migration data sources stems from the apparent misreporting of address changes on Census Bureau surveys. Among ACS and CPS-ASEC householders linked to NCOA data and filing a permanent change of address in their 1-year survey response reference period, only around 68% of ACS and 49% of CPS-ASEC householders also reported living in a different residence one year ago in their survey response.
The Census Historical Environmental Impacts Frame
October 2024
Working Paper Number:
The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
Age, Sex, and Racial/Ethnic Disparities and Temporal-Spatial Variation in Excess All-Cause Mortality During the COVID-19 Pandemic: Evidence from Linked Administrative and Census Bureau Data
May 2022
Working Paper Number:
Research on the impact of the COVID-19 pandemic in the United States has highlighted substantial racial/ethnic disparities in excess mortality, but reports often differ in the details with respect to the size of these disparities. We suggest that these inconsistencies stem from differences in the temporal scope and measurement of race/ethnicity in existing data. We address these issues using death records for 2010 through 2021 from the Social Security Administration, covering the universe of individuals ever issued a Social Security Number, linked to race/ethnicity responses from the decennial census and American Community Survey. We use these data to (1) estimate excess all-cause mortality at the national level and for age-, sex-, and race/ethnicity-specific subgroups, (2) examine racial/ethnic variation in excess mortality over the course of the pandemic, and (3) explore whether and how racial/ethnic mortality disparities vary across states.
Exploring Administrative Records Use for Race and Hispanic Origin Item Non-Response
December 2014
Working Paper Number:
Race and Hispanic origin data are required to produce official statistics in the United States. Data collected through the American Community Survey and decennial census address missing data through traditional imputation methods, often relying on information from neighbors. These methods work well if neighbors share similar characteristics; however, the shape and patterns of neighborhoods in the United States are changing. Administrative records may provide more accurate data compared to traditional imputation methods for missing race and Hispanic origin responses. This paper first describes the characteristics of persons with missing demographic data, then assesses the coverage of administrative records data for respondents who do not answer race and Hispanic origin questions in Census data. The paper also discusses the distributional impact of using administrative records race and Hispanic origin data to complete missing responses in a decennial census or survey context.
Granular Income Inequality and Mobility using IDDA: Exploring Patterns across Race and Ethnicity
November 2023
Working Paper Number:
Shifting earnings inequality among U.S. workers over the last five decades has been widely studied, but understanding how these shifts evolve across smaller groups has been difficult. Publicly available data sources typically only ensure representative data at high levels of aggregation, so they obscure many details of earnings distributions for smaller populations. We define and construct a set of granular statistics describing income distributions, income mobility and conditional income growth for a large number of subnational groups in the U.S. for a two-decade period (1998-2019). In this paper, we use the resulting data to explore the evolution of income inequality and mobility for detailed groups defined by race and ethnicity. We find that patterns identified from the universe of tax filers and W-2 recipients that we observe differ in important ways from those that one might identify in public sources. The full set of statistics that we construct is available publicly as the Income Distributions and Dynamics in America, or IDDA, data set.
Assimilation and Coverage of the Foreign-Born Population in Administrative Records
April 2015
Working Paper Number:
The U.S. Census Bureau is researching ways to incorporate administrative data in decennial census and survey operations. Critical to this work is an understanding of the coverage of the population by administrative records. Using federal and third-party administrative data linked to the American Community Survey (ACS), we evaluate the extent to which administrative records provide data on foreign-born individuals in the ACS and employ multinomial logistic regression techniques to evaluate characteristics of those who are in administrative records relative to those who are not. We find that overall, administrative records provide high coverage of foreign-born individuals in our sample for whom a match can be determined. The odds of being in administrative records are found to be tied to the processes of immigrant assimilation: naturalization, higher English proficiency, educational attainment, and full-time employment are associated with greater odds of being in administrative records. These findings suggest that as immigrants adapt and integrate into U.S. society, they are more likely to be involved in government and commercial processes and programs for which we are including data. We further explore administrative records coverage for the two largest race/ethnic groups in our sample, Hispanic and non-Hispanic single-race Asian foreign born, finding again that characteristics related to assimilation are associated with administrative records coverage for both groups. However, we observe that neighborhood context impacts Hispanics and Asians differently.
Dynamics of Race: Joining, Leaving, and Staying in the American Indian/Alaska Native Race Category between 2000 and 2010
August 2014
Working Paper Number:
Each census for decades has seen the American Indian and Alaska Native population increase substantially more than expected. Changes in racial reporting seem to play an important role in the observed net increases, though research has been hampered by data limitations. We address previously unanswerable questions about race response change among American Indian and Alaska Natives (hereafter 'American Indians') using uniquely-suited (but not nationally representative) linked data from the 2000 and 2010 decennial censuses (N = 3.1 million) and the 2006-2010 American Community Survey (N = 188,131). To what extent do people change responses to include or exclude American Indian? How are people who change responses similar to or different from those who do not? How are people who join a group similar to or different from those who leave it? We find considerable race response change by people in our data, especially by multiple-race and/or Hispanic American Indians. This turnover is hidden in cross-sectional comparisons because people joining the group are similar in number and characteristics to those who leave the group. People in our data who changed their race response to add or drop American Indian differ from those who kept the same race response in 2000 and 2010 and from those who moved between a single-race and multiple-race American Indian response. Those who consistently reported American Indian (including those who added or dropped another race response) were relatively likely to report a tribe, live in an American Indian area, report American Indian ancestry, and live in the West. There are significant differences between those who joined and those who left a specific American Indian response group, but poor model fit indicates general similarity between joiners and leavers. Response changes should be considered when conceptualizing and operationalizing 'the American Indian and Alaska Native population.'
Evaluating Race and Hispanic Origin Responses of Medicaid Participants Using Census Data
April 2015
Working Paper Number:
Health and health care disparities associated with race or Hispanic origin are complex and continue to challenge researchers and policy makers. With the intention of improving the measurement and monitoring of these disparities, provisions of the Patient Protection and Affordable Care Act (ACA) of 2010 require states to collect, report and analyze data on demographic characteristics of applicants and participants in Medicaid and other federally supported programs. By linking Medicaid records to 2010 Census, American Community Survey, and Census 2000, this new large-scale study examines and documents the extent to which pre-ACA Medicaid administrative records match self-reported race and Hispanic origin in Census data. Linked records allow comparisons between individuals with matching and non-matching race and Hispanic origin data across several demographic, socioeconomic and neighborhood characteristics, such as age, gender, language proficiency, education and Census tract variables. Identification of the groups most likely to have non-matching and missing race and Hispanic origin data in Medicaid relative to Census data can inform strategies to improve the quality of demographic data collected from Medicaid populations.
February 2014
Working Paper Number:
Determining whether population dynamics provide competing explanations to place effects for observed geographic patterns of population health is critical for understanding health inequality. We focus on the working-age population where health disparities are greatest and analyze detailed data on residential mobility collected for the first time in the 2000 US census. Residential mobility over a 5-year period is frequent and selective, with some variation by race and gender. Even so, we find little evidence that mobility biases cross-sectional snapshots of local population health. Areas undergoing large or rapid population growth or decline may be exceptions. Overall, place of residence is an important health indicator; yet, the frequency of residential mobility raises questions of interpretation from etiological or policy perspectives, complicating simple understandings that residential exposures alone explain the association between place and health. Psychosocial stressors related to contingencies of social identity associated with being black, urban, or poor in the U.S. may also have adverse health impacts that track with structural location even with movement across residential areas.