-
Using Linked Data to Investigate True Intergenerational Change: Three Generations Over Seven Decades
August 2018
Working Paper Number:
carra-2018-09
It is widely thought that immigrants and their families undergo profound cultural and socioeconomic changes as a consequence of coming into contact with U.S. society, but the way this occurs remains unclear and controversial due in large part to data limitations. In this paper, we provide proof of concept for analyses using linked data that allow us to compare outcomes across more 'exact' family generations. Specifically, we are able to follow immigrant parents and their children and grandchildren across seven decades using census and survey data from 1940 to 2014. We describe the data and linkage methodology, evaluate the representativeness of the linked sample, test a method for adjusting for biases that arise from non-representative linkages, and describe the size, diversity, and socioeconomic characteristics of the linked sample. We demonstrate that large sample sizes of linked data will likely permit us to compare several national origin groups across multiple generations.
View Full
Paper PDF
-
Foreign-Born and Native-Born Migration in the U.S.: Evidence from IRS Administrative and Census Survey Records
July 2018
Working Paper Number:
carra-2018-07
This paper details efforts to link administrative records from the Internal Revenue Service (IRS) to American Community Survey (ACS) and 2010 Census microdata for the study of migration among foreign-born and native-born populations in the United States. Specifically, we (1) document our linkage strategy and methodology for inferring migration in IRS records; (2) model selection into and survival across IRS records to determine suitability for research applications; and (3) gauge the efficacy of the IRS records by demonstrating how they can be used to validate and potentially improve migration responses for native-born and foreign-born respondents in ACS microdata. Our results show little evidence of selection or survival bias in the IRS records, suggesting broad generalizability to the nation as a whole. Moreover, we find that the combined IRS 1040, 1099, and W2 records may provide important information on populations, such as the foreign-born, that may be difficult to reach with traditional Census Bureau surveys. Finally, while preliminary, the results of our comparison of IRS and ACS migration responses shows that IRS records may be useful in improving ACS migration measurement for respondents whose migration response is proxy, allocated, or imputed. Taking these results together, we discuss the potential application of our longitudinal IRS dataset to innovations in migration research on both the native-born and foreign-born populations of the United States.
View Full
Paper PDF
-
The Opportunities and Challenges of Linked IRS Administrative and Census Survey Records in the Study of Migration
July 2018
Working Paper Number:
carra-2018-06
This paper details efforts to link administrative records from the Internal Revenue Service (IRS) to American Community Survey (ACS) and 2010 Census microdata for the study of migration in the United States. Specifically, we (1) document our linkage strategy and methodology for inferring migration in IRS records; (2) model selection into and survival across IRS records to determine suitability for research applications; and (3) gauge the efficacy of the IRS records by demonstrating how they can be used to validate and potentially improve migration responses in ACS microdata. Our results show little evidence of selection or survival bias in the IRS records, suggesting broad generalizability to the nation as a whole. Moreover, we find that the combined IRS 1040, 1099, and W2 records may provide important information on populations that are hard to reach with traditional Census surveys. Finally, while preliminary, the results of our comparison of IRS and ACS migration responses shows that IRS records may be useful in improving ACS migration measurement for respondents whose migration response is proxy, allocated, or imputed. Taking these results together, we discuss the potential applications of our longitudinal IRS dataset to innovations in migration research in the United States.
View Full
Paper PDF
-
The Use of Administrative Records and the American Community Survey to Study the Characteristics of Undercounted Young Children in the 2010 Census
May 2018
Working Paper Number:
carra-2018-05
Children under age five are historically one of the most difficult segments of the population to enumerate in the U.S. decennial census. The persistent undercount of young children is highest among Hispanics and racial minorities. In this study, we link 2010 Census data to administrative records from government and third party data sources, such as Medicaid enrollment data and tenant rental assistance program records from the Department of Housing and Urban Development, to identify differences between children reported and not reported in the 2010 Census. In addition, we link children in administrative records to the American Community Survey to identify various characteristics of households with children under age five who may have been missed in the last census. This research contributes to what is known about the demographic, socioeconomic, and household characteristics of young children undercounted by the census. Our research also informs the potential benefits of using administrative records and surveys to supplement the U.S. Census Bureau child population enumeration efforts in future decennial censuses.
View Full
Paper PDF
-
Disclosure Limitation and Confidentiality Protection in Linked Data
January 2018
Working Paper Number:
CES-18-07
Confidentiality protection for linked administrative data is a combination of access modalities and statistical disclosure limitation. We review traditional statistical disclosure limitation methods and newer methods based on synthetic data, input noise infusion and formal privacy. We discuss how these methods are integrated with access modalities by providing three detailed examples. The first example is the linkages in the Health and Retirement Study to Social Security Administration data. The second example is the linkage of the Survey of Income and Program Participation to administrative data from the Internal Revenue Service and the Social Security Administration. The third example is the Longitudinal Employer-Household Dynamics data, which links state unemployment insurance records for workers and firms to a wide variety of censuses and surveys at the U.S. Census Bureau. For examples, we discuss access modalities, disclosure limitation methods, the effectiveness of those methods, and the resulting analytical validity. The final sections discuss recent advances in access modalities for linked administrative data.
View Full
Paper PDF
-
Just Passing Through: Characterizing U.S. Pass-Through Business Owners
January 2017
Working Paper Number:
CES-17-69
We investigate the use of administrative data on the owners of partnerships and S-corporations to develop new statistics that characterize business owners. Income from these types of entities is "passed through" to owners to be taxed on the owners' tax returns. The information returns associated with such pass-through entities (Form K1 records) make it possible to link individual owners to the businesses they own. These linkages can be leveraged to associate measures of the demographic and human capital characteristics of business owners with the characteristics of the businesses they own. This paper describes measurement issues associated with administrative records on these pass-through entities and their integration with other Census data products. In addition, we document a number of interesting trends in business ownership among pass-through entities. We show a substantial decline in both entry and exit with less churn among both owners and owned businesses. We also show that the owners of pass-through entities are older, more likely to be male, and more likely to be white compared to the working population.
View Full
Paper PDF
-
Effects of a Government-Academic Partnership: Has the NSF-Census Bureau Research Network Helped Improve the U.S. Statistical System?
January 2017
Authors:
Lars Vilhuber,
John M. Abowd,
Daniel Weinberg,
Jerome P. Reiter,
Matthew D. Shapiro,
Robert F. Belli,
Noel Cressie,
David C. Folch,
Scott H. Holan,
Margaret C. Levenstein,
Kristen M. Olson,
Jolene Smyth,
Leen-Kiat Soh,
Bruce D. Spencer,
Seth E. Spielman,
Christopher K. Wikle
Working Paper Number:
CES-17-59R
The National Science Foundation-Census Bureau Research Network (NCRN) was established in 2011 to create interdisciplinary research nodes on methodological questions of interest and significance to the broader research community and to the Federal Statistical System (FSS), particularly the Census Bureau. The activities to date have covered both fundamental and applied statistical research and have focused at least in part on the training of current and future generations of researchers in skills of relevance to surveys and alternative measurement of economic units, households, and persons. This paper discusses some of the key research findings of the eight nodes, organized into six topics: (1) Improving census and survey data collection methods; (2) Using alternative sources of data; (3) Protecting privacy and confidentiality by improving disclosure avoidance; (4) Using spatial and spatio-temporal statistical modeling to improve estimates; (5) Assessing data cost and quality tradeoffs; and (6) Combining information from multiple sources. It also reports on collaborations across nodes and with federal agencies, new software developed, and educational activities and outcomes. The paper concludes with an evaluation of the ability of the FSS to apply the NCRN's research outcomes and suggests some next steps, as well as the implications of this research-network model for future federal government renewal initiatives.
View Full
Paper PDF
-
A Comparison of Training Modules for Administrative Records Use in Nonresponse Followup Operations: The 2010 Census and the American Community Survey
January 2017
Working Paper Number:
CES-17-47
While modeling work in preparation for the 2020 Census has shown that administrative records can be predictive of Nonresponse Followup (NRFU) enumeration outcomes, there is scope to examine the robustness of the models by using more recent training data. The models deployed for workload removal from the 2015 and 2016 Census Tests were based on associations of the 2010 Census with administrative records. Training the same models with more recent data from the American Community Survey (ACS) can identify any changes in parameter associations over time that might reduce the accuracy of model predictions. Furthermore, more recent training data would allow for the
incorporation of new administrative record sources not available in 2010. However, differences in ACS methodology and the smaller sample size may limit its applicability. This paper replicates earlier results and examines model predictions based on the ACS in comparison with NRFU outcomes. The evaluation
consists of a comparison of predicted counts and household compositions with actual 2015 NRFU outcomes. The main findings are an overall validation of the methodology using independent data.
View Full
Paper PDF
-
Local Labor Demand and Program Participation Dynamics
November 2016
Working Paper Number:
carra-2016-10
Estimates the effect of fluctuations in local labor conditions on the likelihood that existing participants are able to transition out of the Supplemental Nutrition Assistance Program (SNAP). Our primary data are SNAP administrative records from New York (2007-2012) linked to the 2010 Census at the person-level. We further augment these data by linking to industry-specific labor market indicators at the county-level. We find that local labor markets matter for the length of time individuals spend on SNAP, but there is substantial heterogeneity in estimated effects across local industries. While employment growth in industries with small shares of SNAP participants has no impact on SNAP exits, growth in local industries with creases the likelihood that recipients exit the program. We also observe corresponding increases in entries when these industries experience localized contractions. Notably, estimated industry effects vary across race groups and parental status, with Black Alone non-Hispanic, Hispanic, and mothers benefiting the least from improvements in local labor market conditions.
View Full
Paper PDF
-
Playing with Matches: An Assessment of Accuracy in Linked Historical Data
June 2016
Working Paper Number:
carra-2016-05
This paper evaluates linkage quality achieved by various record linkage techniques used in historical demography. I create benchmark, or truth, data by linking the 2005 Current Population Survey Annual Social and Economic Supplement to the Social Security Administration's Numeric Identification System by Social Security Number. By comparing simulated linkages to the benchmark data, I examine the value added (in terms of number and quality of links) from incorporating text-string comparators, adjusting age, and using a probabilistic matching algorithm. I find that text-string comparators and probabilistic approaches are useful for increasing the linkage rate, but use of text-string comparators may decrease accuracy in some cases. Overall, probabilistic matching offers the best balance between linkage rates and accuracy.
View Full
Paper PDF