-
The Census Historical Environmental Impacts Frame
October 2024
Working Paper Number:
CES-24-66
The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
-
Gradient Boosting to Address Statistical Problems Arising from Non-Linkage of Census Bureau Datasets
June 2024
Working Paper Number:
CES-24-27
This article introduces the twangRDC package, which contains functions to address non-linkage in US Census Bureau datasets. The Census Bureau's Person Identification Validation System facilitates data linkage by assigning unique person identifiers to federal, third-party, decennial census, and survey data. Not all records in these datasets can be linked to the reference file, so not all records are assigned an identifier. This article is a tutorial on using twangRDC to generate nonresponse weights that account for non-linkage of person records across US Census Bureau datasets.
-
Small Business Pulse Survey Estimates by Owner Characteristics and Rural/Urban Designation
September 2021
Working Paper Number:
CES-21-24
In response to requests from policymakers for additional context for Small Business Pulse Survey (SBPS) measures of the impact of COVID-19 on small businesses, we researched developing estimates by owner characteristics and rural/urban locations. Leveraging geographic coding on the Business Register, we create estimates of the effect of the pandemic on small businesses by urban and rural designations. A more challenging exercise entails linking micro-level data from the SBPS with ownership data from the Annual Business Survey (ABS) to create estimates of the effect of the pandemic on small businesses by owner race, sex, ethnicity, and veteran status. Given important differences in survey design and concerns about nonresponse bias, we face significant challenges in producing estimates for owner demographics. We discuss our attempts to meet these challenges and the caution that must be used in interpreting the results. The estimates produced for this paper are available for download. Reflecting the Census Bureau's commitment to scientific inquiry and transparency, the micro data from the SBPS will be available to qualified researchers on approved projects in the Federal Statistical Research Data Center network.
-
Determination of the 2020 U.S. Citizen Voting Age Population (CVAP) Using Administrative Records and Statistical Methodology Technical Report
October 2020
Authors:
John M. Abowd,
J. David Brown,
Lawrence Warren,
Moises Yi,
Misty L. Heggeness,
William R. Bell,
Michael B. Hawes,
Andrew Keller,
Vincent T. Mule Jr.,
Joseph L. Schafer,
Matthew Spence
Working Paper Number:
CES-20-33
This report documents the efforts of the Census Bureau's Citizen Voting-Age Population (CVAP) Internal Expert Panel (IEP) and Technical Working Group (TWG) toward the use of multiple data sources to produce block-level statistics on the citizen voting-age population for use in enforcing the Voting Rights Act. It describes the administrative, survey, and census data sources used, and the four approaches developed for combining these data to produce CVAP estimates. It also discusses other aspects of the estimation process, including how records were linked across the multiple data sources, and the measures taken to protect the confidentiality of the data.
-
Using Linked Data to Investigate True Intergenerational Change: Three Generations Over Seven Decades
August 2018
Working Paper Number:
carra-2018-09
It is widely thought that immigrants and their families undergo profound cultural and socioeconomic changes as a consequence of coming into contact with U.S. society, but the way this occurs remains unclear and controversial due in large part to data limitations. In this paper, we provide proof of concept for analyses using linked data that allow us to compare outcomes across more 'exact' family generations. Specifically, we are able to follow immigrant parents and their children and grandchildren across seven decades using census and survey data from 1940 to 2014. We describe the data and linkage methodology, evaluate the representativeness of the linked sample, test a method for adjusting for biases that arise from non-representative linkages, and describe the size, diversity, and socioeconomic characteristics of the linked sample. We demonstrate that large sample sizes of linked data will likely permit us to compare several national origin groups across multiple generations.
-
Understanding the Quality of Alternative Citizenship Data Sources for the 2020 Census
August 2018
Working Paper Number:
CES-18-38R
This paper examines the quality of citizenship data in self-reported survey responses compared to administrative records and evaluates options for constructing an accurate count of resident U.S. citizens. Person-level discrepancies between survey-collected citizenship data and administrative records are more pervasive than previously reported in studies comparing survey and administrative data aggregates. Our results imply that survey-sourced citizenship data produce significantly lower estimates of the noncitizen share of the population than would be produced from currently available administrative records; both the survey-sourced and administrative data have shortcomings that could contribute to this difference. Our evidence is consistent with noncitizen respondents misreporting their own citizenship status and failing to report that of other household members. At the same time, currently available administrative records may miss some naturalizations and capture others with a delay. The evidence in this paper also suggests that adding a citizenship question to the 2020 Census would lead to lower self-response rates in households potentially containing noncitizens, resulting in higher fieldwork costs and a lower-quality population count.
-
The Opportunities and Challenges of Linked IRS Administrative and Census Survey Records in the Study of Migration
July 2018
Working Paper Number:
carra-2018-06
This paper details efforts to link administrative records from the Internal Revenue Service (IRS) to American Community Survey (ACS) and 2010 Census microdata for the study of migration in the United States. Specifically, we (1) document our linkage strategy and methodology for inferring migration in IRS records; (2) model selection into and survival across IRS records to determine suitability for research applications; and (3) gauge the efficacy of the IRS records by demonstrating how they can be used to validate and potentially improve migration responses in ACS microdata. Our results show little evidence of selection or survival bias in the IRS records, suggesting broad generalizability to the nation as a whole. Moreover, we find that the combined IRS 1040, 1099, and W2 records may provide important information on populations that are hard to reach with traditional Census surveys. Finally, while preliminary, the results of our comparison of IRS and ACS migration responses show that IRS records may be useful in improving ACS migration measurement for respondents whose migration response is proxy, allocated, or imputed. Taking these results together, we discuss the potential applications of our longitudinal IRS dataset to innovations in migration research in the United States.
-
The Use of Administrative Records and the American Community Survey to Study the Characteristics of Undercounted Young Children in the 2010 Census
May 2018
Working Paper Number:
carra-2018-05
Children under age five are historically one of the most difficult segments of the population to enumerate in the U.S. decennial census. The persistent undercount of young children is highest among Hispanics and racial minorities. In this study, we link 2010 Census data to administrative records from government and third-party data sources, such as Medicaid enrollment data and tenant rental assistance program records from the Department of Housing and Urban Development, to identify differences between children reported and not reported in the 2010 Census. In addition, we link children in administrative records to the American Community Survey to identify various characteristics of households with children under age five who may have been missed in the last census. This research contributes to what is known about the demographic, socioeconomic, and household characteristics of young children undercounted by the census. Our research also informs the potential benefits of using administrative records and surveys to supplement the U.S. Census Bureau's child population enumeration efforts in future decennial censuses.
-
Effects of a Government-Academic Partnership: Has the NSF-Census Bureau Research Network Helped Improve the U.S. Statistical System?
January 2017
Authors:
Lars Vilhuber,
John M. Abowd,
Daniel Weinberg,
Jerome P. Reiter,
Matthew D. Shapiro,
Robert F. Belli,
Noel Cressie,
David C. Folch,
Scott H. Holan,
Margaret C. Levenstein,
Kristen M. Olson,
Jolene Smyth,
Leen-Kiat Soh,
Bruce D. Spencer,
Seth E. Spielman,
Christopher K. Wikle
Working Paper Number:
CES-17-59R
The National Science Foundation-Census Bureau Research Network (NCRN) was established in 2011 to create interdisciplinary research nodes on methodological questions of interest and significance to the broader research community and to the Federal Statistical System (FSS), particularly the Census Bureau. The activities to date have covered both fundamental and applied statistical research and have focused at least in part on the training of current and future generations of researchers in skills of relevance to surveys and alternative measurement of economic units, households, and persons. This paper discusses some of the key research findings of the eight nodes, organized into six topics: (1) Improving census and survey data collection methods; (2) Using alternative sources of data; (3) Protecting privacy and confidentiality by improving disclosure avoidance; (4) Using spatial and spatio-temporal statistical modeling to improve estimates; (5) Assessing data cost and quality tradeoffs; and (6) Combining information from multiple sources. It also reports on collaborations across nodes and with federal agencies, new software developed, and educational activities and outcomes. The paper concludes with an evaluation of the ability of the FSS to apply the NCRN's research outcomes and suggests some next steps, as well as the implications of this research-network model for future federal government renewal initiatives.
-
Decennial Census Return Rates: The Role of Social Capital
January 2017
Working Paper Number:
CES-17-39
This paper explores how useful information about social and civic engagement (social capital) might be to the U.S. Census Bureau in its efforts to improve predictions of mail return rates for the Decennial Census (DC) at the census tract level. Through construction of Hard-to-count (HTC) scores and multivariate analysis, we find that if information about social capital were available, predictions of response rates would be marginally improved.