-
Applying Current Core Based Statistical Area Standards to Historical Census Data, 1940-2020
January 2025
Working Paper Number:
CES-25-10
In the middle of the twentieth century, the Bureau of the Budget, in conjunction with the Census Bureau and other federal statistical agencies, introduced a widely used unit of statistical geography, the county-based Standard Metropolitan Area. Metropolitan definitions since then have been generally regarded as comparable, but methodological changes have resulted in comparability issues, particularly among the largest and most complex metro areas. With the 2000 census came an effort to simplify the rules for defining metro areas. This study attempts to gather all available historical geographic and commuting data to apply the current rules for defining metro areas to create comparable statistical geography covering the period from 1940 to 2020. The changes that accompanied the 2000 census also brought a new category, "Micropolitan Statistical Areas," which established a metro hierarchy. This research expands on this approach, using a more elaborate hierarchy based on the size of urban cores. The areas as delineated in this paper provide a consistent set of statistical geography that can be used in a wide variety of applications.
-
Incorporating Administrative Data in Survey Weights for the Basic Monthly Current Population Survey
January 2024
Working Paper Number:
CES-24-02
Response rates to the Current Population Survey (CPS) have declined over time, raising the potential for nonresponse bias in key population statistics. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we take two approaches. First, we use administrative data to build a non-parametric nonresponse adjustment step while leaving the calibration to population estimates unchanged. Second, we use administratively linked data in the calibration process, matching income data from the Internal Revenue Service and state agencies, demographic data from the Social Security Administration and the decennial census, and industry data from the Census Bureau's Business Register to both responding and nonresponding households. We use the matched data in the household nonresponse adjustment of the CPS weighting algorithm, which changes the weights of respondents to account for differential nonresponse rates among subpopulations.
After running the experimental weighting algorithm, we compare estimates of the unemployment rate and labor force participation rate between the experimental weights and the production weights. Before March 2020, estimates of the labor force participation rate using the experimental weights are 0.2 percentage points higher than the original estimates, with minimal effect on the unemployment rate. After March 2020, the new labor force participation rates are similar, but the unemployment rate is about 0.2 percentage points higher in some months during the height of COVID-related interviewing restrictions. These results suggest that if any nonresponse bias is present in the CPS, its magnitude is comparable to the typical margin of error of the unemployment rate estimate. Additionally, the results are broadly similar across demographic groups and states, and under alternative weighting methodologies. Finally, we discuss how our estimates compare to those from earlier papers that calculate estimates of bias in key CPS labor force statistics.
This paper is for research purposes only. No changes to production are being implemented at this time.
-
Methodology on Creating the U.S. Linked Retail Health Clinic (LiRHC) Database
March 2023
Working Paper Number:
CES-23-10
Retail health clinics (RHCs) are a relatively new type of health care setting, and understanding the role they play as a source of ambulatory care in the United States is important. To better understand these settings, a joint project by the Census Bureau and the National Center for Health Statistics used data science techniques to link data on RHCs from the Convenient Care Association, the County Business Patterns Business Register, and the National Plan and Provider Enumeration System to create the Linked RHC (LiRHC, pronounced 'lyric') database of locations throughout the United States during the years 2018 to 2020. The matching methodology used to perform this linkage is described, as well as the benchmarking, match statistics, and manual review and quality checks used to assess the resulting matched data. The large majority (81%) of matches received quality scores at or above 75/100, and most matches were linked in the first two (of eight) matching passes, indicating high confidence in the final linked dataset. The LiRHC database contained 2,000 RHCs; 97% of these clinics were in metropolitan statistical areas and 950 were in the South region of the United States. Through this collaborative effort, the Census Bureau and the National Center for Health Statistics strive to understand how RHCs can potentially impact population health as well as the access to and provision of health care services across the nation.
-
Some Open Questions on Multiple-Source Extensions of Adaptive-Survey Design Concepts and Methods
February 2023
Working Paper Number:
CES-23-03
Adaptive survey design is a framework for making data-driven decisions about survey data collection operations. This paper discusses open questions related to the extension of adaptive principles and capabilities when capturing data from multiple data sources. Here, the concept of 'design' encompasses the focused allocation of resources required for the production of high-quality statistical information in a sustainable and cost-effective way. This conceptual framework leads to a discussion of six groups of issues including: (i) the goals for improvement through adaptation; (ii) the design features that are available for adaptation; (iii) the auxiliary data that may be available for informing adaptation; (iv) the decision rules that could guide adaptation; (v) the necessary systems to operationalize adaptation; and (vi) the quality, cost, and risk profiles of the proposed adaptations (and how to evaluate them). A multiple data source environment creates significant opportunities, but also introduces complexities that are a challenge in the production of high-quality statistical information.
-
LEHD Snapshot Documentation, Release S2021_R2022Q4
November 2022
Working Paper Number:
CES-22-51
The Longitudinal Employer-Household Dynamics (LEHD) data at the U.S. Census Bureau is a quarterly database of linked employer-employee data covering over 95% of employment in the United States. These data are used to produce a number of public-use tabulations and tools, including the Quarterly Workforce Indicators (QWI), LEHD Origin-Destination Employment Statistics (LODES), Job-to-Job Flows (J2J), and Post-Secondary Employment Outcomes (PSEO) data products. Researchers on approved projects may also access the underlying LEHD microdata directly, in the form of the LEHD Snapshot restricted-use data product. This document provides a detailed overview of the LEHD Snapshot as of release S2021_R2022Q4, including user guidance, variable codebooks, and an overview of the approvals needed to obtain access. Updates to the documentation for this and future snapshot releases will be made available in HTML format on the LEHD website.
-
Using Small-Area Estimation (SAE) to Estimate Prevalence of Child Health Outcomes at the Census Regional-, State-, and County-Levels
November 2022
Working Paper Number:
CES-22-48
In this study, we implement small-area estimation to assess the prevalence of child health outcomes at the county, state, and regional levels, using national survey data.
-
Multinational Firms in the U.S. Economy: Insights from Newly Integrated Microdata
September 2022
Working Paper Number:
CES-22-39
This paper describes the construction of two confidential crosswalk files enabling a comprehensive identification of multinational firms in the U.S. economy. The effort combines firm-level surveys on direct investment conducted by the U.S. Bureau of Economic Analysis (BEA) and the U.S. Census Bureau's Business Register (BR) spanning the universe of employer businesses from 1997 to 2017. First, the parent crosswalk links BEA firm-level surveys on U.S. direct investment abroad and the BR. Second, the affiliate crosswalk links BEA firm-level surveys on foreign direct investment in the United States and the BR. Using these newly available links, we distinguish between U.S.- and foreign-owned multinational firms and describe their prevalence and economic activities in the national economy, by sector, and by geography.
-
Improving Patent Assignee-Firm Bridge with Web Search Results
August 2022
Working Paper Number:
CES-22-31
This paper constructs a longitudinal bridge between U.S. patent assignees and firms using firm-level administrative data from the U.S. Census Bureau. We match granted patents applied for between 1976 and 2016 to the U.S. firms recorded in the Census Bureau's Longitudinal Business Database (LBD). Building on existing algorithms in the literature, we first use the assignee name, address (state and city), and year information to link the two datasets. We then introduce a novel search-aided algorithm that significantly improves the matching results by 7% and 2.9% at the patent and the assignee level, respectively. Overall, we are able to match 88.2% and 80.1% of all U.S. patents and assignees, respectively. We contribute to the existing literature by 1) improving the match rates and quality with the web search-aided algorithm, and 2) providing the longest and longitudinally consistent crosswalk between patent assignees and LBD firms.
-
Developing Content for the Management and Organizational Practices Survey-Hospitals (MOPS-HP)
September 2021
Working Paper Number:
CES-21-25
Nationally representative data on U.S. hospital management practices do not exist, even though such practices have been shown to be related to both clinical and financial performance using past data collected in the World Management Survey (WMS). This paper describes the U.S. Census Bureau's development of content for the Management and Organizational Practices Survey-Hospitals (MOPS-HP) that is similar to data collected in the MOPS conducted for the manufacturing sector in 2010 and 2015 and the 2009 WMS. Findings from cognitive testing interviews with 18 chief nursing officers and 13 chief financial officers at 30 different hospitals across 7 states and the District of Columbia led to using industry-tested terminology, to confirming chief nursing officers as MOPS-HP respondents and their ability to provide recall data, and to eliminating questions that tested poorly. Hospital data collected in the MOPS-HP would be the first nationally representative data on management practices with queries on clinical key performance indicators, financial and hospital-wide patient care goals, addressing patient care problems, clinical team interactions and staffing, standardized clinical protocols, and incentives for medical record documentation. The MOPS-HP's purpose is not to collect COVID-19 pandemic information; however, data measuring hospital management practices prior to and during the COVID-19 pandemic are a byproduct of the survey's one-year recall period (2019 and 2020).
-
Redesigning the Longitudinal Business Database
May 2021
Working Paper Number:
CES-21-08
In this paper we describe the U.S. Census Bureau's redesign and production implementation of the Longitudinal Business Database (LBD) first introduced by Jarmin and Miranda (2002). The LBD is used to create the Business Dynamics Statistics (BDS), tabulations describing the entry, exit, expansion, and contraction of businesses. The new LBD and BDS also incorporate information formerly provided by the Statistics of U.S. Businesses program, which produced similar year-to-year measures of employment and establishment flows. We describe in detail how the LBD is created from curation of the input administrative data, longitudinal matching, retiming of economic census-year births and deaths, creation of vintage consistent industry codes and noise factors, and the creation and cleaning of each year of LBD data. This documentation is intended to facilitate the proper use and understanding of the data by both researchers with approved projects accessing the LBD microdata and those using the BDS tabulations.