-
The Census Historical Environmental Impacts Frame
October 2024
Working Paper Number:
CES-24-66
The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
-
Expanding the Frontier of Economic Statistics Using Big Data: A Case Study of Regional Employment
July 2024
Working Paper Number:
CES-24-37
Big data offers potentially enormous benefits for improving economic measurement, but it also presents challenges (e.g., lack of representativeness and instability), implying that its value is not always clear. We propose a framework for quantifying the usefulness of these data sources for specific applications, relative to existing official sources. We specifically weigh the potential benefits of additional granularity and timeliness, while examining the accuracy associated with any new or improved estimates, relative to comparable accuracy produced in existing official statistics. We apply the methodology to employment estimates using data from a payroll processor, considering both the improvement of existing state-level estimates and the production of new, more timely, county-level estimates. We find that incorporating payroll data can improve existing state-level estimates by 11% based on out-of-sample mean absolute error, although the improvement is considerably higher for smaller state-industry cells. We also produce new county-level estimates that could provide more timely granular estimates than previously available. We develop a novel test to determine if these new county-level estimates have errors consistent with official series. Given the level of granularity, we cannot reject the hypothesis that the new county estimates have an accuracy in line with official measures, implying an expansion of the existing frontier. We demonstrate the practical importance of these experimental estimates by investigating a hypothetical application during the COVID-19 pandemic, a period in which more timely and granular information could have assisted in implementing effective policies. Relative to existing estimates, we find that the alternative payroll data series could help identify areas of the country where employment was lagging. Moreover, we also demonstrate the value of a more timely series.
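To make the headline metric concrete, the following is a minimal sketch of how a relative improvement in out-of-sample mean absolute error could be computed. It is not the authors' procedure: the function names and all numbers are hypothetical, and the abstract does not describe how the benchmark or the combined estimates are constructed.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between benchmark values and estimates."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(actual, baseline, candidate):
    """Share by which the candidate's MAE falls below the baseline's MAE."""
    base_mae = mean_absolute_error(actual, baseline)
    if base_mae == 0:
        raise ValueError("baseline MAE is zero; relative improvement is undefined")
    return (base_mae - mean_absolute_error(actual, candidate)) / base_mae


# Hypothetical held-out benchmark values and two competing estimate series.
benchmark = [100.0, 102.0, 98.0, 105.0]
official_only = [104.0, 99.0, 101.0, 100.0]   # assumed baseline estimates
with_payroll = [103.0, 100.0, 100.0, 102.0]   # assumed payroll-augmented estimates

improvement = relative_improvement(benchmark, official_only, with_payroll)
print(f"MAE reduction: {improvement:.1%}")
```

A positive value means the candidate series tracks the held-out benchmark more closely than the baseline does.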
-
Registered Report: Exploratory Analysis of Ownership Diversity and Innovation in the Annual Business Survey
March 2023
Working Paper Number:
CES-23-11
A lack of transparency in specification testing is a major contributor to the replicability crisis that has eroded the credibility of findings for informing policy. How diversity is associated with outcomes of interest is particularly susceptible to the production of nonreplicable findings given the very large number of alternative measures applied to several policy-relevant attributes such as race, ethnicity, gender, or foreign-born status. The very large number of alternative measures substantially increases the probability of false discovery where nominally significant parameter estimates, selected through numerous though unreported specification tests, may not be representative of true associations in the population. The purpose of this registered report is to: 1) select a single measure of ownership diversity that satisfies explicit, requisite axioms; 2) split the Annual Business Survey (ABS) into an exploratory sample (35%) used in this analysis and a confirmatory sample (65%) that will be accessed only after the publication of this report; 3) regress self-reported new-to-market innovation on the diversity measure along with industry and firm-size controls; 4) pass through those variables meeting precision and magnitude criteria for hypothesis testing using the confirmatory sample; and 5) document the full set of hypotheses to be tested in the final analysis along with a discussion of the false discovery and family-wise error rate corrections to be applied. The discussion concludes with the added value of implementing split sample designs within the Federal Statistical Research Data Center system where access to data is strictly controlled.
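The split-sample design and the multiple-testing corrections mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: the abstract does not say how records are assigned to the two samples or which corrections are used, so the seeded random split, the Benjamini-Hochberg procedure (a common false discovery rate control), and the Bonferroni rule (a common family-wise error rate control) are stand-ins, and the p-values are made up.

```python
import random


def split_sample(ids, exploratory_share=0.35, seed=0):
    """Randomly assign record ids to exploratory and confirmatory samples."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * exploratory_share)
    return shuffled[:cut], shuffled[cut:]


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a reject flag per hypothesis, controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            largest_rank = rank        # keep the largest rank passing its threshold
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        rejected[i] = rank <= largest_rank
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Return a reject flag per hypothesis, controlling the family-wise error rate."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


exploratory, confirmatory = split_sample(range(100))
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]  # hypothetical
print(len(exploratory), len(confirmatory))
print(sum(benjamini_hochberg(p_values)), sum(bonferroni(p_values)))
```

Family-wise control is the stricter of the two, so it rejects no more hypotheses than the false discovery rate procedure at the same alpha.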
-
Methodology on Creating the U.S. Linked Retail Health Clinic (LiRHC) Database
March 2023
Working Paper Number:
CES-23-10
Retail health clinics (RHCs) are a relatively new type of health care setting and understanding the role they play as a source of ambulatory care in the United States is important. To better understand these settings, a joint project by the Census Bureau and National Center for Health Statistics used data science techniques to link together data on RHCs from the Convenient Care Association, the County Business Patterns Business Register, and the National Plan and Provider Enumeration System to create the Linked RHC (LiRHC, pronounced 'lyric') database of locations throughout the United States during the years 2018 to 2020. The matching methodology used to perform this linkage is described, as well as the benchmarking, match statistics, and manual review and quality checks used to assess the resulting matched data. The large majority (81%) of matches received quality scores at or above 75/100, and most matches were linked in the first two (of eight) matching passes, indicating high confidence in the final linked dataset. The LiRHC database contained 2,000 RHCs; 97% of these clinics were in metropolitan statistical areas and 950 were in the South region of the United States. Through this collaborative effort, the Census Bureau and National Center for Health Statistics strive to understand how RHCs can potentially impact population health as well as the access and provision of health care services across the nation.
-
Using Small-Area Estimation (SAE) to Estimate Prevalence of Child Health Outcomes at the Census Regional-, State-, and County-Levels
November 2022
Working Paper Number:
CES-22-48
In this study, we implement small-area estimation to assess the prevalence of child health outcomes at the county, state, and regional levels, using national survey data.
-
Trade Liberalization and Labor-Market Outcomes: Evidence from US Matched Employer-Employee Data
September 2022
Working Paper Number:
CES-22-42
We use matched employer-employee data to examine outcomes among workers initially employed within and outside manufacturing after trade liberalization with China. We find that exposure to this shock operates predominantly through workers' counties (versus industries), that larger own industry and downstream exposure typically reduce relative earnings, and that greater upstream exposure often raises them. The latter is particularly important outside manufacturing: while we find substantial and persistent predicted declines in relative earnings among manufacturing workers, those outside manufacturing are generally predicted to experience relative earnings gains. Investigation of employment reactions indicates they account for a small share of the earnings effect.
-
Decomposing Aggregate Productivity
July 2022
Working Paper Number:
CES-22-25
In this note, we evaluate the sensitivity of commonly-used decompositions for aggregate productivity. Our analysis spans the universe of U.S. manufacturers from 1977 to 2012 and we find that, even holding the data and form of the production function fixed, results on aggregate productivity are extremely sensitive to how productivity at the firm level is measured. Even qualitative statements about the levels of aggregate productivity and the sign of the covariance between productivity and size are highly dependent on how production function parameters are estimated. Despite these difficulties, we uncover some consistent facts about productivity growth: (1) labor productivity is consistently higher and less error-prone than measures of multi-factor productivity; (2) most productivity growth comes from growth within firms, rather than from reallocation across firms; (3) what growth does come from reallocation appears to be driven by net entry, primarily from the exit of relatively less-productive firms.
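As an illustration of what such a decomposition looks like, the sketch below implements the Olley-Pakes static decomposition, which splits share-weighted aggregate productivity into an unweighted mean and a covariance-type term between size and productivity. Choosing this decomposition is an assumption: the abstract does not name the decompositions it evaluates, and the shares and productivity values are hypothetical.

```python
def olley_pakes(shares, productivity):
    """Split share-weighted aggregate productivity into mean and covariance terms.

    aggregate = unweighted mean productivity
                + sum((share_i - mean share) * (productivity_i - mean productivity))
    """
    n = len(shares)
    if n == 0 or n != len(productivity):
        raise ValueError("inputs must be non-empty and of equal length")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to one")
    mean_productivity = sum(productivity) / n
    mean_share = 1.0 / n
    covariance_term = sum(
        (s - mean_share) * (p - mean_productivity)
        for s, p in zip(shares, productivity)
    )
    aggregate = sum(s * p for s, p in zip(shares, productivity))
    return aggregate, mean_productivity, covariance_term


# Hypothetical firm size shares and firm-level productivity levels.
shares = [0.5, 0.3, 0.2]
productivity = [2.0, 1.0, 0.5]

aggregate, mean_productivity, covariance_term = olley_pakes(shares, productivity)
print(aggregate, mean_productivity, covariance_term)
```

A positive covariance term means larger firms tend to be more productive. Because firm-level productivity is itself an estimate, the sign of that term can change with how productivity is measured, which is the sensitivity the note describes.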
-
Redesigning the Longitudinal Business Database
May 2021
Working Paper Number:
CES-21-08
In this paper we describe the U.S. Census Bureau's redesign and production implementation of the Longitudinal Business Database (LBD) first introduced by Jarmin and Miranda (2002). The LBD is used to create the Business Dynamics Statistics (BDS), tabulations describing the entry, exit, expansion, and contraction of businesses. The new LBD and BDS also incorporate information formerly provided by the Statistics of U.S. Businesses program, which produced similar year-to-year measures of employment and establishment flows. We describe in detail how the LBD is created from curation of the input administrative data, longitudinal matching, retiming of economic census-year births and deaths, creation of vintage consistent industry codes and noise factors, and the creation and cleaning of each year of LBD data. This documentation is intended to facilitate the proper use and understanding of the data by both researchers with approved projects accessing the LBD microdata and those using the BDS tabulations.
-
Business Formation: A Tale of Two Recessions
January 2021
Working Paper Number:
CES-21-01
The trajectories of new business applications and transitions to employer businesses differ markedly between the Great Recession and the COVID-19 Recession. Both applications and transitions to employer startups decreased slowly but persistently in the post-Lehman crisis period of the Great Recession. In contrast, during the COVID-19 Recession new applications initially declined but have since sharply rebounded, resulting in a surge in applications during 2020. Projected transitions to employer businesses also rise, but this is dampened by a change in the composition of applications in 2020 towards applications that are more likely to be nonemployers.
-
Total Error and Variability Measures for the Quarterly Workforce Indicators and LEHD Origin Destination Employment Statistics in OnTheMap
September 2020
Working Paper Number:
CES-20-30
We report results from the first comprehensive total quality evaluation of five major indicators in the U.S. Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) Program Quarterly Workforce Indicators (QWI): total flow-employment, beginning-of-quarter employment, full-quarter employment, average monthly earnings of full-quarter employees, and total quarterly payroll. Beginning-of-quarter employment is also the main tabulation variable in the LEHD Origin-Destination Employment Statistics (LODES) workplace reports as displayed in OnTheMap (OTM), including OnTheMap for Emergency Management. We account for errors due to coverage; record-level nonresponse; edit and imputation of item missing data; and statistical disclosure limitation. The analysis reveals that the five publication variables under study are estimated very accurately for tabulations involving at least 10 jobs. Tabulations involving three to nine jobs are a transition zone, where cells may be fit for use with caution. Tabulations involving one or two jobs, which are generally suppressed on fitness-for-use criteria in the QWI and synthesized in LODES, have substantial total variability but can still be used to estimate statistics for untabulated aggregates as long as the job count in the aggregate is more than 10.
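The job-count thresholds in this abstract can be written as a simple lookup. The sketch below encodes only the three ranges stated above; the function name and the category labels are hypothetical and are not part of any LEHD or OnTheMap tool.

```python
def fitness_for_use(job_count):
    """Map a tabulation cell's job count to the range described in the abstract."""
    if job_count < 1:
        raise ValueError("job_count must be a positive integer")
    if job_count >= 10:
        return "estimated very accurately"
    if job_count >= 3:
        return "transition zone: use with caution"
    # One or two jobs: substantial total variability at the cell level.
    return "substantial variability: use only within larger aggregates"


for count in (1, 3, 9, 10, 250):
    print(count, fitness_for_use(count))
```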