-
Matching Compustat Data to the Longitudinal Business Database, 1976-2020
September 2025
Working Paper Number:
CES-25-65
This paper details the methodology for creating an updated Compustat-Longitudinal Business Database (LBD) bridge, facilitating linkage between company identifiers in Compustat and firm identifiers in the LBD. In addition to data from Compustat, we incorporate historical data on public companies from various public and private sources, including information on executive names. Our methodology involves a series of stages using fuzzy name and address matching, including EIN, telephone number, and industry code matching. Qualified researchers with approved proposals can access this bridge through the Federal Statistical Research Data Centers. The Compustat-LBD bridge serves as a crucial resource for longitudinal studies on U.S. businesses, corporate governance, and executive compensation.
-
Investments under Risk: Evidence from Hurricane Strikes
June 2025
Working Paper Number:
CES-25-43
We demonstrate that firms with plants in areas subject to a significant hurricane strike reduce their capital expenditures at the hurricane-affected plants and shift capital expenditures to plants in non-hurricane-affected areas. This effect is not present prior to 1997 and only appears from 1997 on. Our evidence is consistent with the possibility that a significant climate event such as the signing of the Kyoto Protocol raised the salience of the perceived risk from actual hurricane strikes and shifted firm behavior.
-
Potential Bias When Using Administrative Data to Measure the Family Income of School-Aged Children
January 2025
Working Paper Number:
CES-25-03
Researchers and practitioners increasingly rely on administrative data sources to measure family income. However, administrative data sources are often incomplete in their coverage of the population, giving rise to potential bias in family income measures, particularly if coverage deficiencies are not well understood. We focus on the school-aged child population, due to its particular import to research and policy, and because of the unique challenges of linking children to family income information. We find that two of the most significant administrative sources of family income information that permit linking of children and parents (IRS Form 1040 and SNAP participation records) usefully complement each other, potentially reducing coverage bias when used together. In a case study considering how best to measure economic disadvantage rates in the public school student population, we demonstrate the sensitivity of family income statistics to assumptions about individuals who do not appear in administrative data sources.
-
The Privacy-Protected Gridded Environmental Impacts Frame
December 2024
Working Paper Number:
CES-24-74
This paper introduces the Gridded Environmental Impacts Frame (Gridded EIF), a novel privacy-protected dataset derived from the U.S. Census Bureau's confidential Environmental Impacts Frame (EIF) microdata infrastructure. The EIF combines comprehensive administrative records and survey data on the U.S. population with high-resolution geospatial information on environmental hazards. While access to the EIF is restricted due to the confidential nature of the underlying data, the Gridded EIF offers a broader research community the opportunity to glean insights from the data while preserving confidentiality. We describe the data and privacy protection process, and offer guidance on appropriate usage, presenting practical applications.
-
Financing, Ownership, and Performance: A Novel, Longitudinal Firm-Level Database
December 2024
Working Paper Number:
CES-24-73
The Census Bureau's Longitudinal Business Database (LBD) underpins many studies of firm-level behavior. It tracks longitudinally all employers in the nonfarm private sector but lacks information about business financing and owner characteristics. We address this shortcoming by linking LBD observations to firm-level data drawn from several large Census Bureau surveys. The resulting Longitudinal Employer, Owner, and Financing (LEOF) database contains more than 3 million observations at the firm-year level with information about start-up financing, current financing, owner demographics, ownership structure, profitability, and owner aspirations, all linked to annual firm-level employment data since the firm hired its first employee. Using the LEOF database, we document trends in owner demographics and financing patterns and investigate how these business characteristics relate to firm-level employment outcomes.
-
Empirical Distribution of the Plant-Level Components of Energy and Carbon Intensity at the Six-digit NAICS Level Using a Modified KAYA Identity
September 2024
Working Paper Number:
CES-24-46
Three basic pillars of industry-level decarbonization are energy efficiency, decarbonization of energy sources, and electrification. This paper provides estimates of a decomposition of these three components of carbon emissions by industry: energy intensity, carbon intensity of energy, and energy (fuel) mix. These estimates are constructed at the six-digit NAICS level from non-public, plant-level data collected by the Census Bureau. Four quintiles of the distribution of each of the three components are constructed, using multiple imputation (MI) to deal with non-reported energy variables in the Census data. MI allows the estimates to avoid non-reporting bias. MI also allows more six-digit NAICS to be estimated under Census non-disclosure rules, since dropping non-reported observations may have reduced the sample sizes unnecessarily. The estimates show wide variation in each of these three components of emissions (intensity) and provide a first empirical look into the plant-level variation that underlies carbon emissions.
-
Where Are Your Parents? Exploring Potential Bias in Administrative Records on Children
March 2024
Working Paper Number:
CES-24-18
This paper examines potential bias in the Census Household Composition Key's (CHCK) probabilistic parent-child linkages. By linking CHCK data to the American Community Survey (ACS), we reveal disparities in parent-child linkages among specific demographic groups and find that characteristics of children that can and cannot be linked to the CHCK vary considerably from the larger population. In particular, we find that children from low-income, less educated households and of Hispanic origin are less likely to be linked to a mother or a father in the CHCK. We also highlight some data considerations when using the CHCK.
-
Connected and Uncooperative: The Effects of Homogenous and Exclusive Social Networks on Survey Response Rates and Nonresponse Bias
January 2024
Working Paper Number:
CES-24-01
Social capital, the strength of people's friendship networks and community ties, has been hypothesized as an important determinant of survey participation. Investigating this hypothesis has been difficult given data constraints. In this paper, we provide insights by investigating how response rates and nonresponse bias in the American Community Survey are correlated with county-level social network data from Facebook. We find that areas of the United States where people have more exclusive and homogenous social networks have higher nonresponse bias and lower response rates. These results provide further evidence that the effects of social capital may not be simply a matter of whether people are socially isolated or not, but also what types of social connections people have and the sociodemographic heterogeneity of their social networks.
-
An In-Depth Examination of Requirements for Disclosure Risk Assessment
October 2023
Authors:
Ron Jarmin,
John M. Abowd,
Ian M. Schmutte,
Jerome P. Reiter,
Nathan Goldschlag,
Victoria A. Velkoff,
Michael B. Hawes,
Robert Ashmead,
Ryan Cumings-Menon,
Sallie Ann Keller,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Pavel Zhuravlev
Working Paper Number:
CES-23-49
The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should be used to compare methodologies to identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near-term, the counterfactual approach appears best-suited for privacy-utility analysis.
-
Patents, Innovation, and Market Entry
September 2023
Working Paper Number:
CES-23-45
Do patents facilitate market entry and job creation? Using a 2014 Supreme Court decision that limited patent eligibility and natural language processing methods to identify invalid patents, I find that large treated firms reduce job creation and create fewer new establishments in response, with no effect on new firm entry. Moreover, companies shift toward innovation aimed at improving existing products, consistent with the view that patents incentivize creative destruction.