CREAT - Census Bureau

Matching Compustat Data to the Longitudinal Business Database, 1976-2020

September 2025

Written by: Cristina Tello-Trillo, Lawrence Schmidt, Sean Streiff

Working Paper Number:

CES-25-65

Abstract

This paper details the methodology for creating an updated Compustat-Longitudinal Business Database (LBD) bridge, facilitating linkage between company identifiers in Compustat and firm identifiers in the LBD. In addition to data from Compustat, we incorporate historical data on public companies from various public and private sources, including information on executive names. Our methodology involves a series of stages using fuzzy name and address matching, along with EIN, telephone number, and industry code matching. Qualified researchers with approved proposals can access this bridge through the Federal Statistical Research Data Centers. The Compustat-LBD bridge serves as a crucial resource for longitudinal studies on U.S. businesses, corporate governance, and executive compensation.
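The staged approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record fields (`ein`, `name`, `state`, `firm_id`), the suffix list, the 0.85 threshold, and the use of Python's `difflib` for name similarity are all assumptions chosen for illustration. The sketch tries an exact EIN match first and falls back to fuzzy name matching within the same state.

```python
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "co", "ltd", "llc", "company",
            "corporation", "incorporated"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, drop legal suffixes."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " "
                      for ch in name.lower())
    return " ".join(t for t in cleaned.split() if t not in SUFFIXES)


def name_similarity(a, b):
    """Similarity in [0, 1] between two normalized company names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def staged_match(company, firms, threshold=0.85):
    """Return (firm_id, stage) for the first stage that yields a match."""
    # Stage 1: exact EIN match, when the company record carries an EIN.
    if company.get("ein"):
        for firm in firms:
            if firm.get("ein") == company["ein"]:
                return firm["firm_id"], "ein"
    # Stage 2: best fuzzy name match among firms in the same state.
    best_id, best_score = None, 0.0
    for firm in firms:
        if firm.get("state") != company.get("state"):
            continue
        score = name_similarity(company["name"], firm["name"])
        if score > best_score:
            best_id, best_score = firm["firm_id"], score
    if best_score >= threshold:
        return best_id, "fuzzy_name"
    return None, None
```

Recording the stage alongside the matched identifier lets a user of such a bridge restrict analysis to matches from the stricter stages. Further stages (address, telephone number, industry code) would follow the same pattern.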

Document Tags and Keywords

Keywords:

information census, enterprise, database, census data, company, disclosure, corporation, executive, employee, corporate, merger, subsidiary, proprietor, consolidated, incorporated, department, record, census bureau, identifier, firm data

Tags:

Internal Revenue Service, Service Annual Survey, Securities and Exchange Commission, Office of Management and Budget, Company Organization Survey, Longitudinal Business Database, Center for Research in Security Prices, Michigan Institute for Teaching and Research in Economics, Employer Identification Numbers, North American Industry Classification System, Longitudinal Employer Household Dynamics, Business Register, Protected Identification Key, Census Bureau Disclosure Review Board, Person Validation System, Federal Statistical Research Data Center, Annual Survey of Entrepreneurs, Research and Development

Similar Working Papers

The 10 most similar working papers to the working paper 'Matching Compustat Data to the Longitudinal Business Database, 1976-2020' are listed below in order of similarity.

Working Paper

Documenting the Business Register and Related Economic Business Data

March 2016

Authors: Shawn Klimek, Frank Limehouse, Bethany DeSalvo

Working Paper Number:

CES-16-17

The Business Register (BR) is a comprehensive database of business establishments in the United States and provides resources for the U.S. Census Bureau's economic programs for sample selection, research, and survey operations. It is maintained using information from several federal agencies including the Census Bureau, Internal Revenue Service, Bureau of Labor Statistics, and the Social Security Administration. This paper provides a detailed description of the sources and functions of the BR. An overview of the BR as a linking tool and bridge to other Census Bureau data for additional business characteristics is also given.
Working Paper

Longitudinal Establishment And Enterprise Microdata (LEEM) Documentation

May 1998

Authors: Zoltan J Acs, Catherine Armington

Working Paper Number:

CES-98-09

This paper introduces and documents the new Longitudinal Enterprise and Establishment Microdata (LEEM) database, which has been constructed by Census' Economic Planning and Coordination Division under contract to the Office of Advocacy of the U.S. Small Business Administration. The LEEM links three years (1990, 1994, and 1995) of basic data for each private sector establishment with payroll in any of those years, along with data on the firm to which the establishment belongs each year. The LEEM data will facilitate both broader and more detailed analysis of patterns of job creation and destruction in the U.S., as well as research on the structure and dynamics of U.S. businesses. This paper provides documentation of the construction of LEEM data, summary data on most variables in the database, comparisons of the annual data with that of the nearly identical County Business Patterns, and distributions of establishments and their employment by the size of their firms. This is followed by a simple analysis of changes over time in the attributes of surviving establishments, and a brief discussion of turnover (business births and deaths) in the population and gross changes in employment associated with both establishment turnover and with surviving establishments. It concludes with a summary of the strengths and weaknesses of the LEEM.
Working Paper

A Guide to the MEPS-IC Government List Sample Microdata

September 2011

Authors: Alice Zawacki

Working Paper Number:

CES-11-27

The Medical Expenditure Panel Survey-Insurance Component (MEPS-IC) is conducted to provide nationally representative estimates on employer-sponsored health insurance. MEPS-IC data are collected from private sector employers, as well as state and local governments. While similar information is gathered from these two sectors, differences in the survey process exist. The goal of this paper is to provide details on the public sector, including types of state and local government employers, sample design, general information on the data collected in the MEPS-IC, and additional sources of information.
Working Paper

The Industry R&D Survey: Patent Database Link Project

November 2006

Authors: Shihe Fu, William Kerr

Working Paper Number:

CES-06-28

This paper details the construction of a firm-year panel dataset combining the NBER Patent Dataset with the Industry R&D Survey conducted by the Census Bureau and National Science Foundation. The developed platform offers an unprecedented view of the R&D-to-patenting innovation process and a close analysis of the strengths and limitations of the Industry R&D Survey. The files are linked through a name-matching algorithm customized for uniting the firm names to which patents are assigned with the firm names in Census Bureau's SSEL business registry. Through the Census Bureau's file structure, this R&D platform can be linked to the operating performances of each firm's establishments, further facilitating innovation-to-productivity studies.
Working Paper

Methodology on Creating the U.S. Linked Retail Health Clinic (LiRHC) Database

March 2023

Authors: Alice Zawacki, Joey Marshall, Donald Cherry, Xianghua Yin, Brian W. Ward

Working Paper Number:

CES-23-10

Retail health clinics (RHCs) are a relatively new type of health care setting, and understanding the role they play as a source of ambulatory care in the United States is important. To better understand these settings, a joint project by the Census Bureau and National Center for Health Statistics used data science techniques to link together data on RHCs from Convenient Care Association, County Business Patterns Business Register, and National Plan and Provider Enumeration System to create the Linked RHC (LiRHC, pronounced 'lyric') database of locations throughout the United States during the years 2018 to 2020. The matching methodology used to perform this linkage is described, as well as the benchmarking, match statistics, and manual review and quality checks used to assess the resulting matched data. The large majority (81%) of matches received quality scores at or above 75/100, and most matches were linked in the first two (of eight) matching passes, indicating high confidence in the final linked dataset. The LiRHC database contained 2,000 RHCs; 97% of these clinics were in metropolitan statistical areas and 950 were in the South region of the United States. Through this collaborative effort, the Census Bureau and National Center for Health Statistics strive to understand how RHCs can potentially impact population health as well as the access and provision of health care services across the nation.
Working Paper

Identifying U.S. Merchandise Traders: Integrating Customs Transactions with Business Administrative Data

September 2020

Authors: Fariha Kamal, Wei Ouyang

Working Paper Number:

CES-20-28

This paper describes the construction of the Longitudinal Firm Trade Transactions Database (LFTTD) enabling the identification of merchandise traders - exporters and importers - in the U.S. Census Bureau's Business Register (BR). The LFTTD links merchandise export and import transactions from customs declaration forms to the BR beginning in 1992 through the present. We employ a combination of deterministic and probabilistic matching algorithms to assign a unique firm identifier in the BR to a merchandise export or import transaction record. On average, we match 89 percent of export and import values to a firm identifier. In 1992, we match 79 (88) percent of export (import) value; in 2017, we match 92 (96) percent of export (import) value. Trade transactions in year t are matched to years between 1976 and t+1 of the BR. On average, 94 percent of the trade value matches to a firm in year t of the BR. The LFTTD provides the most comprehensive identification of and the foundation for the analysis of goods trading firms in the U.S. economy.
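The match rates quoted in this abstract are shares of trade value, not shares of transaction counts. A minimal sketch of that distinction follows, assuming a hypothetical record layout in which each transaction has a `value` and a `firm_id` that is `None` when no firm was assigned; neither name comes from the paper.

```python
def value_match_rate(transactions):
    """Share of total transaction value that carries a firm identifier."""
    total = sum(t["value"] for t in transactions)
    if total == 0:
        return 0.0
    matched = sum(t["value"] for t in transactions
                  if t["firm_id"] is not None)
    return matched / total


def count_match_rate(transactions):
    """Share of transaction records that carry a firm identifier."""
    if not transactions:
        return 0.0
    matched = sum(1 for t in transactions if t["firm_id"] is not None)
    return matched / len(transactions)
```

When unmatched records are mostly small shipments, the value-based rate will exceed the count-based rate, so the two measures should not be compared directly.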
Working Paper

NEW DATA FOR DYNAMIC ANALYSIS: THE LONGITUDINAL ESTABLISHMENT AND ENTERPRISE MICRODATA (LEEM) FILE

December 1999

Authors: Alicia Robb

Working Paper Number:

CES-99-18

Until now, research on U.S. business activities over time has been hindered by the lack of accurate and comprehensive longitudinal data. The new Longitudinal Establishment and Enterprise Microdata (LEEM) are tremendously rich data that open up numerous possibilities for dynamic analyses of businesses in the U.S. economy. It is the first nationwide high-quality longitudinal database that covers the majority of employer businesses from all sectors of the economy. Due to the confidential nature of these data, the file is located at the Center for Economic Studies in the U.S. Bureau of the Census. To access the data, researchers must submit an acceptable proposal to CES and become sworn Census researchers. This paper describes the LEEM file, the variables contained on the file, and current uses of the data.
Working Paper

THE MANUFACTURING PLANT OWNERSHIP CHANGE DATABASE: ITS CONSTRUCTION AND USEFULNESS

September 1998

Authors: Sang V Nguyen

Working Paper Number:

CES-98-16

The Center for Economic Studies, U.S. Bureau of the Census, has constructed the "Manufacturing Plant Ownership Change Database" (OCD) using plant-level data taken from the Census Bureau's Longitudinal Research Database (LRD). The OCD contains data on all manufacturing establishments that have experienced ownership change at least once during the period 1963-1992. This is a unique data set which, together with the LRD, can be used to conduct a variety of economic studies that were not possible before. This paper describes how the OCD was constructed and discusses the usefulness of these data for economic research.
Working Paper

New Uses of Health and Pension Information

January 2002

Authors: Julia I. Lane

Working Paper Number:

tp-2002-03

Working Paper

The Characteristics of Business Owners Database, 1992

May 1999

Authors: Brian Headd

Working Paper Number:

CES-99-08

This report describes the Characteristics of Business Owners (CBO), 1992 microdata available to researchers at the Center for Economic Studies and the CBO survey. The Bureau of the Census has conducted the 1982, 1987, and 1992 CBOs for the U.S. Small Business Administration, the Minority Business Development Agency, and the general public. For the 1992 CBO, there were three surveys: a sole proprietor survey, an owner survey for each owner in partnerships and S corporations, and a firm survey for each partnership and S corporation. For database purposes, the owner questions on the sole proprietors survey and owner survey were merged, and the firm questions on the sole proprietors survey and firm survey were merged. The owner database has 116,589 records, and the firm survey has 78,147 records. The CBO reports on owners' backgrounds, such as owner type (race and ethnicity), age, education, work experience, veteran status, etc. The CBO reports on firms (with and without employees) about their economic details such as industry, financing, home-based, exporting, franchising, profits, etc. In addition, the CBO was conducted in 1996 on firms in existence in 1992, allowing for some survivability analysis. The CBO oversamples women and minority owners to allow researchers to more reliably study these owners. This survey is an extension of the Survey of Minority-Owned Business Enterprises (SMOBE) and Survey of Women-Owned Businesses (WOB) within the economic census. The CBO is available as a report, special tabulations, or microdata for approved researchers.
