This paper details the methodology for creating an updated Compustat-Longitudinal Business Database (LBD) bridge, facilitating linkage between company identifiers in Compustat and firm identifiers in the LBD. In addition to data from Compustat, we incorporate historical data on public companies from various public and private sources, including information on executive names. Our methodology involves a series of stages using fuzzy name and address matching, including EIN, telephone number, and industry code matching. Qualified researchers with approved proposals can access this bridge through the Federal Statistical Research Data Centers. The Compustat-LBD bridge serves as a crucial resource for longitudinal studies on U.S. businesses, corporate governance, and executive compensation.
-
Documenting the Business Register and Related Economic Business Data
March 2016
Working Paper Number:
CES-16-17
The Business Register (BR) is a comprehensive database of business establishments in the United States and provides resources for the U.S. Census Bureau's economic programs for sample selection, research, and survey operations. It is maintained using information from several federal agencies including the Census Bureau, Internal Revenue Service, Bureau of Labor Statistics, and the Social Security Administration. This paper provides a detailed description of the sources and functions of the BR. An overview of the BR as a linking tool and bridge to other Census Bureau data for additional business characteristics is also given.
-
Longitudinal Establishment And Enterprise Microdata (LEEM) Documentation
May 1998
Working Paper Number:
CES-98-09
This paper introduces and documents the new Longitudinal Enterprise and Establishment Microdata (LEEM) database, which has been constructed by Census' Economic Planning and Coordination Division under contract to the Office of Advocacy of the U.S. Small Business Administration. The LEEM links three years (1990, 1994, and 1995) of basic data for each private sector establishment with payroll in any of those years, along with data on the firm to which the establishment belongs each year. The LEEM data will facilitate both broader and more detailed analysis of patterns of job creation and destruction in the U.S., as well as research on the structure and dynamics of U.S. businesses. This paper provides documentation of the construction of LEEM data, summary data on most variables in the database, comparisons of the annual data with that of the nearly identical County Business Patterns, and distributions of establishments and their employment by the size of their firms. This is followed by a simple analysis of changes over time in the attributes of surviving establishments, and a brief discussion of turnover (business births and deaths) in the population and gross changes in employment associated with both establishment turnover and with surviving establishments. It concludes with a summary of the strengths and weaknesses of the LEEM.
-
An Analysis of Key Differences in Micro Data: Results from the Business List Comparison Project
September 2008
Working Paper Number:
CES-08-28
The Bureau of Labor Statistics and the Bureau of the Census each maintain a business register, a universe of all U.S. business establishments and their characteristics, created from independent sources. Both registers serve critical functions such as supplying aggregate data inputs for certain national statistics generated by the Bureau of Economic Analysis. This paper examines key micro-level differences across these two business registers.
-
Methodology on Creating the U.S. Linked Retail Health Clinic (LiRHC) Database
March 2023
Working Paper Number:
CES-23-10
Retail health clinics (RHCs) are a relatively new type of health care setting, and understanding the role they play as a source of ambulatory care in the United States is important. To better understand these settings, a joint project by the Census Bureau and the National Center for Health Statistics used data science techniques to link data on RHCs from the Convenient Care Association, County Business Patterns Business Register, and National Plan and Provider Enumeration System to create the Linked RHC (LiRHC, pronounced 'lyric') database of locations throughout the United States during the years 2018 to 2020. The matching methodology used to perform this linkage is described, as well as the benchmarking, match statistics, and manual review and quality checks used to assess the resulting matched data. The large majority (81%) of matches received quality scores at or above 75/100, and most matches were linked in the first two (of eight) matching passes, indicating high confidence in the final linked dataset. The LiRHC database contained 2,000 RHCs; 97% of these clinics were in metropolitan statistical areas and 950 were in the South region of the United States. Through this collaborative effort, the Census Bureau and the National Center for Health Statistics strive to understand how RHCs can potentially impact population health as well as the access and provision of health care services across the nation.
-
NEW DATA FOR DYNAMIC ANALYSIS: THE LONGITUDINAL ESTABLISHMENT AND ENTERPRISE MICRODATA (LEEM) FILE
December 1999
Working Paper Number:
CES-99-18
Until now, research on U.S. business activities over time has been hindered by the lack of accurate and comprehensive longitudinal data. The new Longitudinal Establishment and Enterprise Microdata (LEEM) are tremendously rich data that open up numerous possibilities for dynamic analyses of businesses in the U.S. economy. It is the first nationwide high-quality longitudinal database that covers the majority of employer businesses from all sectors of the economy. Due to the confidential nature of these data, the file is located at the Center for Economic Studies in the U.S. Bureau of the Census. To access the data, researchers must submit an acceptable proposal to CES and become sworn Census researchers. This paper describes the LEEM file, the variables contained on the file, and current uses of the data.
-
A Guide to the MEPS-IC Government List Sample Microdata
September 2011
Working Paper Number:
CES-11-27
The Medical Expenditure Panel Survey-Insurance Component (MEPS-IC) is conducted to provide nationally representative estimates on employer-sponsored health insurance. MEPS-IC data are collected from private sector employers, as well as state and local governments. While similar information is gathered from these two sectors, differences in the survey process exist. The goal of this paper is to provide details on the public sector, including types of state and local government employers, sample design, general information on the data collected in the MEPS-IC, and additional sources of information.
-
The Industry R&D Survey: Patent Database Link Project
November 2006
Working Paper Number:
CES-06-28
This paper details the construction of a firm-year panel dataset combining the NBER Patent Dataset with the Industry R&D Survey conducted by the Census Bureau and National Science Foundation. The developed platform offers an unprecedented view of the R&D-to-patenting innovation process and a close analysis of the strengths and limitations of the Industry R&D Survey. The files are linked through a name-matching algorithm customized for uniting the firm names to which patents are assigned with the firm names in the Census Bureau's SSEL business registry. Through the Census Bureau's file structure, this R&D platform can be linked to the operating performance of each firm's establishments, further facilitating innovation-to-productivity studies.
-
A Guide To R&D Data At The Center For Economic Studies U.S. Bureau Of The Census
August 1994
Working Paper Number:
CES-94-09
The National Science Foundation R&D Survey is an annual survey of firms' research and development expenditures. The survey covers 3000 firms reporting positive R&D. This paper provides a description of the R&D data available at the Center for Economic Studies (CES). The most basic data series available contains the original survey R&D data. It covers the years 1972-92. The remaining two series, although derived from the original files, specialize in particular items. The Mandatory Series contains required survey items for the years 1973-88. Items reported at firms' discretion are in the Voluntary Series, which covers the years 1974-89. Both of the derived series incorporate flags that track quality of the data. Both also include corrections to the data based on original hard copy survey evidence stored at CES. In addition to describing each dataset, we offer suggestions to researchers wishing to use the R&D data in exploring various economic issues. We report selected response rates, discuss the survey design, and provide hints on how to use the data.
-
Using Matched Client And Census Data To Evaluate The Performance Of The Manufacturing Extension Partnership
April 1995
Working Paper Number:
CES-95-07
This paper proposes a framework for evaluating the Manufacturing Extension Partnership (MEP). The MEP is administered by the National Institute of Standards and Technology (NIST) as part of its effort to improve the global competitiveness of U.S. manufacturing industries. As the name implies, the MEP is modelled after agricultural extension. Rather than farmers, the MEP's target population is small and medium-sized manufacturers, generally those with fewer than 500 employees. The MEP currently supports 44 manufacturing extension centers around the country. These centers provide technical and business assistance for manufacturers much as county extension agents do for farmers. The goal of evaluation is to see if MEP engagements lead to positive outcomes from the view of important MEP stakeholders (e.g., MEP clients, MEP centers, NIST, state and local governments, and Congress). These outcomes are discussed in McGuckin and Redman (1995) and include: Process Outcomes (e.g., adoption of a new technology by a client); Intermediate Outcomes (e.g., reduction in the client's defect rate); Business Outcomes (e.g., survival and profits); and Policy Outcomes (e.g., increases in employment, wages, and/or exports). The evaluation framework described in this paper has two components. The first component is an evaluation dataset which contains measures of many of the program outcomes listed above for both MEP clients and a representative control group of non-clients. This dataset will be constructed by linking MEP client records with plant-level Census data housed at the Center for Economic Studies of the Census Bureau. The Census data provide measures of several outcome and control variables which are comparable across both plants and time. The Census data include observations for all manufacturing plants in the U.S., from which representative control groups can be constructed. The MEP client records provide data on the type and intensity of extension engagements. Linking these rich sources of information yields a comprehensive and powerful dataset for MEP evaluation. The second component is an evaluation methodology which exploits this rich dataset to make statistical inferences about the impact of MEP services, while carefully controlling for other influences. By using this methodology, we can address many of the shortcomings which plagued previous attempts to evaluate extension services. In addition to evaluation, the dataset described in this paper may be used to profile the characteristics of MEP clients and compare them to non-clients. The Census data contain the complete universe of manufacturing establishments in the U.S.
-
The Longitudinal Business Database
July 2002
Working Paper Number:
CES-02-17
As the largest federal statistical agency and primary collector of data on businesses, households and individuals, the Census Bureau each year conducts numerous surveys intended to provide statistics on a wide range of topics about the population and economy of the United States. The Census Bureau's decennial population and quinquennial economic censuses are unique, providing information on all U.S. households and business establishments, respectively.