This note provides details about the construction of the NBER Patent Data-BR concordance, and is intended for researchers planning to use this concordance. In addition to describing the matching process used to construct the concordance, this note provides a discussion of the benefits and limitations of this concordance.
-
Improving Patent Assignee-Firm Bridge with Web Search Results
August 2022
Working Paper Number:
CES-22-31
This paper constructs a patent assignee-firm longitudinal bridge between U.S. patent assignees and firms using firm-level administrative data from the U.S. Census Bureau. We match granted patents applied for between 1976 and 2016 to the U.S. firms recorded in the Longitudinal Business Database (LBD) at the Census Bureau. Building on existing algorithms in the literature, we first use the assignee name, address (state and city), and year information to link the two datasets. We then introduce a novel search-aided algorithm that significantly improves the matching results by 7% and 2.9% at the patent and the assignee level, respectively. Overall, we are able to match 88.2% and 80.1% of all U.S. patents and assignees, respectively. We contribute to the existing literature by 1) improving the match rates and quality with the web search-aided algorithm, and 2) providing the longest and longitudinally consistent crosswalk between patent assignees and LBD firms.
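The first-stage link on name, state, city, and year described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the record layouts, the suffix list, and the function names `standardize_name` and `match_assignees` are all hypothetical, and the web search-aided step is not shown.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes to strip; not taken from the paper.
_SUFFIXES = {"INC", "CORP", "CORPORATION", "CO", "COMPANY", "LLC", "LTD", "LP"}


def standardize_name(name):
    """Uppercase, replace punctuation with spaces, drop trailing legal-form tokens."""
    tokens = re.sub(r"[^A-Z0-9 ]", " ", name.upper()).split()
    while tokens and tokens[-1] in _SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def match_assignees(assignees, firms):
    """Link assignees to firms on exact (standardized name, state, city).

    assignees: dicts with keys id, name, state, city, year (assumed layout)
    firms: dicts with keys firm_id, name, state, city, first_year, last_year
    Only unambiguous matches whose year falls in the firm's active span are kept.
    """
    index = defaultdict(list)
    for f in firms:
        key = (standardize_name(f["name"]), f["state"].upper(), f["city"].upper())
        index[key].append(f)

    links = {}
    for a in assignees:
        key = (standardize_name(a["name"]), a["state"].upper(), a["city"].upper())
        candidates = [
            f for f in index.get(key, [])
            if f["first_year"] <= a["year"] <= f["last_year"]
        ]
        if len(candidates) == 1:  # skip missing or ambiguous matches
            links[a["id"]] = candidates[0]["firm_id"]
    return links
```

Exact matching on a standardized key is the simplest possible design; a production linkage would typically add fuzzy name comparison and clerical review of ambiguous candidates.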
-
Business Dynamics of Innovating Firms: Linking U.S. Patents with Administrative Data on Workers and Firms
July 2015
Working Paper Number:
CES-15-19
This paper discusses the construction of a new longitudinal database tracking inventors and patent-owning firms over time. We match granted patents between 2000 and 2011 to administrative databases of firms and workers housed at the U.S. Census Bureau. We use inventor information in addition to the patent assignee firm name to improve on previous efforts linking patents to firms. The triangulated database allows us to maximize match rates and provide validation for a large fraction of matches. In this paper, we describe the construction of the database and explore basic features of the data. We find patenting firms, particularly young patenting firms, disproportionally contribute jobs to the U.S. economy. We find patenting is a relatively rare event among small firms but that most patenting firms are nevertheless small, and that patenting is not as rare an event for the youngest firms compared to the oldest firms. While manufacturing firms are more likely to patent than firms in other sectors, we find most patenting firms are in the services and wholesale sectors. These new data are a product of collaboration within the U.S. Department of Commerce, between the U.S. Census Bureau and the U.S. Patent and Trademark Office.
-
The Industry R&D Survey: Patent Database Link Project
November 2006
Working Paper Number:
CES-06-28
This paper details the construction of a firm-year panel dataset combining the NBER Patent Dataset with the Industry R&D Survey conducted by the Census Bureau and National Science Foundation. The developed platform offers an unprecedented view of the R&D-to-patenting innovation process and a close analysis of the strengths and limitations of the Industry R&D Survey. The files are linked through a name-matching algorithm customized for uniting the firm names to which patents are assigned with the firm names in the Census Bureau's SSEL business registry. Through the Census Bureau's file structure, this R&D platform can be linked to the operating performance of each firm's establishments, further facilitating innovation-to-productivity studies.
-
What Happens When Firms Patent? New Evidence from U.S. Economic Census Data
January 2008
Working Paper Number:
CES-08-03
In this study, we present novel statistics on patenting in US manufacturing and new evidence on the question of what happens when firms patent. We do so by creating a comprehensive firm-patent matched dataset that links the NBER patent data (covering the universe of patents) to firm data from the US Census Bureau (which covers the universe of all firms with paid employees). Our linked dataset covers more than 48,000 unique assignees (compared to about 4,100 assignees covered by the Compustat-NBER link), representing almost two-thirds of all non-individual, non-university, non-government assignees from 1975 to 1997. We use the data to present some basic but novel statistics on the role of patenting in US manufacturing, including strong evidence confirming the highly skewed nature of patenting activity. Next, we examine what happens when firms patent by looking at a large sample of first-time patentees. We find that while there are significant cross-sectional differences in size and total factor productivity between patentee firms and non-patentee firms, a change in patent ownership status within firms is associated with a contemporaneous and substantial increase in firm size, but little to no change in total factor productivity. This evidence suggests that patenting is associated with firm growth through new product innovations (firm scope) rather than through reduction in the cost of producing existing products (firm productivity). Consistent with this explanation, we find that when firms patent, there is a contemporaneous increase in the number of products that the firms produce. Estimates of (within-firm) elasticity of firm characteristics to patent stock confirm our results. Our findings are robust to alternative measures of size and productivity, and to various sample selection criteria.
-
Matching State Business Registration Records to Census Business Data
January 2020
Working Paper Number:
CES-20-03
We describe our methodology and results from matching state Business Registration Records (BRR) to Census business data. We use data from Massachusetts and California to develop methods and preliminary results that could be used to guide matching data for additional states. We obtain matches to Census business records for 45% of the Massachusetts BRR records and 40% of the California BRR records. We find higher match rates for incorporated businesses and businesses with higher startup-quality scores as assigned in Guzman and Stern (2018). Clerical reviews show that using relatively strict matching on address is important for match accuracy, while results are less sensitive to name matching strictness. Among matched BRR records, the modal timing of the first match to the BR is in the year in which the BRR record was filed. We use two sets of software to identify matches: SAS DQ Match and a machine-learning algorithm described in Cuffe and Goldschlag (2018). We find preliminary evidence that while the ML-based method yields more match results, SAS DQ tends to result in higher accuracy rates. To conclude, we provide suggestions on how to proceed with matching other states' data in light of our findings using these two states.
-
Characteristics of the Top R&D Performing Firms in the U.S.: Evidence from the Survey of Industrial R&D
September 2010
Working Paper Number:
CES-10-33
Innovation drives economic growth and productivity growth, and as such, indicators of innovative activity such as research and development (R&D) expenditures are of paramount importance. We combine Census confidential microdata from two sources in order to examine the characteristics of the top R&D performing firms in the U.S. economy. We use the Survey of Industrial Research and Development (SIRD) to identify the top 200 R&D performing firms in 2003 and, to the extent possible, to trace the evolution of these firms from 1957 to 2007. The Longitudinal Business Database (LBD) further extends our knowledge about these firms and enables us to make comparisons to the U.S. economy. By linking the SIRD and the LBD we are able to create a detailed portrait of the evolution of the top R&D performing firms in the U.S.
-
The Longitudinal Business Database
July 2002
Working Paper Number:
CES-02-17
As the largest federal statistical agency and primary collector of data on businesses, households and individuals, the Census Bureau each year conducts numerous surveys intended to provide statistics on a wide range of topics about the population and economy of the United States. The Census Bureau's decennial population and quinquennial economic censuses are unique, providing information on all U.S. households and business establishments, respectively.
-
Getting Patents and Economic Data to Speak to Each Other: An 'Algorithmic Links with Probabilities' Approach for Joint Analyses of Patenting and Economic Activity
September 2012
Working Paper Number:
CES-12-16
International technological diffusion is a key determinant of cross-country differences in economic performance. While patents can be a useful proxy for innovation and technological change and diffusion, fully exploiting patent data for such economic analyses requires patents to be tied to measures of economic activity. In this paper, we describe and explore a new algorithmic approach to constructing concordances between the International Patent Classification (IPC) system that organizes patents by technical features and industry classification systems that organize economic data, such as the Standard International Trade Classification (SITC), the International Standard Industrial Classification (ISIC) and the Harmonized System (HS). This 'Algorithmic Links with Probabilities' (ALP) approach incorporates text analysis software and keyword extraction programs and applies them to a comprehensive patent dataset. We compare the results of several ALP concordances to existing technology concordances. Based on these comparisons, we select a preferred ALP approach and discuss advantages of this approach relative to conventional approaches. We conclude with a discussion on some of the possible applications of the concordance and provide a sample analysis that uses our preferred ALP concordance to analyze international patent flows based on trade patterns.
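One simple way to read "links with probabilities" is as normalized keyword-hit counts per patent class. The Python sketch below illustrates that reading only; it is an assumption, not the authors' procedure, and the function name `alp_weights`, the input layout, and the example codes are hypothetical.

```python
from collections import Counter, defaultdict


def alp_weights(matches):
    """Turn keyword hits into per-class probability weights.

    matches: iterable of (ipc_class, industry_code) pairs, one pair per patent
    whose text matched a keyword for that industry (assumed input layout).
    Returns {ipc_class: {industry_code: weight}}; weights sum to 1 per class.
    """
    counts = defaultdict(Counter)
    for ipc_class, industry_code in matches:
        counts[ipc_class][industry_code] += 1

    weights = {}
    for ipc_class, industry_counts in counts.items():
        total = sum(industry_counts.values())
        weights[ipc_class] = {
            code: n / total for code, n in industry_counts.items()
        }
    return weights
```

Under this reading, a patent class is not forced into a single industry; its patents are apportioned across industries in proportion to the weights.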
-
An 'Algorithmic Links with Probabilities' Crosswalk for USPC and CPC Patent Classifications with an Application Towards Industrial Technology Composition
March 2016
Working Paper Number:
CES-16-15
Patents are a useful proxy for innovation, technological change, and diffusion. However, fully exploiting patent data for economic analyses requires patents be tied to measures of economic activity, which has proven to be difficult. Recently, Lybbert and Zolas (2014) have constructed an International Patent Classification (IPC) to industry classification crosswalk using an 'Algorithmic Links with Probabilities' approach. In this paper, we utilize a similar approach and apply it to new patent classification schemes, the U.S. Patent Classification (USPC) system and Cooperative Patent Classification (CPC) system. The resulting USPC-Industry and CPC-Industry concordances link both U.S. and global patents to multiple vintages of the North American Industrial Classification System (NAICS), International Standard Industrial Classification (ISIC), Harmonized System (HS) and Standard International Trade Classification (SITC). We then use the crosswalk to highlight changes to industrial technology composition over time. We find suggestive evidence of strong persistence in the association between technologies and industries over time.
-
Reconciling the Firm Size and Innovation Puzzle
March 2016
Working Paper Number:
CES-16-20RR
There is a prevailing view in both the academic literature and the popular press that firms need to behave more entrepreneurially. This view is reinforced by a stylized fact in the innovation literature that R&D productivity decreases with size. However, there is a second stylized fact in the innovation literature that R&D investment increases with size. Taken together, these stylized facts create a puzzle of seemingly irrational behavior by large firms--they are increasing spending despite decreasing returns. This paper is an effort to resolve that puzzle. We propose and test two alternative resolutions: 1) that it arises from mismeasurement of R&D productivity, and 2) that firm size endogenously drives R&D strategy, and that the returns to R&D strategies depend on scale. We are able to resolve the puzzle under the first tack--using a recent measure of R&D productivity, RQ, we find that both R&D spending and R&D productivity increase with scale. We had less success with the second tack--while firm size affects R&D strategy in the manners expected by theory, there is no strategy whose returns decrease in scale. Taken together, our results are consistent with the Schumpeter view that large firms are the major engine of growth--they both spend more in aggregate than small firms and are more productive with that spending. Moreover, the prescription that firms should behave more entrepreneurially should be treated with caution--one small-firm strategy has lower returns to scale than its large-firm counterpart.