CREAT: Census Research Exploration and Analysis Tool

Registered Report: Exploratory Analysis of Ownership Diversity and Innovation in the Annual Business Survey

March 2023

Written by: Timothy R. Wojan

Working Paper Number:

CES-23-11

Abstract

A lack of transparency in specification testing is a major contributor to the replicability crisis that has eroded the credibility of findings for informing policy. How diversity is associated with outcomes of interest is particularly susceptible to the production of nonreplicable findings given the very large number of alternative measures applied to several policy relevant attributes such as race, ethnicity, gender, or foreign-born status. The very large number of alternative measures substantially increases the probability of false discovery where nominally significant parameter estimates'selected through numerous though unreported specification tests'may not be representative of true associations in the population. The purpose of this registered report is to: 1) select a single measure of ownership diversity that satisfies explicit, requisite axioms; 2) split the Annual Business Survey (ABS) into an exploratory sample (35%) used in this analysis and a confirmatory sample (65%) that will be accessed only after the publication of this report; 3) regress self-reported new-to-market innovation on the diversity measure along with industry and firm-size controls; 4) pass through those variables meeting precision and magnitude criteria for hypothesis testing using the confirmatory sample; and 5) document the full set of hypotheses to be tested in the final analysis along with a discussion of the false discovery and family-wise error rate corrections to be applied. The discussion concludes with the added value of implementing split sample designs within the Federal Statistical Research Data Center system where access to data is strictly controlled.

Document Tags and Keywords

Keywords Keywords are automatically generated using KeyBERT, a powerful and innovative keyword extraction tool that utilizes BERT embeddings to ensure high-quality and contextually relevant keywords.

By analyzing the content of working papers, KeyBERT identifies terms and phrases that capture the essence of the text, highlighting the most significant topics and trends. This approach not only enhances searchability but provides connections that go beyond potentially domain-specific author-defined keywords.
:
estimating, analysis, market, report, disclosure, research, ownership, ethnicity, specialization, innovation, economically, assessing, institutional

Tags Tags are automatically generated using a pretrained language model from spaCy, which excels at several tasks, including entity tagging.

The model is able to label words and phrases by part-of-speech, including "organizations." By filtering for frequent words and phrases labeled as "organizations", papers are identified to contain references to specific institutions, datasets, and other organizations.
:
National Science Foundation, Department of Energy, North American Industry Classification System, Census Bureau Disclosure Review Board, Disclosure Review Board, Accommodation and Food Services, Federal Statistical Research Data Center, National Center for Science and Engineering Statistics, Annual Business Survey

Similar Working Papers Similarity between working papers are determined by an unsupervised neural network model know as Doc2Vec.

Doc2Vec is a model that represents entire documents as fixed-length vectors, allowing for the capture of semantic meaning in a way that relates to the context of words within the document. The model learns to associate a unique vector with each document while simultaneously learning word vectors, enabling tasks such as document classification, clustering, and similarity detection by preserving the order and structure of words. The document vectors are compared using cosine similarity/distance to determine the most similar working papers. Papers identified with 🔥 are in the top 20% of similarity.

The 10 most similar working papers to the working paper 'Registered Report: Exploratory Analysis of Ownership Diversity and Innovation in the Annual Business Survey' are listed below in order of similarity.