CREAT: Census Research Exploration and Analysis Tool

Longitudinal Economic Data At The Census Bureau: A New Database Yields Fresh Insight On Some Old Issues

January 1990

Written by: Robert H. McGuckin

Working Paper Number: CES-90-01

Abstract

This paper has two goals. First, it illustrates the importance of panel data with examples taken from research in progress using the U.S. Census Bureau's Longitudinal Research Database (LRD). Although the LRD is not the result of a "true" longitudinal survey, it provides both balanced and unbalanced panel data sets for establishments, firms, and lines of business. The second goal is to integrate the results of recent research with the LRD and to draw conclusions about the importance of longitudinal microdata for econometric research and time series analysis. The advantages of panel data arise from both the micro and time series aspects of the observations. This also leads us to consider why panel data are necessary to understand and interpret the time series behavior of aggregate statistics produced in cross-section establishment surveys and censuses. We find that typical homogeneity assumptions are likely to be inappropriate in a wide variety of applications. In particular, the industry in which an establishment is located, the ownership of the establishment, and the existence of the establishment (births and deaths) are endogenous variables that cannot simply be taken as time-invariant fixed effects in econometric modeling.

Document Tags and Keywords

Keywords

Keywords are automatically generated using KeyBERT, a keyword extraction tool that uses BERT embeddings to select contextually relevant keywords.

By analyzing the content of working papers, KeyBERT identifies terms and phrases that capture the essence of the text, highlighting the most significant topics and trends. This approach not only enhances searchability but also provides connections that go beyond potentially domain-specific author-defined keywords. The keywords for this paper are:
estimation, econometric, macroeconomic, aggregation, statistical, quarterly, microdata, survey, aggregate, merger, average, accounting, empirical, yearly, statistician, surveys censuses, firms census, longitudinal, endogenous

Tags

Tags are automatically generated using a pretrained language model from spaCy, which handles several tasks, including entity tagging.

The model labels words and phrases by part of speech and by entity type, including "organizations." Frequent words and phrases labeled as "organizations" are then used to identify papers that reference specific institutions, datasets, and other organizations. The tags for this paper are:
Department of Commerce, Census of Manufactures, Annual Survey of Manufactures, Bureau of Labor Statistics, Longitudinal Research Database, Center for Economic Studies, Bureau of Economic Analysis
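The frequency filter described for tags can be illustrated with a minimal sketch. It assumes the entity tagger has already produced (text, label) pairs and that organizations carry the label "ORG", as in spaCy's English models. The function name `frequent_organizations`, the `min_count` threshold, and the sample pairs are hypothetical and are not taken from the site's actual pipeline.

```python
from collections import Counter


def frequent_organizations(entities, min_count=2):
    """Return organization names seen at least min_count times, most frequent first.

    entities: iterable of (text, label) pairs, e.g. from an entity tagger.
    min_count: hypothetical frequency threshold (an assumption of this sketch).
    """
    counts = Counter(text for text, label in entities if label == "ORG")
    return [name for name, n in counts.most_common() if n >= min_count]


# Hypothetical tagger output for one paper (illustrative only).
entities = [
    ("Census of Manufactures", "ORG"),
    ("Bureau of Labor Statistics", "ORG"),
    ("Census of Manufactures", "ORG"),
    ("1990", "DATE"),
    ("Bureau of Economic Analysis", "ORG"),
    ("Bureau of Labor Statistics", "ORG"),
    ("Census of Manufactures", "ORG"),
]

tags = frequent_organizations(entities)
print(tags)
```

In this toy input, names mentioned only once and non-organization entities are dropped, so only the repeatedly mentioned organizations remain as tags.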

Similar Working Papers

Similarity between working papers is determined by an unsupervised neural network model known as Doc2Vec.

Doc2Vec is a model that represents entire documents as fixed-length vectors, allowing for the capture of semantic meaning in a way that relates to the context of words within the document. The model learns to associate a unique vector with each document while simultaneously learning word vectors, enabling tasks such as document classification, clustering, and similarity detection by preserving the order and structure of words. The document vectors are compared using cosine similarity/distance to determine the most similar working papers. Papers identified with 🔥 are in the top 20% of similarity.
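The comparison step can be sketched in a few lines of plain Python. This is only an illustration of ranking by cosine similarity, assuming the document vectors have already been learned; the three-dimensional toy vectors, the paper identifiers, and the helper names `cosine_similarity` and `most_similar` are hypothetical, and real Doc2Vec vectors would have many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(target_id, vectors, k=10):
    """Rank every other document by cosine similarity to the target and keep the top k."""
    target = vectors[target_id]
    scores = [
        (doc_id, cosine_similarity(target, vec))
        for doc_id, vec in vectors.items()
        if doc_id != target_id
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Hypothetical toy document vectors (illustrative only).
vectors = {
    "paper-a": [1.0, 0.0, 0.0],
    "paper-b": [0.9, 0.1, 0.0],
    "paper-c": [0.0, 1.0, 0.0],
    "paper-d": [-1.0, 0.0, 0.0],
}

ranked = most_similar("paper-a", vectors, k=3)
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

Cosine similarity depends only on the direction of the vectors, not their length, so two documents can score as similar even if their vectors differ in magnitude.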

The 10 most similar working papers to the working paper 'Longitudinal Economic Data At The Census Bureau: A New Database Yields Fresh Insight On Some Old Issues' are listed below in order of similarity.