The principle that the statistical system should provide flexibility--possibilities for generating multiple groupings of data to satisfy multiple objectives--if it is to satisfy users is universally accepted. Yet in practice, this goal has not been achieved. This paper discusses the feasibility of providing flexibility in the statistical system to accommodate multiple uses of the industrial data now primarily examined within the Standard Industrial Classification (SIC) system. In one sense, the question of feasibility is almost trivial. With today's computer technology, vast amounts of data can be manipulated and stored at very low cost. Reconfigurations of the basic data are very inexpensive compared to the cost of collecting the data. However, flexibility in the statistical system implies more than the technical ability to regroup data. It requires that the basic data are sufficiently detailed to support user needs and are processed and maintained in a fashion that makes the use of a variety of aggregation rules possible. For this to happen, statistical agencies must recognize the need for high quality microdata and build this into their planning processes. Agencies need to view their missions from a multiple use perspective and move away from use of a primary reporting and collection vehicle. Although the categories used to report data must be flexible, practical considerations dictate that data collection proceed within a fixed classification system. It is simply too expensive for both respondents and statistical agencies to process survey responses in the absence of standardized forms, data entry programs, etc. I argue for a basic classification centered on commodities--products, services, raw materials and labor inputs--as the focus of data collection. The idea is to make the principal variables of interest--the commodities--the vehicle for the collection and processing of the data.
For completeness, the basic classification should include labor usage through some form of occupational classification. In most economic surveys at the Census Bureau, the reporting unit and the classified unit have been the establishment. But there is no need for this to be so. The basic principle to be followed in data collection is that the data should be collected in the most efficient way--efficiency being defined jointly in terms of statistical agency collection costs and respondent burdens.
-
The Importance of Establishment Data in Economic Research
August 1993
Working Paper Number:
CES-93-10
The importance and usefulness of establishment microdata for economic research and policy analysis is outlined and contrasted with traditional products of statistical agencies -- aggregate cross-section tabulations. It is argued that statistical agencies must begin to seriously rethink the way they view establishment data products.
-
Longitudinal Economic Data At The Census Bureau: A New Database Yields Fresh Insight On Some Old Issues
January 1990
Working Paper Number:
CES-90-01
This paper has two goals. First, it illustrates the importance of panel data with examples taken from research in progress using the U.S. Census Bureau's Longitudinal Research Database (LRD). Although the LRD is not the result of a "true" longitudinal survey, it provides both balanced and unbalanced panel data sets for establishments, firms, and lines of business. The second goal is to integrate the results of recent research with the LRD and to draw conclusions about the importance of longitudinal microdata for econometric research and time series analysis. The advantages of panel data arise from both the micro and time series aspects of the observations. This also leads us to consider why panel data are necessary to understand and interpret the time series behavior of aggregate statistics produced in cross-section establishment surveys and censuses. We find that typical homogeneity assumptions are likely to be inappropriate in a wide variety of applications. In particular, the industry in which an establishment is located, the ownership of the establishment, and the existence of the establishment (births and deaths) are endogenous variables that cannot simply be taken as time-invariant fixed effects in econometric modeling.
-
Analytic Use Of Economic Microdata; A Model For Researcher Access With Confidentiality Protection
August 1992
Working Paper Number:
CES-92-08
A primary responsibility of the Center for Economic Studies (CES) of the U.S. Bureau of the Census is to facilitate researcher access to confidential economic microdata files. Benefits from this program accrue not only to policy makers--there is a growing awareness of the importance of microdata for analyzing both the descriptive and welfare implications of regulatory and environmental changes--but also, and importantly, to the statistical agencies themselves. In fact, there is substantial recent literature arguing for the proposition that the largest single improvement that the U.S. statistical system could make is to improve its analytic capabilities. In this paper I briefly discuss these benefits to greater access for analytical work and ways to achieve them. Due to the nature of business data, public use databases and masking technologies are not available as vehicles for releasing useful microdata files. I conclude that a combination of outside and inside research programs, carefully coordinated and integrated, is the best model for ensuring that statistical agencies reap the gains from analytic data users. For the United States, at least, this is fortuitous with respect to justifying access since any direct research with confidential data by outsiders must have a "statistical purpose". Until the advent of CES, it was virtually impossible for researchers to work with the economic microdata collected by the various economic censuses. While the CES program is quite large, as it now stands, researchers, or their representatives, must come to the Census Bureau in Washington, D.C. to access the data. The success of the program has led to increasing demands for data access in facilities outside of the Washington, D.C. area. Two options are considered: 1) Establish Census Bureau facilities in various universities or similar nonprofit research facilities and 2) Develop CES regional operations in existing Census Bureau regional offices.
-
ARE FIXED EFFECTS FIXED? Persistence in Plant Level Productivity
May 1996
Working Paper Number:
CES-96-03
Estimates of production functions suffer from an omitted variable problem; plant quality is an omitted variable that is likely to be correlated with variable inputs. One approach is to capture differences in plant qualities through plant-specific intercepts, i.e., to estimate a fixed effects model. For this technique to work, it is necessary that differences in plant quality are more or less fixed; if the "fixed effects" erode over time, such a procedure becomes problematic, especially when working with long panels. In this paper, a standard fixed effects model, extended to allow for serial correlation in the error term, is applied to a 16-year panel of textile plants. This parametric approach strongly accepts the hypothesis of fixed effects. They account for about one-third of the variation in productivity. A simple non-parametric approach, however, concludes that differences in plant qualities erode over time, that is, plant qualities φ-mix. Monte Carlo results demonstrate that this discrepancy comes from the parametric approach imposing an overly restrictive functional form on the data; if there were fixed effects of the magnitude measured, one would reject the hypothesis of φ-mixing. For textiles, at least, the functional form of a fixed effects model appears to generate misleading conclusions. A more flexible functional form is estimated. The "fixed" effects actually have a half-life of approximately 10 to 20 years, and they account for about one-half the variation in productivity.
-
Firm Performance And Evolution: Empirical Regularities In The U.S. Microdata
October 1996
Working Paper Number:
CES-96-10
This paper presents a view of firm performance, industry evolution, and economic growth that contrasts with the traditional representative firm model. The paper reviews recent empirical work, primarily studies using the Longitudinal Research Database (LRD), that explicitly focuses on individual business units. The major empirical regularity in the studies is that heterogeneity is pervasive -- it is found across and within all sectors and across all plant characteristics. Further, firms are not only different in the cross-section. They enter at different times, make different choices, and react differently to economic shocks. Thus, to understand economic performance and competition, one must move beyond representative firm models. Competition must be understood as a process in which some firms choose correctly and grow while other firms choose poorly and die; the growth of the successful firms at the expense of less successful rivals drives economic growth.
-
Unlocking the Information in Integrated Social Data
May 2002
Working Paper Number:
tp-2002-21
-
Testing the Advantages of Using Product Level Data to Create Linkages Across Industrial Coding Systems
October 1993
Working Paper Number:
CES-93-14
After the major revision of the U.S. Standard Industrial Classification system (SIC) in 1987, the problem arose of how to evaluate industrial performance over time. The revision resulted in the creation of new industries, the combination of old industries, and the remixing of other industries to better reflect the present U.S. economy. A method had to be developed to make the old and new sets of industries comparable over time. Ryten (1991) argues for performing the conversion at the "most micro level," the product level. Linking industries should be accomplished by reclassifying product data of each establishment to a standard system, reassigning the primary activity of the establishment, reaggregating the data to the industry level, and then making the desired statistical comparison (Ryten, 1991). This paper discusses linking the data at the very micro, product level, and at the more macro, industry level. The results suggest that with complete product information the product level conversion is preferable for most industries in manufacturing because it recognizes that establishments may switch their primary industry because of the conversion. For some industries, especially those having no substantial changes in SIC codes over time, the conversion at the industry level is fairly accurate. A small group of industries lacks complete product information in 1982 to link the 1982 product codes to the 1987 codes. This results in having to rely on the industry concordance to create a time series of statistics.
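The product-level procedure described above (reclassify each establishment's products, reassign its primary activity, reaggregate to the industry level) can be sketched roughly as follows. This is only an illustration and not the paper's implementation: the product codes, the two concordance tables, and the rule that an establishment's primary activity is the industry with the largest shipment value are all hypothetical assumptions.

```python
from collections import defaultdict

# Hypothetical concordance: old product code -> new product code (assumed).
PRODUCT_CONCORDANCE = {"old_A1": "new_X1", "old_A2": "new_Y1", "old_B1": "new_Y2"}
# Hypothetical mapping: new product code -> new industry (assumed).
PRODUCT_TO_INDUSTRY = {"new_X1": "X", "new_Y1": "Y", "new_Y2": "Y"}


def primary_industry(shipments):
    """Reclassify one establishment's products and return its primary industry.

    `shipments` maps old product codes to shipment values. The primary
    industry is assumed to be the one with the largest reclassified value;
    ties are broken alphabetically.
    """
    totals = defaultdict(float)
    for old_code, value in shipments.items():
        new_code = PRODUCT_CONCORDANCE[old_code]
        totals[PRODUCT_TO_INDUSTRY[new_code]] += value
    return max(sorted(totals), key=totals.get)


def industry_totals(establishments):
    """Reaggregate total establishment shipments by reassigned primary industry."""
    out = defaultdict(float)
    for shipments in establishments.values():
        out[primary_industry(shipments)] += sum(shipments.values())
    return dict(out)


# Two made-up establishments with made-up shipment values.
establishments = {
    "est1": {"old_A1": 60.0, "old_A2": 40.0},
    "est2": {"old_A1": 30.0, "old_A2": 50.0, "old_B1": 20.0},
}
print(industry_totals(establishments))
```

In this toy case the second establishment ends up in industry Y even though it also ships product `old_A1`, which is the kind of switch in primary industry that an industry-level concordance cannot capture.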
-
Price Dispersion in U.S. Manufacturing
October 1989
Working Paper Number:
CES-89-07
This paper addresses the question of whether products in the U.S. manufacturing sector sell at a single (common) price, or whether prices vary across producers. The question of price dispersion is important for two reasons. First, if prices vary across producers, the standard method of using industry price deflators leads to errors in measuring real output at the firm or establishment level. These errors in turn lead to biased estimates of the production function and productivity growth equation, as shown in Abbott (1988). Second, if prices vary across producers, it suggests that producers do not take prices as given but use price as a competitive variable. This has several implications for how economists model competitive behavior.
-
The Longitudinal Research Database (LRD): Status And Research Possibilities
July 1988
Working Paper Number:
CES-88-02
This paper discusses the development and use of the Longitudinal Research Data available at the Center for Economic Studies of the Bureau of the Census in terms of what has been accomplished thus far, what projects are currently in progress, and what plans are in place for the near future. The major achievement to date is the construction of the database itself, which contains data for manufacturing establishments collected by the Census in 1963, 1967, 1972, 1977 and 1982, and the Annual Survey of Manufactures for non-Census years from 1973 to 1985. These data now reside in the Center's computer in a consistent format across all years. In addition, a large software development task that greatly simplifies the task of selecting subsets of the database for specific research projects is well underway. Finally, a number of powerful microcomputers have been purchased for use by researchers for their statistical analysis. Current efforts underway at the Center include research on such policy-relevant issues as mergers and their impact on profits and production, high technology trade, import competition, plant level productivity, entry and exit, and productivity differences between large and small firms. Due to the confidentiality requirements of the Census data, most of this research is performed by Center staff and Special Sworn Employees. Under certain circumstances, the Center accepts user-written programs from outside researchers. These routines are executed by Center staff, and the resultant output is reviewed thoroughly for disclosure problems. The Center is also an active member of a task force working on methods to release "masked" or "cloned" microdata in public-use files that will protect the confidentiality of the data while at the same time providing a research tool for outside users. The Center research program contributes directly to future research possibilities. The current batch of research projects is adding insight into the nature of the LRD database.
This information is continually being incorporated into the Center's software system, thus facilitating yet more research activity. Moreover, since a good portion of the research involves linking the Longitudinal Research Data to other data files, such as the NSF/Census R&D data, the scope of the databases is continually being expanded. Furthermore, the Center is exploring the possibility of linking the demographic data collected by the Census Bureau to the LRD database.
-
Access Methods for United States Microdata
August 2007
Working Paper Number:
CES-07-25
Beyond the traditional methods of tabulations and public-use microdata samples, statistical agencies have developed four key alternatives for providing non-government researchers with access to confidential microdata to improve statistical modeling. The first, licensing, allows qualified researchers access to confidential microdata at their own facilities, provided certain security requirements are met. The second, statistical data enclaves, offers qualified researchers restricted access to confidential economic and demographic data at specific agency-controlled locations. Third, statistical agencies can offer remote access, through a computer interface, to the confidential data under automated or manual controls. Fourth, synthetic data developed from the original data but retaining the correlations in the original data have the potential for allowing a wide range of analyses.