-
Design Comparison of LODES and ACS Commuting Data Products
October 2014
Working Paper Number: CES-14-38
The Census Bureau produces two complementary data products, the American Community Survey (ACS) commuting and workplace data and the Longitudinal Employer-Household Dynamics (LEHD) Origin-Destination Employment Statistics (LODES), which can be used to answer spatial, economic, and demographic questions relating to workplaces and home-to-work flows. The products are complementary in the sense that they measure similar activities, but each has important unique characteristics that provide information the other cannot. In response to questions from data users, the Census Bureau has created this document to highlight the major design differences between these two data products. This report guides users on the relative advantages of each data product for various analyses and helps explain differences that may arise when using the products.
As an overview, these two data products are sourced from different inputs, cover different populations and time periods, are subject to different sets of edits and imputations, are released under different confidentiality protection mechanisms, and are tabulated at different geographic and characteristic levels. As a general rule, the two data products should not be expected to match exactly for arbitrary queries and may differ substantially for some queries.
Within this document, we compare the two data products by the design elements that were deemed most likely to contribute to differences in tabulated data. These elements are: Collection, Coverage, Geographic and Longitudinal Scope, Job Definition and Reference Period, Job and Worker Characteristics, Location Definitions (Workplace and Residence), Completeness of Geographic Information and Edits/Imputations, Geographic Tabulation Levels, Control Totals, Confidentiality Protection and Suppression, and Related Public-Use Data Products.
An in-depth data analysis (in aggregate or with the microdata) between the two data products will be the subject of a future technical report. The Census Bureau has begun a pilot project to integrate ACS microdata with LEHD administrative data to develop an enhanced frame of employment status, place of work, and commuting. The Census Bureau will publish quality metrics for person match rates, residence and workplace match rates, and commute distance comparisons.
-
RECOVERING THE ITEM-LEVEL EDIT AND IMPUTATION FLAGS IN THE 1977-1997 CENSUSES OF MANUFACTURES
September 2014
Working Paper Number: CES-14-37
As part of processing the Census of Manufactures, the Census Bureau edits some data items and imputes values for missing data and for some data deemed erroneous. Until recently it was difficult for researchers using the plant-level microdata to determine which data items were changed or imputed during the editing and imputation process, because the edit/imputation processing flags were not available to researchers. This paper describes the process of reconstructing the edit/imputation flags for variables in the 1977, 1982, 1987, 1992, and 1997 Censuses of Manufactures using recently recovered Census Bureau files. The paper also reports summary statistics for the percentage of cases that are imputed for key variables. Excluding plants with fewer than 5 employees, imputation rates for several key variables range from 8% to 54% for the manufacturing sector as a whole, and from 1% to 72% at the 2-digit SIC industry level.
-
Management in America
January 2013
Working Paper Number: CES-13-01
The Census Bureau recently conducted a survey of management practices in over 30,000 plants across the US, the first large-scale survey of management in America. Analyzing these data reveals several striking results. First, more structured management practices are tightly linked to better performance: establishments adopting more structured practices for performance monitoring, target setting and incentives enjoy greater productivity and profitability, higher rates of innovation and faster employment growth. Second, there is a substantial dispersion of management practices across establishments. We find that 18% of establishments have adopted at least 75% of these more structured management practices, while 27% of establishments have adopted fewer than 50% of them. Third, more structured management practices are more likely to be found in establishments that export, that are larger (or are part of bigger firms), and that have more educated employees. Establishments in the South and Midwest have more structured practices on average than those in the Northeast and West. Finally, we find that adoption of structured management practices increased between 2005 and 2010 for surviving establishments, particularly for those practices involving data collection and analysis.
-
Dynamically Consistent Noise Infusion and Partially Synthetic Data as Confidentiality Protection Measures for Related Time Series
July 2012
Working Paper Number: CES-12-13
The Census Bureau's Quarterly Workforce Indicators (QWI) provide detailed quarterly statistics on employment measures such as worker and job flows, tabulated by worker characteristics in various combinations. The data are released for several levels of NAICS industries and geography, the lowest aggregation of the latter being counties. Disclosure avoidance methods are required to protect the information about individuals and businesses that contribute to the underlying data. The QWI disclosure avoidance mechanism we describe here relies heavily on the use of noise infusion through a permanent multiplicative noise distortion factor, used for magnitudes, counts, differences and ratios. There is minimal suppression and no complementary suppressions. To our knowledge, the release in 2003 of the QWI was the first large-scale use of noise infusion in any official statistical product. We show that the released statistics are analytically valid along several critical dimensions: measures are unbiased and time series properties are preserved. We provide an analysis of the degree to which confidentiality is protected. Furthermore, we show how the judicious use of synthetic data, injected into the tabulation process, can completely eliminate suppressions, maintain analytical validity, and increase the protection of the underlying confidential data.
-
LEHD Data Documentation LEHD-OVERVIEW-S2008-rev1
December 2011
Working Paper Number: CES-11-43
-
Management Challenges of the 2010 U.S. Census
August 2011
Working Paper Number: CES-11-22
This paper gives an insider's perspective on the approaches used to manage the 2010 Census during its operational phase. It describes the approaches used, the challenges faced (in particular, difficulties in automating data collection), and the solutions applied to meet those challenges. Finally, six management lessons learned are presented.
-
Towards Unrestricted Public Use Business Microdata: The Synthetic Longitudinal Business Database
February 2011
Working Paper Number: CES-11-04
In most countries, national statistical agencies do not release establishment-level business microdata, because doing so represents too large a risk to establishments' confidentiality. One approach with the potential for overcoming these risks is to release synthetic data; that is, the released establishment data are simulated from statistical models designed to mimic the distributions of the underlying real microdata. In this article, we describe an application of this strategy to create a public use file for the Longitudinal Business Database, an annual economic census of establishments in the United States comprising more than 20 million records dating back to 1976. The U.S. Bureau of the Census and the Internal Revenue Service recently approved the release of these synthetic microdata for public use, making the synthetic Longitudinal Business Database the first-ever business microdata set publicly released in the United States. We describe how we created the synthetic data, evaluated analytical validity, and assessed disclosure risk.
-
The Center for Economic Studies 1982-2007: A Brief History
October 2009
Working Paper Number: CES-09-35
More than half a century ago, visionaries representing both the Census Bureau and the external research community laid the foundation for the Center for Economic Studies (CES) and the Research Data Center (RDC) system. They saw a clear need for a system meeting the inextricably related requirements of providing more and better information from existing Census Bureau data collections while preserving respondent confidentiality and privacy. CES opened in 1982 to house new longitudinal business databases, develop them further, and make them available to qualified researchers. CES and the RDC system evolved to meet the designers' requirements. Research at CES and the RDCs meets the commitments of the Census Bureau (and, recently, of other agencies) to preserving confidentiality while contributing paradigm-shifting fundamental research in a range of disciplines and up-to-the-minute critical tools for decision-makers.
-
Resolving the Tension Between Access and Confidentiality: Past Experience and Future Plans at the U.S. Census Bureau
September 2009
Working Paper Number: CES-09-33
This paper provides a historical context for access to U.S. Federal statistical data, with a primary focus on the U.S. Census Bureau. We review the various modes used by the Census Bureau to make data available to users and highlight the costs and benefits associated with each. We describe some of the specific improvements underway or under consideration at the Census Bureau to better serve its data users, and discuss the broad strategies employed by statistical agencies to respond to the challenges of data access.
-
Discretionary Disclosure in Financial Reporting: An Examination Comparing Internal Firm Data to Externally Reported Segment Data
September 2009
Working Paper Number: CES-09-28
We use confidential U.S. Census Bureau plant-level data to investigate aggregation in external reporting. We compare firms' plant-level data to their published segment reports, conducting our tests by grouping a firm's plants that share the same four-digit SIC code into a 'pseudo-segment.' We then determine whether that pseudo-segment is disclosed as an external segment, or whether it is subsumed into a different business unit for external reporting purposes. We find pseudo-segments are more likely to be aggregated within a line-of-business segment when the agency and proprietary costs of separately reporting the pseudo-segment are higher and when firm and pseudo-segment characteristics allow for more discretion in the application of segment reporting rules. For firms reporting multiple external segments, aggregation of pseudo-segments is driven by both agency and proprietary costs. However, for firms reporting a single external segment, we find no evidence of an agency cost motive for aggregation.