The Census Bureau produces two complementary data products, the American Community Survey (ACS) commuting and workplace data and the Longitudinal Employer-Household Dynamics (LEHD) Origin-Destination Employment Statistics (LODES), which can be used to answer spatial, economic, and demographic questions about workplaces and home-to-work flows. The products are complementary in the sense that they measure similar activities, but each has important unique characteristics that provide information the other cannot. In response to questions from data users, the Census Bureau has created this document to highlight the major design differences between these two data products. This report guides users on the relative advantages of each data product for various analyses and helps explain differences that may arise when using the products.
As an overview, these two data products are sourced from different inputs, cover different populations and time periods, are subject to different sets of edits and imputations, are released under different confidentiality protection mechanisms, and are tabulated at different geographic and characteristic levels. As a general rule, the two data products should not be expected to match exactly for arbitrary queries and may differ substantially for some queries.
Within this document, we compare the two data products by the design elements that were deemed most likely to contribute to differences in tabulated data. These elements are: Collection, Coverage, Geographic and Longitudinal Scope, Job Definition and Reference Period, Job and Worker Characteristics, Location Definitions (Workplace and Residence), Completeness of Geographic Information and Edits/Imputations, Geographic Tabulation Levels, Control Totals, Confidentiality Protection and Suppression, and Related Public-Use Data Products.
An in-depth data analysis of the two data products, in aggregate or with the microdata, will be the subject of a future technical report. The Census Bureau has begun a pilot project to integrate ACS microdata with LEHD administrative data to develop an enhanced frame of employment status, place of work, and commuting. The Census Bureau will publish quality metrics for person match rates, residence and workplace match rates, and commute distance comparisons.
-
Two Perspectives on Commuting: A Comparison of Home to Work Flows Across Job-Linked Survey and Administrative Files
January 2017
Working Paper Number: CES-17-34
Commuting flows and workplace employment data have a wide constituency of users including urban and regional planners, social science and transportation researchers, and businesses. The U.S. Census Bureau releases two national data products that give the magnitude and characteristics of home-to-work flows. The American Community Survey (ACS) tabulates households' responses on employment, workplace, and commuting behavior. The Longitudinal Employer-Household Dynamics (LEHD) program tabulates administrative records on jobs in the LEHD Origin-Destination Employment Statistics (LODES). Design differences across the datasets lead to divergence in a comparable statistic: county-to-county aggregate commute flows. To understand differences in the public use data, this study compares ACS and LEHD source files, using identifying information and probabilistic matching to join person and job records. In our assessment, we compare commuting statistics for job frames linked on person, employment status, employer, and workplace and we identify person and job characteristics as well as design features of the data frames that explain aggregate differences. We find a lower rate of within-county commuting and farther commutes in LODES. We attribute these greater distances to differences in workplace reporting and to uncertainty of establishment assignments in LEHD for workers at multi-unit employers. Minor contributing factors include differences in residence location and ACS workplace edits. The results of this analysis and the data infrastructure developed will support further work to understand and enhance commuting statistics in both datasets.
-
Developing a Residence Candidate File for Use With Employer-Employee Matched Data
January 2017
Working Paper Number: CES-17-40
This paper describes the Longitudinal Employer-Household Dynamics (LEHD) program's ongoing efforts to use administrative records in a predictive model that describes residence locations for workers. This project was motivated by the discontinuation of a residence file produced elsewhere at the U.S. Census Bureau. The goal of the Residence Candidate File (RCF) process is to provide the LEHD Infrastructure Files with residence information that maintains currency with the changing state of administrative sources and represents uncertainty in location as a probability distribution. The discontinued file provided only a single residence per person/year, even when contributing administrative data may have contained multiple residences. This paper describes the motivation for the project, our methodology, the administrative data sources, the model estimation and validation results, and the file specifications. We find that the best prediction of the person-place model provides similar, but superior, accuracy compared with previous methods and performs well for workers in the LEHD jobs frame. We outline possibilities for further improvement in sources and modeling as well as recommendations on how to use the preference weights in downstream processing.
-
The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators
January 2006
Working Paper Number: tp-2006-01
The Longitudinal Employer-Household Dynamics (LEHD) Program at the U.S. Census Bureau,
with the support of several national research agencies, has built a set of infrastructure files
using administrative data provided by state agencies, enhanced with information from other administrative
data sources, demographic and economic (business) surveys and censuses. The LEHD
Infrastructure Files provide a detailed and comprehensive picture of workers, employers, and their
interaction in the U.S. economy. Beginning in 2003 and building on this infrastructure, the Census
Bureau has published the Quarterly Workforce Indicators (QWI), a new collection of data series
that offers unprecedented detail on the local dynamics of labor markets. Despite the fine detail,
confidentiality is maintained due to the application of state-of-the-art confidentiality protection
methods. This article describes how the input files are compiled and combined to create the infrastructure
files. We describe the multiple imputation methods used to impute missing data and
the statistical matching techniques used to combine and edit data when a direct identifier match
requires improvement. Both of these innovations are crucial to the success of the final product. Finally,
we pay special attention to the details of the confidentiality protection system used to protect
the identity and micro data values of the underlying entities used to form the published estimates.
We provide a brief description of public-use and restricted-access data files with pointers to further
documentation for researchers interested in using these data.
-
LEHD Snapshot Documentation, Release S2021_R2022Q4
November 2022
Working Paper Number: CES-22-51
The Longitudinal Employer-Household Dynamics (LEHD) data at the U.S. Census Bureau is a quarterly database of linked employer-employee data covering over 95% of employment in the United States. These data are used to produce a number of public-use tabulations and tools, including the Quarterly Workforce Indicators (QWI), LEHD Origin-Destination Employment Statistics (LODES), Job-to-Job Flows (J2J), and Post-Secondary Employment Outcomes (PSEO) data products. Researchers on approved projects may also access the underlying LEHD microdata directly, in the form of the LEHD Snapshot restricted-use data product. This document provides a detailed overview of the LEHD Snapshot as of release S2021_R2022Q4, including user guidance, variable codebooks, and an overview of the approvals needed to obtain access. Updates to the documentation for this and future snapshot releases will be made available in HTML format on the LEHD website.
-
Social, Economic, Spatial, and Commuting Patterns of Informal Jobholders
April 2007
Working Paper Number: tp-2007-02
A significant number of employees within the United States can be considered "informal" or
"off-the-books" workers. These workers, who by definition do not appear in administrative wage
records, are distinct from the larger group of private jobholders who do appear in administrative
records. However, while socioeconomic and spatial information on these individuals is readily
available in standard datasets, such as the 2000 Decennial Census Long Form, it is not possible
to identify the informal workers by only using such data because of the lack of accurate, formal
wage records. This study takes advantage of firm-based data that originates in Unemployment
Insurance administrative wage records linked with the Census Bureau's household-based data in
order to examine informal jobholders by their demographic characteristics as well as their
economic, commuting, and spatial location outcomes. In addition, this report evaluates whether
informal jobholders should be included explicitly in future labor-workforce analyses and
transportation modeling. The analyses in this report use the sample of workers who lived in Los
Angeles County, California.
-
Disclosure Limitation and Confidentiality Protection in Linked Data
January 2018
Working Paper Number: CES-18-07
Confidentiality protection for linked administrative data is a combination of access modalities and statistical disclosure limitation. We review traditional statistical disclosure limitation methods and newer methods based on synthetic data, input noise infusion and formal privacy. We discuss how these methods are integrated with access modalities by providing three detailed examples. The first example is the linkages in the Health and Retirement Study to Social Security Administration data. The second example is the linkage of the Survey of Income and Program Participation to administrative data from the Internal Revenue Service and the Social Security Administration. The third example is the Longitudinal Employer-Household Dynamics data, which links state unemployment insurance records for workers and firms to a wide variety of censuses and surveys at the U.S. Census Bureau. For each example, we discuss access modalities, disclosure limitation methods, the effectiveness of those methods, and the resulting analytical validity. The final sections discuss recent advances in access modalities for linked administrative data.
-
Confidentiality Protection in the Census Bureau Quarterly Workforce Indicators
February 2006
Working Paper Number: tp-2006-02
The Quarterly Workforce Indicators are new estimates developed by the Census Bureau's Longitudinal
Employer-Household Dynamics Program as a part of its Local Employment Dynamics
partnership with 37 state Labor Market Information offices. These data provide detailed quarterly
statistics on employment, accessions, layoffs, hires, separations, full-quarter employment
(and related flows), job creations, job destructions, and earnings (for flow and stock categories of
workers). The data are released for NAICS industries (and 4-digit SICs) at the county, workforce
investment board, and metropolitan area levels of geography. The confidential microdata - unemployment
insurance wage records, ES-202 establishment employment, and Title 13 demographic
and economic information - are protected using a permanent multiplicative noise distortion factor.
This factor distorts all input sums, counts, differences and ratios. The released statistics are analytically
valid - measures are unbiased and time series properties are preserved. The confidentiality
protection is manifested in the release of some statistics that are flagged as "significantly distorted
to preserve confidentiality." These statistics differ from the undistorted statistics by a significant
proportion. Even for the significantly distorted statistics, the data remain analytically valid for
time series properties. The released data can be aggregated; however, published aggregates are
less distorted than custom postrelease aggregates. In addition to the multiplicative noise distortion,
confidentiality protection is provided by the estimation process for the QWIs, which multiply imputes
all missing data (including missing establishment, given UI account, in the UI wage record
data) and dynamically re-weights the establishment data to provide state-level comparability with
the BLS's Quarterly Census of Employment and Wages.
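
The idea of a permanent multiplicative noise factor can be illustrated with a minimal sketch. It assumes each establishment receives one factor, drawn once and reused in every period, that is bounded away from 1 on both sides. The uniform two-sided distribution, the bounds `a` and `b`, and all function names are illustrative assumptions, not the Census Bureau's actual parameters or code.

```python
import random


def draw_noise_factor(rng, a=0.05, b=0.15):
    """Draw one factor from [1-b, 1-a] or [1+a, 1+b] (assumed bounds)."""
    magnitude = rng.uniform(a, b)   # distance from 1, never smaller than a
    sign = rng.choice((-1, 1))      # inflate or deflate with equal probability
    return 1.0 + sign * magnitude


def assign_factors(establishment_ids, seed=0):
    """Assign each establishment a factor once; it is reused in every period."""
    rng = random.Random(seed)
    return {eid: draw_noise_factor(rng) for eid in establishment_ids}


def distorted_total(employment, factors):
    """Sum establishment counts after applying each establishment's factor."""
    return sum(factors[eid] * count for eid, count in employment.items())


# Hypothetical establishments and counts, for illustration only.
factors = assign_factors(["A", "B", "C"])
q1 = {"A": 100, "B": 40, "C": 10}
q2 = {"A": 110, "B": 44, "C": 11}   # every establishment grows by 10%

growth = distorted_total(q2, factors) / distorted_total(q1, factors)
print(f"distorted growth ratio: {growth:.4f}")
```

Because the same factor multiplies an establishment's value in both periods, a uniform 10% growth is reproduced in the distorted totals, which is one simple sense in which time series properties survive the distortion.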
-
Social, Economic, Spatial, and Commuting Patterns of Dual Jobholders
April 2007
Working Paper Number: tp-2007-01
Individuals who hold multiple jobs have complex working lives and complex commuting
patterns. Economic and spatial information on these individuals is not readily available in
standard datasets, such as the 2000 Decennial Census Long Form, because the survey questions
were not designed to collect details on multiple jobs. This study takes advantage of firm-based
data from the Unemployment Insurance administrative wage records, linked with the Census
Bureau's household-based data, to examine multiple jobholders - and specifically a sentinel
group of dual jobholders. The study uses a sample from Los Angeles County, California and
examines the dual jobholders by their demographic characteristics as well as their economic,
commuting, and spatial location outcomes. In addition, this report evaluates whether multiple
jobholders should be included explicitly in future labor-workforce analyses and transportation
modeling.
-
Access Methods for United States Microdata
August 2007
Working Paper Number: CES-07-25
Beyond the traditional methods of tabulations and public-use microdata samples, statistical agencies have developed four key alternatives for providing non-government researchers with access to confidential microdata to improve statistical modeling. The first, licensing, allows qualified researchers access to confidential microdata at their own facilities, provided certain security requirements are met. The second, statistical data enclaves, offer qualified researchers restricted access to confidential economic and demographic data at specific agency-controlled locations. Third, statistical agencies can offer remote access, through a computer interface, to the confidential data under automated or manual controls. Fourth, synthetic data developed from the original data but retaining the correlations in the original data have the potential for allowing a wide range of analyses.
-
Disclosure Avoidance Techniques Used for the 1970 through 2010 Decennial Censuses of Population and Housing
November 2018
Working Paper Number: CES-18-47
The U.S. Census Bureau conducts the decennial censuses under Title 13 of the U.S. Code with the Section 9 mandate to not "use the information furnished under the provisions of this title for any purpose other than the statistical purposes for which it is supplied; or make any publication whereby the data furnished by any particular establishment or individual under this title can be identified; or permit anyone other than the sworn officers and employees of the Department or bureau or agency thereof to examine the individual reports" (13 U.S.C. § 9 (2007)). The Census Bureau applies disclosure avoidance techniques to its publicly released statistical products in order to protect the confidentiality of its respondents and their data.