1. Introduction

The Longitudinal Employer-Household Dynamics (LEHD) data at the U.S. Census Bureau form a quarterly database of linked employer-employee data covering over 95% of employment in the United States.[1] The LEHD data are generated by merging previously collected survey and administrative data on jobs, businesses, and workers. Integrating administrative data with existing censuses and surveys yields a national longitudinal jobs database for the U.S. at very low cost and with no additional respondent burden. The LEHD microdata are available to approved researchers in the Federal Statistical Research Data Center (FSRDC) network. LEHD data are also available to the public in tabular form in the Census Bureau’s Quarterly Workforce Indicators (QWI), LEHD Origin-Destination Employment Statistics (LODES), Job-to-Job Flows (J2J), and Post-Secondary Employment Outcomes (PSEO) data.

To facilitate researcher use of the microdata, LEHD infrastructure files are periodically collected into a standardized form known as the LEHD Snapshot, which is then released into the FSRDC network. The purpose of this documentation and codebook is to provide a limited overview of the components of the LEHD Snapshot files, as well as detailed codebooks for researcher use. For an overview of the LEHD infrastructure, see Abowd et al. [2009]. This documentation is not intended to provide details of specific LEHD data products, or of the disclosure limitation measures used to produce those products. For more information on specific LEHD data products and their disclosure limitation techniques, see Abowd et al. [2006], Haney et al. [2017], and Machanavajjhala et al. [2008].

The LEHD Snapshot consists almost exclusively of restricted-use administrative microdata. Therefore, as with all projects using restricted data, any projects using the LEHD Snapshot must undergo standard review and approval processes. Researchers are required to demonstrate that they cannot conduct their research using public-use data sources, and they must identify the ways in which their project will benefit both the objectives of the U.S. Census Bureau and the broader research community. Before requesting access to the LEHD Snapshot, researchers are encouraged to examine the public data offerings of the LEHD program at https://lehd.ces.census.gov/data/, as well as the broader public offerings at https://data.census.gov/. For more details on the FSRDC program and the requirements for project approval, see the FSRDC about page as well as the approval guide below.

This documentation is current as of LEHD Snapshot release S2022_R2023Q4, which contains available data through 2022. Features may change across Snapshot releases. Updated versions of the Snapshot documentation are published in HTML format, and the most recent version is publicly available at https://lehd.ces.census.gov/data/lehd-snapshot-doc/latest/. References to documentation for earlier Snapshot releases can be found below. For details on how to cite the data and this documentation, see below.

1.1. Earnings Coverage

Covered private-industry employment in the LEHD data includes most corporate officials, all executives, all supervisory personnel, all professionals, all clerical workers, many farmworkers, all wage earners, all pieceworkers, and all part-time workers. Workers on paid sick leave, paid holiday, paid vacation, and the like are also covered. Workers on the payroll of more than one firm during the period are counted by each employer that is subject to unemployment insurance (UI), as long as those workers satisfy the preceding definition of employment. Workers have UI wages filed in every quarter they are covered, even though their wages may not be subject to UI tax in the latter months of the year.

Notable exclusions from UI coverage among private sector employers are independent contractors, the unincorporated self-employed, railroad workers covered by the railroad unemployment insurance system, some family employees of family-owned businesses, certain farm workers, students working for universities under certain cooperative programs, salespersons primarily paid on commission, and workers of some non-profits. States have some leeway in designating coverage. For a complete list, see the coverage section of the most recent Comparison of State UI Laws.

Covered employment in the LEHD data includes most employees of state and local governments with the exception of elected officials, members of a legislative body or members of the judiciary, and some emergency employees.

Federal government workers are not covered by state UI and do not appear in UI wage record data.

1.2. LEHD Confidential Microdata: Job, Employer, and Person-Level Files

The LEHD Snapshot is organized into sets of tables providing job-level, employer-level, and person-level data. The scope of each table is either national (all states) or a single state. National tables will have a US in the table name, while state tables will have the two-character state postal code. For purposes of this documentation, the postal code will be replaced with ZZ. Tables that are national in scope will not have the complete panel of states in all quarters because of limited availability of state job data both at the beginning and end of the time series. For more details, see Availability of Data.

1.2.1. Job-Level Files

Job-level data provide earnings for jobs, where a job is a link between a worker and a firm. The employment coverage in the LEHD job-level microdata files described below is UI-covered employment only (see Earnings Coverage for details on covered employment and wages). Researchers interested in federal jobs should request the Office of Personnel Management (OPM) microdata separately as part of their research request.

Job-Level Files in the LEHD Snapshot

| Table | Description | Scope | Key |
|---|---|---|---|
| Earnings Availability (EHF_ALL_AVAILABILITY) | Indicates data availability ranges for all states. | National | State |
| Earnings Indicators (EHF_US_INDICATORS) | Indicates whether or not a PIK had any earnings reported in any state in the LEHD system. | National | PIK, YEAR |
| Employment History File (EHF_ZZ) | Quarterly earnings for each job in a state as reported by the employer to the state’s UI system. | State | PIK, SEIN, SEINUNIT, YEAR |
| Job History File (JHF_ZZ) | Wide version of earnings history that contains imputed establishment (SEINUNITs) for multi-unit firms, and a job-level ID useful for tracking job spells across employer identifier changes. | State | PIK, SEIN, SPELL_U2W |

1.2.2. Employer-Level Files

Employer-level data contain characteristics of firms, as well as summary measures calculated from job data.

Employer-Level Files in the LEHD Snapshot

| Table | Description | Scope | Key |
|---|---|---|---|
| Employer Characteristics File, Firm (ECF_ZZ_SEIN) | Sourced from the SEINUNIT file, this file rolls up employer characteristics to the SEIN level. | State | SEIN, YEAR, QUARTER |
| Employer Characteristics File, Establishment (ECF_ZZ_SEINUNIT) | An establishment level file with industry, size, and location. | State | SEIN, YEAR, QUARTER, SEINUNIT |
| Employer Characteristics File, National Firm (ECF_ZZ_SEIN_T26) | Contains the national firm age and size data for employers in the ECF. This file requires IRS approval. | State | SEIN, YEAR, QUARTER |
| Quarterly Workforce Indicators (QWI_ZZ_SEINUNIT) | The microdata version of the public use QWI data, containing establishment-level hires, separations, net job growth, and average earnings. | State | SEIN, YEAR, QUARTER, SEINUNIT |
| Successor-Predecessor File (SPF_ZZ) | Flows of separations and hires between employers (SEINs) used to identify successor-predecessor relationships. | State | SEIN, SEIN_SUCC, QTIME |

1.2.3. Person-Level Files

Person-level tables provide demographic characteristics of individuals and observed residence geography.

Person-Level Files in the LEHD Snapshot

| Table | Description | Scope | Key |
|---|---|---|---|
| Individual Characteristics File (ICF_US) | Demographics, place of birth, and education of workers. | National | PIK |
| Date of Birth/Sex/Place of Birth Implicates (ICF_US_IMPLICATES_AGE_SEX_POB) | Multiply imputed date of birth, sex, and place of birth variables to complete missing information. | National | PIK |
| Education Implicates (ICF_US_IMPLICATES_EDUCATION) | Multiply imputed education variables to complete missing information. | National | PIK |
| Race/Ethnicity Implicates (ICF_US_IMPLICATES_RACE_ETHNICITY) | Multiply imputed race and ethnicity variables to complete missing information. | National | PIK |
| Residence Geography, 1999-2011 (ICF_US_RESIDENCE_CPR) | Residence geography prior to 2012 for individuals found in the wage data. | National | PIK, ADDRESS_YEAR |
| Residence Geography, 2012 forward (ICF_US_RESIDENCE_RCF) | Residence geography from 2012 forward for individuals found in the wage data. | National | PIK, ADDRESS_YEAR |

1.3. Linking the job, employer, and person data

  • To link worker demographics to the jobs data, link the EHF or JHF to the ICF_US table via the PIK.

  • To attach the characteristics of the employer (SEIN) to the job, link the EHF or JHF to the ECF_ZZ_SEIN table using the SEIN.

  • To attach the characteristics of the establishment to the job, use the first implicate of the worker-to-establishment imputation (SEINUNIT1) on the JHF to link to the ECF_ZZ_SEINUNIT table.

    • All implicates (SEINUNIT1-SEINUNIT10) can be used to produce standard errors that account for imputation variability [McKinney et al., 2021].

    • Because of longitudinal restrictions on establishment imputation, jobs may be active at establishments in quarters that the establishment does not appear on the ECF. See Adding Establishment Characteristics for more information.
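The first two links above can be sketched as SQL joins. The following Python example is a minimal illustration using an in-memory SQLite database, not the layout of the actual Snapshot files. Only the key names PIK, SEIN, YEAR, and QUARTER come from the tables in this documentation. The long (one row per quarter) job table, the columns earnings, sex, and industry, and all values are made-up assumptions for illustration. Matching the employer table on year and quarter in addition to SEIN is also an assumption, based on the key listed for ECF_ZZ_SEIN.

```python
import sqlite3

# Toy stand-ins for three Snapshot tables. The columns earnings, sex, and
# industry and every value below are hypothetical, not real LEHD content.
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE ehf_zz (
        pik TEXT, sein TEXT, year INTEGER, quarter INTEGER, earnings REAL
    );
    CREATE TABLE icf_us (pik TEXT PRIMARY KEY, sex TEXT);
    CREATE TABLE ecf_zz_sein (
        sein TEXT, year INTEGER, quarter INTEGER, industry TEXT,
        PRIMARY KEY (sein, year, quarter)
    );
    """
)
con.executemany(
    "INSERT INTO ehf_zz VALUES (?, ?, ?, ?, ?)",
    [("P1", "S1", 2020, 1, 1000.0),
     ("P1", "S1", 2020, 2, 1100.0),
     ("P2", "S2", 2020, 1, 900.0)],
)
con.executemany("INSERT INTO icf_us VALUES (?, ?)", [("P1", "F"), ("P2", "M")])
con.executemany(
    "INSERT INTO ecf_zz_sein VALUES (?, ?, ?, ?)",
    [("S1", 2020, 1, "A"), ("S1", 2020, 2, "A")],  # employer S2 is absent
)

# Left joins keep every job record even when no person or employer
# record matches; unmatched characteristics come back as NULL (None).
linked = con.execute(
    """
    SELECT j.pik, j.sein, j.year, j.quarter, j.earnings, p.sex, e.industry
    FROM ehf_zz AS j
    LEFT JOIN icf_us AS p ON p.pik = j.pik
    LEFT JOIN ecf_zz_sein AS e
           ON e.sein = j.sein AND e.year = j.year AND e.quarter = j.quarter
    ORDER BY j.pik, j.year, j.quarter
    """
).fetchall()

for row in linked:
    print(row)
```

A left join is used here so that job records without a matching employer record are retained rather than silently dropped, which makes unmatched quarters easy to count before deciding how to handle them.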

1.4. Approvals Needed for FSRDC Research

In addition to Census data, the administrative records that combine to form the LEHD infrastructure derive from several different federal and state agencies. Approval of a proposed FSRDC project by other agencies may be required, as follows:

  • Requests for state-level tables will follow the procedure specified by the Memorandum of Understanding (MOU) with that state. Most commonly, the state agency is given the opportunity to approve proposed projects.

  • Person-level demographic characteristics (ICF) require the approval of the Social Security Administration (SSA).

  • Employer-level Title 26 (T26) data requires the additional approval of the Internal Revenue Service (IRS). This includes the Federal EIN, as well as age and size of the national firm.

The codebook for each table will indicate which approvals are required.

The job-level tables largely consist of state-provided data and so go through the LEHD state data owner approval process. The indicator and availability files do not require state approval. There is no federal tax information or Social Security administrative data in any of the job-level tables, so no additional IRS or SSA approvals are required.

1.5. Snapshot Packages in the FSRDC

LEHD Snapshot tables are delivered in packages according to the level of data in the tables. Tables that require additional approvals are provided in separate packages.

Snapshot Packages in the FSRDC

| Package | Tables |
|---|---|
| Jobs Data | EHF_ZZ, JHF_ZZ, EHF_US_INDICATORS, EHF_ALL_AVAILABILITY |
| Employer Data (non-T26) | ECF_ZZ_SEIN, ECF_ZZ_SEINUNIT, SPF_ZZ, QWI_ZZ_SEINUNIT |
| Employer Data (T26) | ECF_ZZ_SEIN_T26 |
| Person Demographics (Workers Only) | ICF_US, ICF_US_IMPLICATES_AGE_SEX_POB, ICF_US_IMPLICATES_RACE_ETHNICITY, ICF_US_IMPLICATES_EDUCATION |
| Person Residence (Workers Only) | ICF_US_RESIDENCE_CPR, ICF_US_RESIDENCE_RCF |

1.6. Quarter Indexing - qtime

On several LEHD tables, the year and quarter are replaced by or supplemented with an index (qtime) that simplifies counting quarters in longitudinal references. The index is initialized so that 1985Q1 equals 1. The following SAS formulae can be used to switch back and forth:

  • qtime = (year-1985) * 4 + quarter

  • year = int((qtime-1)/4) + 1985

  • quarter = mod(qtime-1, 4) + 1

For a table of all possible qtime values with year/quarter references, see the codebook.
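For researchers working outside SAS, the same conversions can be written in Python. The sketch below transcribes the three formulae above directly. The function names to_qtime and from_qtime are hypothetical and not part of any LEHD tooling. Python's divmod uses floor division, which agrees with the SAS int and mod functions for every qtime of 1 or greater.

```python
def to_qtime(year, quarter):
    """Return the qtime index for a year and quarter, with 1985Q1 = 1."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    return (year - 1985) * 4 + quarter


def from_qtime(qtime):
    """Return the (year, quarter) pair corresponding to a qtime index."""
    # divmod gives the whole years elapsed since 1985 and the
    # zero-based quarter within that year in a single step.
    years_elapsed, quarter_index = divmod(qtime - 1, 4)
    return years_elapsed + 1985, quarter_index + 1


print(to_qtime(1985, 1))   # first quarter of the index
print(to_qtime(2023, 1))
print(from_qtime(153))
```

Because the index counts quarters consecutively, the number of quarters between two dates is simply the difference of their qtime values.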

1.7. Availability of Data

The longitudinal availability of data in the LEHD Snapshot will vary by type of data and the data source, as described in this section.

1.7.1. Job and Employer Data (State Provided)

The following table indicates the availability of LEHD data by state for the EHF, ECF, and QWI tables. This includes all states that are or have been members of the Local Employment Dynamics (LED) partnership. In general, LEHD Snapshot release S2022_R2023Q4 contains all available data through 2022.

  • The EHF range represents all of the quarters of UI wage data that have been received.

  • The QWI range may begin later than the EHF range if certain data quarters do not meet quality standards for publication in LED data products.

    • Earnings quarters on the JHF are restricted to the QWI range.

  • The ECF range may start before the EHF range if employer-level data submissions (QCEW) are available prior to the beginning of the UI wage data series.

  • End dates will usually be the same for all tables within a state. States that are not active partners may not be available for the latest quarters.

Available Quarters of LEHD State Provided Data

| State | ECF Start | EHF Start | QWI/JHF Start | ECF End | EHF End | QWI/JHF End |
|---|---|---|---|---|---|---|
| Alabama | 2001Q1 | 2001Q1 | 2001Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Alaska | 1990Q1 | 1990Q1 | 2000Q1 | 2016Q2 | 2016Q2 | 2016Q2 |
| Arizona | 1992Q1 | 1992Q1 | 2004Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Arkansas | 2002Q3 | 2002Q3 | 2002Q3 | 2018Q2 | 2018Q2 | 2018Q2 |
| California | 1991Q1 | 1991Q3 | 1991Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| Colorado | 1990Q1 | 1990Q1 | 1993Q2 | 2023Q1 | 2023Q1 | 2023Q1 |
| Connecticut | 1996Q1 | 1996Q1 | 1996Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Delaware | 1997Q1 | 1998Q3 | 1998Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| District of Columbia | 2000Q4 | 2002Q2 | 2005Q2 | 2023Q1 | 2023Q1 | 2023Q1 |
| Florida | 1989Q1 | 1992Q4 | 1997Q4 | 2023Q1 | 2023Q1 | 2023Q1 |
| Georgia | 1994Q1 | 1994Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Hawaii | 1995Q4 | 1995Q4 | 1995Q4 | 2023Q1 | 2023Q1 | 2023Q1 |
| Idaho | 1990Q1 | 1990Q1 | 1991Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Illinois | 1990Q1 | 1990Q1 | 1993Q2 | 2023Q1 | 2023Q1 | 2023Q1 |
| Indiana | 1990Q1 | 1990Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Iowa | 1990Q1 | 1998Q4 | 1998Q4 | 2023Q1 | 2023Q1 | 2023Q1 |
| Kansas | 1990Q1 | 1990Q1 | 1993Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Kentucky | 1996Q4 | 1996Q4 | 2001Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Louisiana | 1990Q1 | 1990Q1 | 1995Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Maine | 1996Q1 | 1996Q1 | 1996Q2 | 2023Q1 | 2023Q1 | 2023Q1 |
| Maryland | 1985Q4 | 1985Q4 | 1990Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Massachusetts | 2002Q1 | 2002Q1 | 2010Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Michigan | 1998Q1 | 1998Q1 | 2000Q3 | 2021Q4 | 2021Q4 | 2021Q4 |
| Minnesota | 1994Q3 | 1994Q3 | 1994Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| Mississippi | 2003Q3 | 2003Q3 | 2003Q3 | 2018Q2 | 2018Q2 | 2018Q2 |
| Missouri | 1990Q1 | 1990Q1 | 1995Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Montana | 1993Q1 | 1993Q1 | 1993Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Nebraska | 1999Q1 | 1999Q1 | 1999Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Nevada | 1998Q1 | 1998Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| New Hampshire | 2003Q1 | 2003Q1 | 2003Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| New Jersey | 1995Q1 | 1996Q1 | 1996Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| New Mexico | 1990Q1 | 1995Q3 | 1995Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| New York | 1990Q1 | 1995Q1 | 2000Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| North Carolina | 1990Q1 | 1991Q1 | 1992Q4 | 2022Q3 | 2022Q3 | 2022Q3 |
| North Dakota | 1998Q1 | 1998Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Ohio | 1994Q1 | 1994Q1 | 2000Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Oklahoma | 1999Q1 | 2000Q1 | 2000Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Oregon | 1990Q1 | 1991Q1 | 1991Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Pennsylvania | 1991Q1 | 1991Q1 | 1997Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Puerto Rico | 2001Q1 | 2001Q1 | 2010Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Rhode Island | 1990Q1 | 1995Q1 | 1995Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| South Carolina | 1998Q1 | 1998Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| South Dakota | 1994Q1 | 1994Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Tennessee | 1998Q1 | 1998Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Texas | 1990Q1 | 1995Q1 | 1995Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Utah | 1990Q1 | 1999Q1 | 1999Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| Vermont | 2000Q1 | 2000Q1 | 2000Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Virginia | 1995Q3 | 1998Q1 | 1998Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| Washington | 1990Q1 | 1990Q1 | 1990Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| West Virginia | 1990Q1 | 1997Q1 | 1997Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Wisconsin | 1990Q1 | 1990Q1 | 1990Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Wyoming | 1992Q1 | 1992Q1 | 2001Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
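As an illustration of how the ranges in the table above might be checked in code, the following Python sketch converts quarter labels such as 2000Q1 to qtime values (using the formula from the Quarter Indexing section) and tests whether a quarter falls inside a given range. The function names and the dictionary layout are hypothetical. Only the Alaska row is copied from the table; other states would need to be entered the same way.

```python
import re


def label_to_qtime(label):
    """Convert a label such as '2000Q1' to the qtime index (1985Q1 = 1)."""
    match = re.fullmatch(r"(\d{4})Q([1-4])", label)
    if match is None:
        raise ValueError(f"not a YYYYQn label: {label!r}")
    year, quarter = int(match.group(1)), int(match.group(2))
    return (year - 1985) * 4 + quarter


def in_range(label, start, end):
    """Return True when label lies between start and end, inclusive."""
    return label_to_qtime(start) <= label_to_qtime(label) <= label_to_qtime(end)


# Alaska row from the availability table: (start, end) for each file type.
ALASKA = {
    "ECF": ("1990Q1", "2016Q2"),
    "EHF": ("1990Q1", "2016Q2"),
    "QWI/JHF": ("2000Q1", "2016Q2"),
}

# 1995Q3 falls inside the EHF range but before the QWI/JHF range begins.
print(in_range("1995Q3", *ALASKA["EHF"]))
print(in_range("1995Q3", *ALASKA["QWI/JHF"]))
```

Comparing qtime values rather than label strings keeps the check correct even if labels arrive in a different textual format and are normalized first.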

1.7.2. Residence Data

Annual residence data is provided for workers who appear in the Job data. Residences are sourced from the Composite Person Record (CPR) tables or the Residential Candidates File (RCF). The year ranges for which each data source is available follow.

Available Years of LEHD Residence Data

| Source | Start Year | End Year |
|---|---|---|
| CPR | 1999 | 2011 |
| RCF | 2012 | 2021 |
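Because residence records are split across two tables by year, code that pulls residences for a range of years has to pick the right table for each year. The Python sketch below shows one way to do that, using the year ranges listed above and the table names from the Person-Level Files table. The function name residence_table_for is hypothetical.

```python
# Year ranges from the residence availability table, keyed by the
# person-level table that holds each source (CPR or RCF).
RESIDENCE_SOURCES = {
    "ICF_US_RESIDENCE_CPR": (1999, 2011),
    "ICF_US_RESIDENCE_RCF": (2012, 2021),
}


def residence_table_for(address_year):
    """Return the residence table covering address_year, or None."""
    for table, (start_year, end_year) in RESIDENCE_SOURCES.items():
        if start_year <= address_year <= end_year:
            return table
    return None


for year in (1998, 2005, 2012, 2021, 2022):
    print(year, residence_table_for(year))
```

Returning None for years outside both ranges makes gaps explicit, so a caller can decide whether to skip those years or report them.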

1.8. Previous Versions of Snapshot Documentation

Links to previous versions of documentation can be accessed in the table below. Please note that documentation for Snapshot versions prior to S2021 is in working paper format.

Previous Versions of Snapshot Documentation

| Version | Link |
|---|---|
| S2022_R2023Q4 | 2022 Snapshot |
| S2021_R2022Q4 | 2021 Snapshot |
| S2014 | Vilhuber [2018] |
| S2011 | Vilhuber and McKinney [2014] |
| S2008 | McKinney and Vilhuber [2011] |
| S2004 | McKinney and Vilhuber [2011] |

1.9. Citing the Data, Sponsors, and Documentation

1.9.1. Acknowledging the Sponsors

The LEHD Snapshot draws on a data infrastructure that received substantial funding from a number of funding agencies and foundations. We strongly encourage researchers to acknowledge that funding in their paper’s “Acknowledgements” or data appendix. The following statement may be used:

This research uses data from the Census Bureau’s Longitudinal Employer-Household Dynamics program, which was partially supported by the following National Science Foundation grants: SES-9978093, SES-0339191, and ITR-0427889; National Institute on Aging grant AG018854; and grants from the Alfred P. Sloan Foundation.

1.9.2. Citing the Data

The data may be cited as follows:

Suggested citation

  • U.S. Census Bureau. LEHD Snapshot Release S2022. [Computer file], U.S. Census Bureau, Center for Economic Studies, Research Data Centers [distributor], Washington, DC, 2023.

1.9.3. Data Access Statement

Many journals have adopted stringent data availability requirements. Researchers will need to work with the Census Bureau in order to ensure availability of their programs and research extracts. A statement similar to the following has been successfully used for accepted papers:

The data used for this paper were prepared in the U.S. Census Bureau’s secure computing facilities under an authorized project using the Federal Statistical Research Data Center network. The exact analysis files have been fully archived so that the programming sequence submitted in compliance with the [JOURNAL]’s editorial policy can be run in its entirety, except for the component that extracts the analysis sample from the underlying confidential databases. I grant any researchers with appropriate Census-approved project permission to use my exact research files provided that those files were among the ones that they requested when the approval was obtained (a Census Bureau requirement). In compliance with the [JOURNAL]’s editorial policy, I am submitting the list of those files, and the last known location of the archive on the Census Bureau’s RDC network as of [date]. I authorize the editorial staff of the [JOURNAL] to release this list and my statement of cooperation to any researcher who requests it, as well as to the U.S. Census Bureau or any agency cooperating with the Census Bureau in supervising research that uses the restricted-access data that I have used.

1.9.4. Citing this Documentation

The latest version of this documentation is published in HTML format, and is publicly available at https://lehd.ces.census.gov/data/lehd-snapshot-doc/latest/. However, we suggest that you cite this documentation from its latest working paper publication as follows:

Suggested citation

  • Matthew Graham, Erika McEntarfer, Kevin McKinney, Stephen Tibbets, and Lee Tucker. LEHD Snapshot Documentation. Working Papers 22-51, Center for Economic Studies, U.S. Census Bureau, November 2022. URL: https://ideas.repec.org/p/cen/wpaper/22-51.html.

[1] This research describes data from the Census Bureau’s Longitudinal Employer-Household Dynamics program, the original creation of which was partially supported by the following National Science Foundation (NSF) grants: SES-9978093, SES-0339191, and ITR-0427889; National Institute on Aging grant AG018854; and grants from the Alfred P. Sloan Foundation. Additionally, the current authors acknowledge the extensive contributions made over the years by many individuals to the cumulative knowledge reflected in this document, too many to adequately enumerate here. This document has been reviewed to ensure that no confidential information is disclosed.