1. Introduction
The Longitudinal Employer-Household Dynamics (LEHD) data at the U.S. Census Bureau is a quarterly database of linked employer-employee data covering over 95% of employment in the United States.[1] The LEHD data are generated by merging previously collected survey and administrative data on jobs, businesses, and workers. By integrating administrative data with existing censuses and surveys, a national longitudinal jobs database for the U.S. is generated at very low cost and with no additional respondent burden. The LEHD microdata are available to approved researchers in the Federal Statistical Research Data Center (FSRDC) network. LEHD data are also available to the public in tabular form in the Census Bureau’s Quarterly Workforce Indicators (QWI), LEHD Origin-Destination Employment Statistics (LODES), Job-to-Job Flows (J2J), and Post-Secondary Employment Outcomes (PSEO) data.
To facilitate researcher use of the microdata, LEHD infrastructure files are periodically collected into a standardized form known as the LEHD Snapshot, which is then released into the FSRDC network. The purpose of this documentation and codebook is to provide a limited overview of the components of the LEHD Snapshot files, as well as detailed codebooks for researcher use. For an overview of the LEHD infrastructure, see Abowd et al. [2009]. This documentation is not intended to provide details of specific LEHD data products, or of the disclosure limitation measures used to produce those products. For more information on specific LEHD data products and their disclosure limitation techniques, see Abowd et al. [2006], Haney et al. [2017], and Machanavajjhala et al. [2008].
The LEHD Snapshot consists almost exclusively of restricted-use administrative microdata. Therefore, as with all projects using restricted data, any projects using the LEHD Snapshot must undergo standard review and approval processes. Researchers are required to demonstrate that they cannot conduct their research using public-use data sources, and they must identify the ways in which their project will benefit both the objectives of the U.S. Census Bureau and the broader research community. Before requesting access to the LEHD Snapshot, researchers are encouraged to examine the public data offerings of the LEHD program at https://lehd.ces.census.gov/data/, as well as the broader public offerings at https://data.census.gov/. For more details on the FSRDC program and the requirements for project approval, see the FSRDC about page as well as the approval guide below.
This documentation is current as of LEHD Snapshot release S2022_R2023Q4, which contains available data through 2022. Features may change across Snapshot releases. Updated versions of the Snapshot documentation are published in HTML format, and the most recent version is publicly available at https://lehd.ces.census.gov/data/lehd-snapshot-doc/latest/. References to documentation for earlier Snapshot releases can be found below. For details on how to cite the data and this documentation, see below.
1.1. Earnings Coverage
Covered private-industry employment in the LEHD data includes most corporate officials, all executives, all supervisory personnel, all professionals, all clerical workers, many farmworkers, all wage earners, all pieceworkers, and all part-time workers. Workers on paid sick leave, paid holiday, paid vacation, and the like are also covered. Workers on the payroll of more than one firm during the period are counted by each employer that is subject to UI, as long as those workers satisfy the preceding definition of employment. Workers have UI wages filed in every quarter they are covered, even though their wages may not be subject to UI tax in the latter months of the year.
Notable exclusions from UI coverage among private sector employers are independent contractors, the unincorporated self-employed, railroad workers covered by the railroad unemployment insurance system, some family employees of family-owned businesses, certain farm workers, students working for universities under certain cooperative programs, salespersons primarily paid on commission, and workers of some non-profits. States have some leeway in designating coverage. For a complete list, see the coverage section of the most recent Comparison of State UI Laws.
Covered employment in the LEHD data includes most employees of state and local governments with the exception of elected officials, members of a legislative body or members of the judiciary, and some emergency employees.
Federal government workers are not covered by state UI and do not appear in UI wage record data.
1.2. LEHD Confidential Microdata: Job, Employer, and Person-Level Files
The LEHD Snapshot is organized into sets of tables providing job-level, employer-level, and person-level data. The scope of each table is either national (all states) or a single state. National tables will have a US in the table name, while state tables will have the two-character state postal code. For purposes of this documentation, the postal code will be replaced with ZZ. Tables that are national in scope will not have the complete panel of states in all quarters because of limited availability of state job data both at the beginning and end of the time series. For more details, see Availability of Data.
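The naming convention above can be illustrated with a small helper that fills in the ZZ placeholder. This is only a sketch: the function name is hypothetical, and the assumption that table names use an upper-case postal code should be checked against the actual files in the FSRDC.

```python
def resolve_table_name(template, state=None):
    """Replace the ZZ placeholder in a table template with a state postal code.

    Hypothetical helper for illustration; not part of the LEHD Snapshot.
    """
    if "_ZZ" not in template:
        # National tables (for example ICF_US) carry no state placeholder.
        return template
    if state is None or len(state) != 2 or not state.isalpha():
        raise ValueError("state tables need a two-letter postal code")
    # Assumption: postal codes appear in upper case in the table name.
    return template.replace("_ZZ", "_" + state.upper(), 1)


print(resolve_table_name("EHF_ZZ", "md"))
print(resolve_table_name("ECF_ZZ_SEIN", "WI"))
print(resolve_table_name("ICF_US"))
```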
1.2.1. Job-Level Files
Job-level data provide earnings for jobs, where a job is a link between a worker and a firm. The employment coverage in the LEHD job-level microdata files described below is UI-covered employment only (see Earnings Coverage for details on covered employment and wages). Researchers interested in federal jobs should request the Office of Personnel Management (OPM) microdata separately as part of their research request.
| Table | Description | Scope | Key |
|---|---|---|---|
| EHF_ALL_AVAILABILITY | Indicates data availability ranges for all states. | National | State |
| EHF_US_INDICATORS | Indicates whether or not a PIK had any earnings reported in any state in the LEHD system. | National | PIK, YEAR |
| EHF_ZZ | Quarterly earnings for each job in a state as reported by the employer to the state’s UI system. | State | PIK, SEIN, SEINUNIT, YEAR |
| JHF_ZZ | Wide version of earnings history that contains imputed establishments (SEINUNITs) for multi-unit firms, and a job-level ID useful for tracking job spells across employer identifier changes. | State | PIK, SEIN, SPELL_U2W |
1.2.2. Employer-Level Files
Employer-level data contain characteristics of firms, as well as summary measures calculated from job data.
| Table | Description | Scope | Key |
|---|---|---|---|
| ECF_ZZ_SEIN | Sourced from the SEINUNIT file, this file rolls up employer characteristics to the SEIN level. | State | SEIN, YEAR, QUARTER |
| ECF_ZZ_SEINUNIT | An establishment-level file with industry, size, and location. | State | SEIN, YEAR, QUARTER, SEINUNIT |
| ECF_ZZ_SEIN_T26 | Contains the national firm age and size data for employers in the ECF. This file requires IRS approval. | State | SEIN, YEAR, QUARTER |
| QWI_ZZ_SEINUNIT | The microdata version of the public-use QWI data, containing establishment-level hires, separations, net job growth, and average earnings. | State | SEIN, YEAR, QUARTER, SEINUNIT |
| SPF_ZZ | Flows of separations and hires between employers (SEINs) used to identify successor-predecessor relationships. | State | SEIN, SEIN_SUCC, QTIME |
1.2.3. Person-Level Files
Person-level tables provide demographic characteristics of individuals and observed residence geography.
| Table | Description | Scope | Key |
|---|---|---|---|
| ICF_US | Demographics, place of birth, and education of workers. | National | PIK |
| ICF_US_IMPLICATES_AGE_SEX_POB | Multiply imputed date of birth, sex, and place of birth variables to complete missing information. | National | PIK |
| ICF_US_IMPLICATES_EDUCATION | Multiply imputed education variables to complete missing information. | National | PIK |
| ICF_US_IMPLICATES_RACE_ETHNICITY | Multiply imputed race and ethnicity variables to complete missing information. | National | PIK |
| ICF_US_RESIDENCE_CPR | Residence geography prior to 2012 for individuals found in the wage data. | National | PIK, ADDRESS_YEAR |
| ICF_US_RESIDENCE_RCF | Residence geography from 2012 forward for individuals found in the wage data. | National | PIK, ADDRESS_YEAR |
1.3. Linking the Job, Employer, and Person Data
To link worker demographics to the jobs data, link the EHF or JHF to the ICF US via the PIK.
To attach the characteristics of the employer (SEIN) to the job, link the EHF or JHF to the ECF SEIN using the SEIN.
To attach the characteristics of the establishment to the job, use the first implicate of the worker-to-establishment imputation (SEINUNIT1) on the JHF to link to the ECF SEINUNIT table.
All implicates (SEINUNIT1-SEINUNIT10) can be used to produce standard errors that account for imputation variability [McKinney et al., 2021].
Because of longitudinal restrictions on establishment imputation, jobs may be active at establishments in quarters in which the establishment does not appear on the ECF. See Adding Establishment Characteristics for more information.
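The links described above can be sketched with toy records in plain Python. Only PIK, SEIN, SEINUNIT1, YEAR, and QUARTER are names taken from this documentation. The other field names, the values, and the one-row-per-quarter job layout are assumptions made for illustration (the actual JHF is a wide file). Matching on YEAR and QUARTER in addition to SEIN is also an assumption, based on the keys listed for the ECF tables.

```python
# Toy job records in a long layout (hypothetical values).
jobs = [
    {"PIK": "P1", "SEIN": "S1", "SEINUNIT1": "00001", "YEAR": 2020, "QUARTER": 1},
    {"PIK": "P2", "SEIN": "S1", "SEINUNIT1": "00002", "YEAR": 2020, "QUARTER": 1},
]
# Person data keyed by PIK (stand-in for ICF_US; field name is made up).
icf_us = {"P1": {"birth_year": 1980}, "P2": {"birth_year": 1975}}
# Employer data keyed by (SEIN, YEAR, QUARTER) (stand-in for ECF_ZZ_SEIN).
ecf_sein = {("S1", 2020, 1): {"industry": "A"}}
# Establishment data keyed by (SEIN, YEAR, QUARTER, SEINUNIT).
# Unit 00002 is deliberately absent to mimic a quarter missing from the ECF.
ecf_seinunit = {("S1", 2020, 1, "00001"): {"unit_county": "X"}}


def link_job(job, persons, employers, establishments):
    """Left-join one job-quarter record to person, employer, and establishment data."""
    when = (job["YEAR"], job["QUARTER"])
    return {
        **job,
        "person": persons.get(job["PIK"]),
        "employer": employers.get((job["SEIN"],) + when),
        # First implicate of the worker-to-establishment imputation.
        "establishment": establishments.get(
            (job["SEIN"],) + when + (job["SEINUNIT1"],)
        ),
    }


linked = [link_job(j, icf_us, ecf_sein, ecf_seinunit) for j in jobs]
# Jobs whose establishment has no ECF record in that quarter keep a None value.
unmatched = [r for r in linked if r["establishment"] is None]
print(len(linked), len(unmatched))
```

A left join is used so that jobs without a matching establishment record are kept and can be counted, rather than silently dropped.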
1.4. Approvals Needed for FSRDC Research
In addition to Census data, the administrative records that combine to form the LEHD infrastructure derive from several different federal and state agencies. Approval of a proposed FSRDC project by other agencies may be required, as follows:
Requests for state-level tables will follow the procedure specified by the Memorandum of Understanding (MOU) with that state. Most commonly, the state agency is given the opportunity to approve proposed projects.
Person-level demographic characteristics (ICF) require the approval of the Social Security Administration (SSA).
Employer-level Title 26 (T26) data requires the additional approval of the Internal Revenue Service (IRS). This includes the Federal EIN, as well as age and size of the national firm.
The codebook for each table will indicate which approvals are required.
The job-level tables largely consist of state-provided data and so go through the LEHD state data owner approval process. The indicator and availability files do not require state approval. There is no federal tax information or Social Security administrative data in any of the job-level tables, so no additional IRS or SSA approvals are required.
1.5. Snapshot Packages in the FSRDC
LEHD Snapshot tables are delivered in packages according to the level of data in the tables. Tables that require additional approvals are provided in separate packages.
| Package | Tables |
|---|---|
| Jobs Data | EHF_ZZ, JHF_ZZ, EHF_US_INDICATORS, EHF_ALL_AVAILABILITY |
| Employer Data (non-T26) | ECF_ZZ_SEIN, ECF_ZZ_SEINUNIT, SPF_ZZ, QWI_ZZ_SEINUNIT |
| Employer Data (T26) | ECF_ZZ_SEIN_T26 |
| Person Demographics (Workers Only) | ICF_US, ICF_US_IMPLICATES_AGE_SEX_POB, ICF_US_IMPLICATES_RACE_ETHNICITY, ICF_US_IMPLICATES_EDUCATION |
| Person Residence (Workers Only) | ICF_US_RESIDENCE_CPR, ICF_US_RESIDENCE_RCF |
1.6. Quarter Indexing - qtime
On several LEHD tables, the year and quarter are replaced by or supplemented with a quarter index (qtime) that makes it easier to count quarters in longitudinal references. The index is initialized so that 1985Q1 equals 1. The following SAS formulae can be used to switch back and forth:
qtime = (year-1985) * 4 + quarter
year = int((qtime-1)/4) + 1985
quarter = mod(qtime-1, 4) + 1
For a table of all possible qtime values with year/quarter references, see the codebook.
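For researchers working outside SAS, the same conversion can be written in Python. This is a direct translation of the formulae above, with hypothetical function names. Floor division stands in for the SAS int() function, which is equivalent here because qtime is at least 1.

```python
def to_qtime(year, quarter):
    """Quarter index with 1985Q1 set to 1."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    return (year - 1985) * 4 + quarter


def from_qtime(qtime):
    """Inverse of to_qtime: return (year, quarter) for a quarter index."""
    year = (qtime - 1) // 4 + 1985
    quarter = (qtime - 1) % 4 + 1
    return year, quarter


print(to_qtime(1985, 1))   # 1
print(to_qtime(2022, 4))   # 152
print(from_qtime(152))     # (2022, 4)
```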
1.7. Availability of Data
The longitudinal availability of data in the LEHD Snapshot varies by type of data and by data source, as described in this section.
1.7.1. Job and Employer Data (State Provided)
The following table indicates the availability of LEHD data by state for the EHF, ECF, and QWI tables. This includes all states that are or have been members of the LED partnership. In general, LEHD Snapshot release S2022_R2023Q4 contains all available data through 2022.
The EHF range represents all of the quarters of UI wage data that have been received.
The QWI range may begin later than the EHF range if certain data quarters do not meet quality standards for publication in LED data products.
Earnings quarters on the JHF are restricted to the QWI range.
The ECF range may start before the EHF range if employer-level data submissions (QCEW) are available prior to the beginning of the UI wage data series.
End dates will usually be the same for all tables within a state. States that are not active partners may not be available for the latest quarters.
| State | ECF Start | EHF Start | QWI/JHF Start | ECF End | EHF End | QWI/JHF End |
|---|---|---|---|---|---|---|
| Alabama | 2001Q1 | 2001Q1 | 2001Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Alaska | 1990Q1 | 1990Q1 | 2000Q1 | 2016Q2 | 2016Q2 | 2016Q2 |
| Arizona | 1992Q1 | 1992Q1 | 2004Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Arkansas | 2002Q3 | 2002Q3 | 2002Q3 | 2018Q2 | 2018Q2 | 2018Q2 |
| California | 1991Q1 | 1991Q3 | 1991Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| Colorado | 1990Q1 | 1990Q1 | 1993Q2 | 2023Q1 | 2023Q1 | 2023Q1 |
| Connecticut | 1996Q1 | 1996Q1 | 1996Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Delaware | 1997Q1 | 1998Q3 | 1998Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| District of Columbia | 2000Q4 | 2002Q2 | 2005Q2 | 2023Q1 | 2023Q1 | 2023Q1 |
| Florida | 1989Q1 | 1992Q4 | 1997Q4 | 2023Q1 | 2023Q1 | 2023Q1 |
| Georgia | 1994Q1 | 1994Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Hawaii | 1995Q4 | 1995Q4 | 1995Q4 | 2023Q1 | 2023Q1 | 2023Q1 |
| Idaho | 1990Q1 | 1990Q1 | 1991Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Illinois | 1990Q1 | 1990Q1 | 1993Q2 | 2023Q1 | 2023Q1 | 2023Q1 |
| Indiana | 1990Q1 | 1990Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Iowa | 1990Q1 | 1998Q4 | 1998Q4 | 2023Q1 | 2023Q1 | 2023Q1 |
| Kansas | 1990Q1 | 1990Q1 | 1993Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Kentucky | 1996Q4 | 1996Q4 | 2001Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Louisiana | 1990Q1 | 1990Q1 | 1995Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Maine | 1996Q1 | 1996Q1 | 1996Q2 | 2023Q1 | 2023Q1 | 2023Q1 |
| Maryland | 1985Q4 | 1985Q4 | 1990Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Massachusetts | 2002Q1 | 2002Q1 | 2010Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Michigan | 1998Q1 | 1998Q1 | 2000Q3 | 2021Q4 | 2021Q4 | 2021Q4 |
| Minnesota | 1994Q3 | 1994Q3 | 1994Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| Mississippi | 2003Q3 | 2003Q3 | 2003Q3 | 2018Q2 | 2018Q2 | 2018Q2 |
| Missouri | 1990Q1 | 1990Q1 | 1995Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Montana | 1993Q1 | 1993Q1 | 1993Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Nebraska | 1999Q1 | 1999Q1 | 1999Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Nevada | 1998Q1 | 1998Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| New Hampshire | 2003Q1 | 2003Q1 | 2003Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| New Jersey | 1995Q1 | 1996Q1 | 1996Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| New Mexico | 1990Q1 | 1995Q3 | 1995Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| New York | 1990Q1 | 1995Q1 | 2000Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| North Carolina | 1990Q1 | 1991Q1 | 1992Q4 | 2022Q3 | 2022Q3 | 2022Q3 |
| North Dakota | 1998Q1 | 1998Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Ohio | 1994Q1 | 1994Q1 | 2000Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Oklahoma | 1999Q1 | 2000Q1 | 2000Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Oregon | 1990Q1 | 1991Q1 | 1991Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Pennsylvania | 1991Q1 | 1991Q1 | 1997Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Puerto Rico | 2001Q1 | 2001Q1 | 2010Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Rhode Island | 1990Q1 | 1995Q1 | 1995Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| South Carolina | 1998Q1 | 1998Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| South Dakota | 1994Q1 | 1994Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Tennessee | 1998Q1 | 1998Q1 | 1998Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Texas | 1990Q1 | 1995Q1 | 1995Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Utah | 1990Q1 | 1999Q1 | 1999Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| Vermont | 2000Q1 | 2000Q1 | 2000Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Virginia | 1995Q3 | 1998Q1 | 1998Q3 | 2023Q1 | 2023Q1 | 2023Q1 |
| Washington | 1990Q1 | 1990Q1 | 1990Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| West Virginia | 1990Q1 | 1997Q1 | 1997Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Wisconsin | 1990Q1 | 1990Q1 | 1990Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
| Wyoming | 1992Q1 | 1992Q1 | 2001Q1 | 2023Q1 | 2023Q1 | 2023Q1 |
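When planning a multi-state panel, it can help to check whether a given quarter falls inside a state's published range. The following is a minimal sketch using three QWI/JHF ranges copied from the table above. The dictionary and function names are hypothetical, and in practice the ranges would be read from the availability table in the Snapshot rather than typed in.

```python
import re

# (QWI/JHF Start, QWI/JHF End) for a few states, copied from the table above.
QWI_JHF_RANGE = {
    "Alaska": ("2000Q1", "2016Q2"),
    "Maryland": ("1990Q1", "2023Q1"),
    "Michigan": ("2000Q3", "2021Q4"),
}


def parse_quarter(label):
    """Convert a label such as '2016Q2' to the qtime index (1985Q1 = 1)."""
    match = re.fullmatch(r"(\d{4})Q([1-4])", label)
    if match is None:
        raise ValueError("expected a label like 2016Q2, got %r" % label)
    year, quarter = int(match.group(1)), int(match.group(2))
    return (year - 1985) * 4 + quarter


def in_qwi_jhf_range(state, label):
    """True if the quarter lies within the state's QWI/JHF range (inclusive)."""
    start, end = QWI_JHF_RANGE[state]
    return parse_quarter(start) <= parse_quarter(label) <= parse_quarter(end)


print(in_qwi_jhf_range("Alaska", "2018Q1"))    # False
print(in_qwi_jhf_range("Maryland", "1990Q1"))  # True
print(in_qwi_jhf_range("Michigan", "2000Q2"))  # False
```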
1.7.2. Residence Data
Annual residence data is provided for workers who appear in the Job data. Residences are sourced from the Composite Person Record (CPR) tables or the Residential Candidates File (RCF). The year ranges for which each data source is available follow.
| Source | Start Year | End Year |
|---|---|---|
| CPR | 1999 | 2011 |
| RCF | 2012 | 2021 |
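Because the residence source changes in 2012, code that builds an annual residence history needs to pick the right table for each address year. Below is a minimal sketch with a hypothetical function name. The table names are taken from the package listing in this documentation, and the year ranges are those shown above.

```python
# (table name, first year, last year), from the ranges listed above.
RESIDENCE_SOURCES = [
    ("ICF_US_RESIDENCE_CPR", 1999, 2011),
    ("ICF_US_RESIDENCE_RCF", 2012, 2021),
]


def residence_table(address_year):
    """Return the residence table covering address_year, or None if no source does."""
    for name, first_year, last_year in RESIDENCE_SOURCES:
        if first_year <= address_year <= last_year:
            return name
    return None


print(residence_table(2005))  # ICF_US_RESIDENCE_CPR
print(residence_table(2012))  # ICF_US_RESIDENCE_RCF
print(residence_table(1998))  # None
```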
1.8. Previous Versions of Snapshot Documentation
Links to previous versions of documentation can be accessed in the table below. Please note that documentation for Snapshot versions prior to S2021 is in working paper format.
| Version | Link |
|---|---|
| S2022_R2023Q4 | |
| S2021_R2022Q4 | |
| S2014 | Vilhuber [2018] |
| S2011 | Vilhuber and McKinney [2014] |
| S2008 | McKinney and Vilhuber [2011] |
| S2004 | McKinney and Vilhuber [2011] |
1.9. Citing the Data, Sponsors, and Documentation
1.9.1. Acknowledging the Sponsors
The LEHD Snapshot draws on a data infrastructure that received substantial funding from a number of funding agencies and foundations. We strongly encourage researchers to acknowledge that funding in their paper’s “Acknowledgements” or data appendix. The following statement may be used:
This research uses data from the Census Bureau’s Longitudinal Employer-Household Dynamics program, which was partially supported by the following National Science Foundation grants: SES-9978093, SES-0339191, and ITR-0427889; National Institute on Aging grant AG018854; and grants from the Alfred P. Sloan Foundation.
1.9.2. Citing the Data
The data may be cited as follows:
Suggested citation
U.S. Census Bureau. LEHD Snapshot Release S2022. [Computer file], U.S. Census Bureau, Center for Economic Studies, Research Data Centers [distributor], Washington, DC, 2023.
1.9.3. Data Access Statement
Many journals have adopted stringent data availability requirements. Researchers will need to work with the Census Bureau in order to ensure availability of their programs and research extracts. A statement similar to the following has been successfully used for accepted papers:
The data used for this paper were prepared in the U.S. Census Bureau’s secure computing facilities under an authorized project using the Federal Statistical Research Data Center network. The exact analysis files have been fully archived so that the programming sequence submitted in compliance with the [JOURNAL]’s editorial policy can be run in its entirety, except for the component that extracts the analysis sample from the underlying confidential databases. I grant any researchers with appropriate Census-approved project permission to use my exact research files provided that those files were among the ones that they requested when the approval was obtained (a Census Bureau requirement). In compliance with the [JOURNAL]’s editorial policy, I am submitting the list of those files, and the last known location of the archive on the Census Bureau’s RDC network as of [date]. I authorize the editorial staff of the [JOURNAL] to release this list and my statement of cooperation to any researcher who requests it, as well as to the U.S. Census Bureau or any agency cooperating with the Census Bureau in supervising research that uses the restricted-access data that I have used.
1.9.4. Citing this Documentation
The latest version of this documentation is published in HTML format, and is publicly available at https://lehd.ces.census.gov/data/lehd-snapshot-doc/latest/. However, we suggest that you cite this documentation from its latest working paper publication as follows:
Suggested citation
Matthew Graham, Erika McEntarfer, Kevin McKinney, Stephen Tibbets, and Lee Tucker. LEHD Snapshot Documentation. Working Papers 22-51, Center for Economic Studies, U.S. Census Bureau, November 2022. URL: https://ideas.repec.org/p/cen/wpaper/22-51.html.
[1] This research describes data from the Census Bureau’s Longitudinal Employer-Household Dynamics Program, the original creation of which was partially supported by the following National Science Foundation (NSF) grants: SES-9978093, SES-0339191, and ITR-0427889; National Institute on Aging grant AG018854; and grants from the Alfred P. Sloan Foundation. Additionally, the current authors acknowledge the extensive contributions made over the years by many individuals to the cumulative knowledge reflected in this document, too many to adequately enumerate here. This document has been reviewed to ensure that no confidential information is disclosed.