Introduction

The Longitudinal Employer-Household Dynamics (LEHD) program is part of the Center for Economic Studies at the U.S. Census Bureau. Under the Local Employment Dynamics (LED) Partnership, the LEHD program produces cost-effective, public-use information by combining federal, state, and Census Bureau data on employers and employees. State and local authorities increasingly need detailed local information about their economies to make informed decisions. The LED Partnership works to fill critical data gaps and provide the indicators these authorities need.

Under the LED Partnership, states agree to share Unemployment Insurance earnings data and Quarterly Census of Employment and Wages (QCEW) data with the Census Bureau. The LEHD program combines these data with additional administrative data and with data from censuses and surveys. From the combined data, the program creates statistics on employment, earnings, and job flows at detailed levels of geography and industry and for different demographic groups. In addition, the LEHD program uses these data to create partially synthetic data on workers’ residential patterns.

LEHD makes available several data products that may be used to research and characterize workforce dynamics for specific groups. These data products include online applications, public-use data, and restricted-use microdata. The purpose of this documentation is to provide open-source code examples for working with the public-use data, which may be helpful for users whose analysis needs go beyond what LEHD’s online applications support. For more information on the LEHD data applications, see the LEHD website.

Note

These open-source code samples are intended for users who have a working knowledge of either Python or R, including basic proficiency in installing packages, managing environments, and running scripts in their chosen language. Some examples may also require familiarity with geographic data and GIS concepts, such as shapefiles, projections, and spatial joins.

These samples are provided for educational and illustrative purposes only and are not production-ready tools. Users should carefully review and adapt the code to fit their specific use cases and environments.

Data Products

This documentation is organized by data product. For each data product, examples of both simple tasks (reading in the data) and more complex tasks (such as geographic merges) are provided. Each data product is listed below with a brief description and additional links:

  • LEHD Origin-Destination Employment Statistics (LODES): LODES provides detailed spatial distributions of workers and jobs across the United States as well as data on age, earnings, industry distributions, race, ethnicity, and sex.

  • Post-Secondary Employment Outcomes (PSEO): PSEO is a set of statistics on the earnings and employment outcomes of graduates of select post-secondary institutions in the United States, constructed using LEHD data. Earnings Outcomes tabulations report earnings by institution, degree field, degree level, and graduation cohort for 1, 5, and 10 years after graduation. Employment Flows tabulations provide the destination industry and geography of employment for graduates of an institution by degree level, degree field, and graduation cohort for 1, 5, and 10 years after graduation.

  • QWI [Code Samples Forthcoming]: The Quarterly Workforce Indicators (QWI) are a set of economic indicators including employment, job creation, earnings, and other measures of employment flows. The QWI are reported by detailed firm characteristics (geography, industry, age, size) and worker demographic information (sex, age, education, race, ethnicity), and are tabulated at the national, state, metropolitan/micropolitan area, county, and Workforce Investment Board (WIB) area levels.

  • J2J [Code Samples Forthcoming]: Job-to-Job Flows (J2J) is a set of statistics on job mobility in the United States. J2J includes statistics on (1) the job-to-job transition rate, (2) hires and separations to and from employment, (3) earnings changes due to job change, and (4) characteristics of origin and destination jobs for job-to-job transitions.

  • VEO [Code Samples Forthcoming]: Veteran Employment Outcomes (VEO) are experimental statistics on veterans’ labor market outcomes 1, 5, and 10 years after discharge, by military occupation, rank, demographics (age, sex, race, ethnicity, education), and industry and geography of employment. These statistics are generated by linking veteran records provided by the Department of Defense to national administrative data on jobs at the U.S. Census Bureau.
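
As a rough illustration of the kind of task covered for LODES (reading in the data, then a simple geographic roll-up), the sketch below uses only Python's standard library to aggregate block-level job counts to counties. It assumes the column names `w_geocode` (a 15-digit census block code whose first five digits are the state and county FIPS codes) and `C000` (total jobs), as used in LODES Workplace Area Characteristics files. The rows are made-up values for illustration only, and the function name `jobs_by_county` is hypothetical. Check column names against the LODES technical documentation for the version you are using.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the shape of a LODES WAC file (illustrative values only).
SAMPLE_WAC = """w_geocode,C000
010010201001000,12
010010201001001,8
010030101001000,5
"""


def jobs_by_county(csv_file):
    """Sum total jobs (C000) by county, keyed on the first 5 digits of the block code."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Geocodes are kept as text so that leading zeros are preserved.
        county_fips = row["w_geocode"][:5]
        totals[county_fips] += int(row["C000"])
    return dict(totals)


county_totals = jobs_by_county(io.StringIO(SAMPLE_WAC))
print(county_totals)
```

For a downloaded, gzip-compressed file, the same function could be given a text-mode handle, for example `gzip.open(path, "rt", newline="")`, where `path` is wherever you saved the file.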

Getting Started

To get started with the data (or the code samples in this documentation), you need either Python or R. The code samples were validated on recent versions of Python and R across platforms, on a computer with 16 GB of RAM. The package dependencies for the code samples can be found here for Python and here for R. For detailed information on the packages used and install instructions, see the Software/Packages and Install Instructions section in the Appendix.
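
Before running the samples, it can help to confirm that the required packages are importable in your Python environment. The following is a minimal sketch, not part of the official samples: the package names in `REQUIRED` are placeholders (an assumption), and the helper `check_packages` is hypothetical. Replace the list with the packages named in the Python dependency file.

```python
import importlib.util
import sys


def check_packages(names):
    """Return {name: bool} indicating whether each top-level package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Placeholder list (assumption); use the packages from the dependency file instead.
REQUIRED = ["pandas", "geopandas"]

print("Python version:", sys.version.split()[0])
for name, found in check_packages(REQUIRED).items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

This only checks that a package can be located, not that its version matches the one used to validate the samples.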