= LEHD Public Use  Schema V4.6.0 - File and Directory Naming Convention
<ces.qwi.feedback@census.gov>
21 January 2020
:ext-relative: {outfilesuffix}

( link:lehd_csv_naming.pdf[Printable version] )


Purpose
-------
The public-use data from the Longitudinal Employer-Household Dynamics Program, including the Quarterly Workforce Indicators (QWI)
and Job-to-Job Flows (J2J), are available for download with the following data schema.
These data are available as Comma-Separated Value (CSV) files through the LEHD website’s Data page at
https://lehd.ces.census.gov/data/ and through LED Extraction Tool at https://ledextract.ces.census.gov/.

This document describes the file and directory naming schema for LEHD-provided CSV files.

Schema for Data File Contents
-----------------------------

The contents (schema) for LEHD data files are described in  link:lehd_public_use_schema{ext-relative}[].
The contents (schema) for shapefiles are described in link:lehd_shapefiles{ext-relative}[].

Extends
-------
This version modifies a portion of the structure of the metadata. Many files compliant with LEHD or QWI Schema v4.2 or lower will also be compliant with this schema, but compatibility is not guaranteed.

Supersedes
----------
This version supersedes V4.4.0, for files released as of R2020Q1.


Basic Filename Schema
---------------------

All files are preceded by a file type definition, followed by additional information on the aggregation level of the file, or
some other identifier.

....................................
 [TYPE]_[DETAILS].[EXT]
....................................

( link:naming_convention.csv[] )


[width="90%",format="csv",delim=",",cols="^1,<3,<5",options="header"]
|===================================================
include::tmp_naming_convention.csv[]
|===================================================

=== QWIPU from the LED Extraction Tool
Files downloaded through the  LED Extraction Tool at https://ledextract.ces.census.gov/ follow the following naming convention:
....................................
 [type]_[id].[EXT]
....................................
where +[id]+ is the Request ID (a unique string of characters) generated every time Submit Request'' is clicked. The ID references each query submission made to the database.


=== [[versionqwi]]Metadata for QWI and J2J data files (version.txt)
Metadata accompanies the data files, identifying provenance, geographic and temporal coverage. These files follow the following naming convention:
....................................
version_[type].txt
....................................
where each name component is described in more detail <<components,below>>.


=== [[version_j2jod]]Metadata for J2JOD files
Because the origin-destination (J2JOD) data link two regions, we provide an auxiliary file with the time range for which cells containing data for each geographic pairing may appear in a data release. The reference region will always be either the origin or the destination.
These files follow the following naming convention:
....................................
j2jod_[geography]_avail.csv
....................................
where each name component is described in more detail <<components,below>>.


== [[paths]]Directory Paths
Downloadable data files are organized on the download server (at the time of this writing: https://lehd.ces.census.gov/data/) in directories organized as follows:

....................................
 SERVER/data/[PRODUCT]/[RELEASE]/[GEOHI][ /[TYPE] ]/{files}
....................................

where [<<products,PRODUCT>>], [<<release,RELEASE>>], [<<geohi,GEOHI>>], and [<<types,TYPE>>] are each described below. [<<types,TYPE>>] is optional.


== [[components]]Description of Naming Components

=== [[release]]Release

A *release* is defined by the calendar year quarter when data production occurs. It is thus generically constructed as

....................................
 R[YEAR]Q[QUARTER]
....................................

where +[YEAR]+ is the 4-digit year and +[QUARTER]+ the single-digit calendar year quarter (1-4).

=== [[types]]Types and [[products]]Products

( link:naming_type.csv[] )

[width="90%",format="csv",delim=";",cols="^1,<1,<3,<5,<3",options="header"]
|===================================================
include::naming_type.csv[]
|===================================================

=== [[geohi]]geohi
( link:naming_geohi.csv[] )

The geohi component in *filenames* is based on one of two possible code sets:

- the lower-case alphabetic FIPS or postal state code based on https://catalog.data.gov/dataset/fips-state-codes[FIPS PUB 5-2] (see also link:label_stusps.csv[] for upper-case variants).
- the numeric CBSA code corresponding to the metropolitan areas (see link:label_geography_metro.csv[])
- It is expanded to encompass additional codes denoting national coverage, or a collection of states.

For *directories*, files at the CBSA level are collected under a single *metro* directory.

[width="60%",format="csv",cols="^1,<4",options="header"]
|===================================================
type,Description
all,"Collection of all available states"
us,"National data (50 states + DC)"
metro,Indicates collection of CBSA-level files (*directory names only*)
+[st]+,Any legal 2-character lower-case state postal code
+[NNNNN]+,Any valid CBSA-derived code listed in link:label_geography_metro.csv[]
|===================================================

=== demo
( link:naming_demo.csv[] )

[width="60%",format="csv",cols="^1,<4",options="header"]
|===================================================
include::naming_demo.csv[]
|===================================================

<<<


=== fas
( link:naming_fas.csv[] )

[width="60%",format="csv",cols="^1,<4",options="header"]
|===================================================
include::naming_fas.csv[]
|===================================================

<<<


=== geocat
( link:naming_geocat.csv[] )

[width="60%",format="csv",cols="^1,<4",options="header"]
|===================================================
include::naming_geocat.csv[]
|===================================================

<<<


=== indcat
( link:naming_indcat.csv[] )

[width="60%",format="csv",cols="^1,<4",options="header"]
|===================================================
include::naming_indcat.csv[]
|===================================================

<<<


=== owncat
( link:naming_owncat.csv[] )

[width="60%",format="csv",cols="^1,<4",options="header"]
|===================================================
include::naming_owncat.csv[]
|===================================================

<<<


=== sa
( link:naming_sa.csv[] )

[width="60%",format="csv",cols="^1,<4",options="header"]
|===================================================
include::naming_sa.csv[]
|===================================================

<<<


=== ext
( link:naming_ext.csv[] )

[width="60%",format="csv",cols="^1,<4",options="header"]
|===================================================
include::naming_ext.csv[]
|===================================================

<<<


== [[changes]] Changes
For a description of how schema files are versioned, see link:../VERSIONING{ext-relative}[main directory].

=== This version (revisions)
- 2020-01-21: Initial release

=== Version 4.6.0 from 4.4.0
- Underlying geovintage updated to reflect 2019 census geography
- Updated shapefiles based on TIGER 2019
- Added documentation of new J2J Earnings indicators (coming soon in next release of J2J data)


[IMPORTANT]
.Important
==============================================
Some of the data products noted above do not exist yet.
==============================================

<<<
*******************
This revision: Tue Jan 21 20:03:22 UTC 2020
*******************