4.2. Individual Characteristics Files - Residential Geography
4.2.1. Overview
This file contains annual records with residential geography for individuals found in the wage data (workers). All records from the source tables are kept for workers, though workers may not have residential geography or jobs in every year. Information is available beginning in 1999. The target reference date for the residence is considered to be April 1 of the year.
CPR-Sourced Residence (ICF_US_RESIDENCE_CPR)
For 1999-2010, residential geography is sourced from the Composite Person Record (CPR) file. Though the formal CPR was not available in 2011, locations for that year are included in this table using an alternate data source.
Scope: National
RCF-Sourced Residence (ICF_US_RESIDENCE_RCF)
For 2012 onward, residential geography is sourced from the Residence Candidates File (RCF).
Scope: National
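As an illustration of the year coverage described above, the following minimal sketch picks the table that holds a given data year. The function name is hypothetical and not part of the LEHD infrastructure; only the table names and year ranges come from this section.

```python
def residence_table_for_year(year: int) -> str:
    """Return the name of the residence table that covers a data year.

    Hypothetical helper: the year ranges follow the overview text
    (1999-2011 in the CPR table, 2012 onward in the RCF table).
    """
    if year < 1999:
        raise ValueError("residential geography is available beginning in 1999")
    # Data year 2011 is stored in the CPR table even though it comes
    # from an alternate source.
    if year <= 2011:
        return "ICF_US_RESIDENCE_CPR"
    return "ICF_US_RESIDENCE_RCF"
```

A project spanning 2010-2013 would therefore need to read from both tables.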
4.2.2. User Guidance
Composite Person Record (CPR)
The CPR was the original source of residence data used in LEHD. The CPR file series, which begins in 1999, contains fields that provided a linkage between a unique person record and a place of residence for each year. The CPR was produced until the file was discontinued in 2011 (after the production of data year 2010). In the following year, the MAF-ARF (Master Address File-Auxiliary Reference File) was used as a replacement for the CPR for data year 2011. The MAF-ARF was found to differ from the CPR in a number of ways, including a difference in coverage and a lack of deduplication among PIKs. LEHD was able to produce a deduplicated version of the MAF-ARF by defining some very basic business rules. Some additional history of the CPR and MAF-ARF can be found in Graham et al. [2017].
The county codes on the CPR (five-digit state+county FIPS code) are contemporaneous to the data year. The set of valid county codes and the boundaries of the counties can change over time. For more information on major county changes over time, see the Census Bureau’s County Changes page.
Residence Candidates File (RCF)
The RCF combines a set of federal administrative source files, each containing residence information for a person at a point in time, into a single file with preference weights for each person/location by reference period and with no remaining source information. The complete methodology is described in Graham et al. [2017]. This extract contains only the most preferred location for each PIK from the RCF for each year, so that there is only one record per PIK-year.
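The reduction to one record per PIK-year can be pictured with the sketch below. It is not the production method, which is described in Graham et al. [2017]. The record layout, the `weight` field, the assumption that a higher weight means more preferred, and the first-seen tie-break are all hypothetical choices made for illustration.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Candidate(NamedTuple):
    """Hypothetical candidate residence record (not the real RCF layout)."""
    pik: str
    year: int
    mafid: str
    weight: float  # assumed: higher weight = more preferred location


def most_preferred(candidates: Iterable[Candidate]) -> Dict[Tuple[str, int], Candidate]:
    """Keep the highest-weight candidate for each (PIK, year) pair.

    Ties keep the first candidate encountered; this is an assumption.
    """
    best: Dict[Tuple[str, int], Candidate] = {}
    for cand in candidates:
        key = (cand.pik, cand.year)
        if key not in best or cand.weight > best[key].weight:
            best[key] = cand
    return best
```

The result holds exactly one entry per PIK-year, which matches the structure of this extract.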
Geography on the RCF is provided via the MAFID, mapped to current geography via the Block Map File (BMF). The BMF is a derivative of the Geographic Reference File (GRF-C) with some added value used in LEHD processing. This geography will be consistent with locations reported on the ECF. Not all MAFIDs provide tract-level precision, and some cannot be mapped into current geography. The flag_rcf variable records the result of the geography assignment.
If additional geographical information is needed (beyond the state, county, and census tract provided in this extract), researchers can look up a record's MAFID in the MAF/TIGER Extract (MAFX). The MAFX is not part of the LEHD Snapshot and must be requested separately for projects. The most recent MAFX will contain the vast majority of MAFIDs found in this extract; however, in some cases researchers may need to find individual MAFIDs in older MAFX files.
Connecticut County Changes - 2023
In Census geography, the historic counties in Connecticut were replaced with alternative county-equivalent regions called Councils of Governments (COGs), and this was implemented in the LEHD infrastructure with the release of 2023 data. This change was made because the historic counties no longer have a functional purpose as an administrative level of government, and COGs are now used as regional planning areas. The 8 historic counties were replaced with 9 new county equivalents, each assembled from towns. Researchers wishing to assign establishments to the historic county can use MAFX and the Census Geographic Reference File (GRF-C) or TIGER data to map them into older geography. For more information on this change, see Census geography technical documentation on county changes.
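When working across data vintages, it can help to check which Connecticut geography a five-digit state+county code belongs to. The sketch below is illustrative only: the function name is hypothetical, and the code lists are transcribed here from the Census Bureau's published county-equivalent codes rather than from this documentation, so they should be verified against the Census county changes documentation before use. It does not perform the town-level remapping, which requires MAFX, GRF-C, or TIGER data.

```python
# Assumed code lists (verify against Census documentation before use).
HISTORIC_CT_COUNTIES = {
    "09001", "09003", "09005", "09007",
    "09009", "09011", "09013", "09015",
}
CT_COUNTY_EQUIVALENTS = {
    "09110", "09120", "09130", "09140", "09150",
    "09160", "09170", "09180", "09190",
}


def ct_county_vintage(fips5: str) -> str:
    """Classify a five-digit state+county FIPS code for Connecticut."""
    if not fips5.startswith("09"):
        return "not_connecticut"
    if fips5 in HISTORIC_CT_COUNTIES:
        return "historic_county"
    if fips5 in CT_COUNTY_EQUIVALENTS:
        return "county_equivalent"
    return "unknown"
```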
4.2.3. Codebook: The ICF_US_RESIDENCE_CPR File
Table Metadata for Residence Geography, 1999-2011 (ICF_US_RESIDENCE_CPR)
State Approval Required | IRS Approval Required | SSA Approval Required | |
Access Requirements |
- Description
Residence geography prior to 2012 for individuals found in the wage data.
- Scope
- Key
- Sort Order
- File Format
SAS Data Table
Variable Information
Variable Name | Type | Length | Description |
PIK | char | 9 | PIK - Protected Identification Key |
ADDRESS_YEAR | num | 4 | Year address is found on source data |
GEOCODEFULL | char | 15 | FIPS State(2) ||FIPS County(3) ||Tract(6)||Block(4) |
LATITUDE_LIVE | num | 8 | Latitude of residence, 6 implied decimal places |
LONGITUDE_LIVE | num | 8 | Longitude of residence, 6 implied decimal places |
num | 3 | Flag quality of latitude/longitude of residence (See details below) |
- Description
Flag quality of latitude/longitude of residence
- Codebook
Value | Label |
-1 | Lat/Long quality not available (2011 data) |
1 | Location interpolated from house number |
2 | House number outside of road segment address range; snapped to end of range |
3 | Complex house number: location interpolated from house number |
4 | Missing house number: location interpolated from other number in address |
5 | Used midpoint of road segment |
6 | Location based upon match to ZCTA |
7 | Location based upon match to county |
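The layout of GEOCODEFULL and the implied decimal places on the coordinates can be unpacked as in the sketch below. The helper names are hypothetical. The sketch also assumes that "6 implied decimal places" means the stored number is the coordinate in degrees multiplied by 1,000,000, and that GEOCODEFULL is always 15 characters wide; both should be checked against the data.

```python
from typing import NamedTuple


class Geocode(NamedTuple):
    state: str   # FIPS state, 2 characters
    county: str  # FIPS county, 3 characters
    tract: str   # census tract, 6 characters
    block: str   # census block, 4 characters


def split_geocodefull(geocodefull: str) -> Geocode:
    """Split the 15-character GEOCODEFULL into its four components."""
    if len(geocodefull) != 15:
        raise ValueError("GEOCODEFULL is expected to be 15 characters long")
    return Geocode(
        state=geocodefull[0:2],
        county=geocodefull[2:5],
        tract=geocodefull[5:11],
        block=geocodefull[11:15],
    )


def scaled_to_degrees(value: float) -> float:
    """Convert a coordinate stored with 6 implied decimals to degrees.

    Assumes the stored value equals degrees * 1,000,000.
    """
    return value / 1_000_000
```

The five-digit state+county code discussed in the user guidance is then `state + county` from the returned tuple.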
4.2.4. Codebook: The ICF_US_RESIDENCE_RCF File
Table Metadata for Residence Geography, 2012 forward (ICF_US_RESIDENCE_RCF)
State Approval Required | IRS Approval Required | SSA Approval Required | |
Access Requirements |
- Description
Residence geography from 2012 forward for individuals found in the wage data.
- Scope
- Key
- Sort Order
- File Format
SAS Data Table
Variable Information
Variable Name | Type | Length | Description |
PIK | char | 9 | PIK - Protected Identification Key |
ADDRESS_YEAR | num | 4 | Year address is found on source data |
num | 8 | ||
TRACT | num | 8 | |
MAFID | char | 9 | MAFID from best source |
LATITUDE_LIVE | num | 8 | |
LONGITUDE_LIVE | num | 8 | |
FLAG_RCF | num | 8 | Flag for geographic precision (See details below) |
Details for variable FLAG_RCF on ICF_US_RESIDENCE_RCF
- Description
Flag for geographic precision
- Codebook
Value | Label |
11 | State/county and tract from BMF in current geography |
12 | State/county from current MAF, no precise GEOID |
13 | MAF record in earlier tab geography, county from BMF valid in current geography |
14 | MAF record in earlier tab geography, county from BMF not valid in current geography |
15 | MAF record in earlier tab geography, no GEOID, MAF county valid in current geography |
16 | MAF record in earlier tab geography, no GEOID, MAF county not valid in current geography |
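One way to use FLAG_RCF when filtering records is sketched below. The grouping of codes is an interpretation of the labels above, not an official classification: only value 11 explicitly states that a tract in current geography is available, and values 14 and 16 are the ones whose labels say the county is not valid in current geography. The function names are hypothetical.

```python
# Codes whose labels describe a county usable in current geography
# (interpretation of the codebook labels; an assumption).
COUNTY_VALID_FLAGS = {11, 12, 13, 15}
KNOWN_FLAGS = {11, 12, 13, 14, 15, 16}


def has_current_tract(flag_rcf: int) -> bool:
    """True only for the code that explicitly reports a current tract."""
    return flag_rcf == 11


def county_valid_in_current_geography(flag_rcf: int) -> bool:
    """True when the label indicates a county valid in current geography."""
    if flag_rcf not in KNOWN_FLAGS:
        raise ValueError(f"unexpected FLAG_RCF value: {flag_rcf}")
    return flag_rcf in COUNTY_VALID_FLAGS
```

A tract-level analysis might keep only records where `has_current_tract` is true, while a county-level analysis could use the broader check.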