CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'record'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

Internal Revenue Service - 24

Social Security Administration - 22

Protected Identification Key - 22

American Community Survey - 21

Center for Economic Studies - 18

Service Annual Survey - 17

Social Security Number - 15

Person Validation System - 14

National Science Foundation - 13

2010 Census - 13

Census Bureau Disclosure Review Board - 12

North American Industry Classification System - 12

Personally Identifiable Information - 11

Research Data Center - 11

Person Identification Validation System - 10

Social Security - 10

Longitudinal Business Database - 10

Indian Health Service - 9

Master Address File - 9

Standard Industrial Classification - 9

Administrative Records - 9

Center for Administrative Records Research and Applications - 9

Longitudinal Employer Household Dynamics - 8

Current Population Survey - 8

County Business Patterns - 8

Employer Identification Numbers - 8

Business Register - 8

Decennial Census - 7

Department of Housing and Urban Development - 7

Indian Housing Information Center - 7

Housing and Urban Development - 7

Some Other Race - 7

Bureau of Labor Statistics - 7

Economic Census - 7

Federal Statistical Research Data Center - 7

Disclosure Review Board - 6

Quarterly Workforce Indicators - 6

Individual Taxpayer Identification Numbers - 6

Survey of Income and Program Participation - 6

Business Dynamics Statistics - 6

SSA Numident - 6

National Opinion Research Center - 6

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 5

Computer Assisted Personal Interview - 5

CATI - 5

Quarterly Census of Employment and Wages - 5

Census Bureau Person Identification Validation System - 5

Census Numident - 5

Census Bureau Business Register - 5

Annual Survey of Manufactures - 5

MAFID - 5

Cornell University - 5

Medicaid Services - 5

Postal Service - 4

Census Bureau Master Address File - 4

Standard Statistical Establishment List - 4

Company Organization Survey - 4

Centers for Medicare - 4

Chicago Census Research Data Center - 4

Unemployment Insurance - 4

Center for Administrative Records Research - 4

Census of Manufactures - 4

PIKed - 4

Sloan Foundation - 3

Supplemental Nutrition Assistance Program - 3

Census Edited File - 3

Census Household Composition Key - 3

University of Chicago - 3

National Center for Health Statistics - 3

Office of Management and Budget - 3

1940 Census - 3

Department of Economics - 3

Alfred P Sloan Foundation - 3

University of Michigan - 3

COVID-19 - 3

Metropolitan Statistical Area - 3

Longitudinal Research Database - 3

Minnesota Population Center - 3

Local Employment Dynamics - 3

Duke University - 3

data - 36

survey - 26

datasets - 24

respondent - 21

microdata - 20

census bureau - 20

census data - 17

matching - 17

data census - 16

agency - 15

database - 15

report - 13

population - 12

statistical - 12

imputation - 11

records census - 10

irs - 10

census records - 10

linkage - 10

matched - 10

identifier - 9

disclosure - 8

federal - 8

use census - 8

census use - 8

census research - 8

ethnicity - 7

hispanic - 7

estimating - 7

coverage - 7

ssa - 7

department - 7

quarterly - 7

information - 7

confidentiality - 6

privacy - 6

publicly - 6

employed - 6

census survey - 6

citizen - 6

business data - 6

aggregate - 6

sector - 6

firms census - 6

census file - 6

1040 - 5

enrollment - 5

employee - 5

filing - 5

census employment - 5

census linked - 5

residence - 5

payroll - 5

enterprise - 5

longitudinal - 5

analysis - 5

associate - 5

survey data - 5

public - 4

minority - 4

ethnic - 4

job - 4

incorporated - 4

employment data - 4

workforce - 4

race - 4

sampling - 4

discrepancy - 4

race census - 4

linked census - 4

resident - 4

census responses - 4

census 2020 - 4

sample - 4

reporting - 4

employ - 4

employment statistics - 4

researcher - 4

research - 4

2010 census - 4

statistical disclosure - 4

model - 4

industrial - 4

statistical agencies - 4

surveys censuses - 3

tenure - 3

assessed - 3

native - 3

state - 3

migration - 3

migrant - 3

medicare - 3

medicaid - 3

recession - 3

establishments data - 3

manufacturing - 3

statistician - 3

financial - 3

demography - 3

inference - 3

ancestry - 3

econometric - 3

earnings - 3

Viewing papers 31 through 40 of 51


  • Working Paper

    RECOVERING THE ITEM-LEVEL EDIT AND IMPUTATION FLAGS IN THE 1977-1997 CENSUSES OF MANUFACTURES

    September 2014

    Authors: T. Kirk White

    Working Paper Number:

    CES-14-37

    As part of processing the Census of Manufactures, the Census Bureau edits some data items and imputes for missing data and some data that is deemed erroneous. Until recently it was difficult for researchers using the plant-level microdata to determine which data items were changed or imputed during the editing and imputation process, because the edit/imputation processing flags were not available to researchers. This paper describes the process of reconstructing the edit/imputation flags for variables in the 1977, 1982, 1987, 1992, and 1997 Censuses of Manufactures using recently recovered Census Bureau files. The paper also reports summary statistics for the percentage of cases that are imputed for key variables. Excluding plants with fewer than 5 employees, imputation rates for several key variables range from 8% to 54% for the manufacturing sector as a whole, and from 1% to 72% at the 2-digit SIC industry level.
    View Full Paper PDF
  • Working Paper

    HIRES, SEPARATIONS, AND THE JOB TENURE DISTRIBUTION IN ADMINISTRATIVE EARNINGS RECORDS

    September 2014

    Working Paper Number:

    CES-14-29

    Statistics on hires, separations, and job tenure have historically been tabulated from survey data. In recent years, these statistics are increasingly being produced from administrative records. In this paper, we discuss the calculation of hires, separations, and job tenure from quarterly administrative records, and we present these labor market statistics calculated from the U.S. Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) program. We pay special attention to a phenomenon that survey data is ill-suited to analyze: single quarter jobs, which we define as jobs in which the hire and separation occur in the same quarter. We explore the trends of hires, separations, tenure, and single quarter jobs in the United States for the years 1998-2010. We discuss issues associated with creating these statistics from quarterly earnings records, and we identify the challenges that remain.
    View Full Paper PDF
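The abstract above defines a single quarter job as one in which the hire and the separation occur in the same quarter. The following is a minimal sketch of how such counts might be tabulated from quarterly earnings records. It assumes (this is not stated in the abstract) that a hire is a quarter with earnings at an employer and none in the previous quarter, and that a separation is a quarter with earnings and none in the next quarter. The function name `job_flows` and the record layout are hypothetical.

```python
from collections import Counter, defaultdict


def job_flows(records, first_quarter, last_quarter):
    """Count hires, separations, and single quarter jobs by quarter.

    records: iterable of (worker_id, employer_id, quarter) tuples, one per
    quarter in which the worker had positive earnings at the employer.
    Quarters are consecutive integers (hypothetical layout).
    """
    # Collect the set of quarters with earnings for each worker-employer pair.
    quarters_by_job = defaultdict(set)
    for worker, employer, quarter in records:
        quarters_by_job[(worker, employer)].add(quarter)

    flows = Counter()
    for quarters in quarters_by_job.values():
        for q in quarters:
            # Assumed definitions: a hire has no earnings in q - 1, a
            # separation has no earnings in q + 1. Events at the edges of
            # the observation window are censored and are not counted.
            hired = q != first_quarter and (q - 1) not in quarters
            separated = q != last_quarter and (q + 1) not in quarters
            if hired:
                flows[(q, "hires")] += 1
            if separated:
                flows[(q, "separations")] += 1
            if hired and separated:
                flows[(q, "single_quarter_jobs")] += 1
    return flows
```

Skipping events at the first and last observed quarter is one illustrative way to handle censoring; other treatments are possible.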
  • Working Paper

    Creating Linked Historical Data: An Assessment of the Census Bureau's Ability to Assign Protected Identification Keys to the 1960 Census

    September 2014

    Working Paper Number:

    carra-2014-12

    In order to study social phenomena over the course of the 20th century, the Census Bureau is investigating the feasibility of digitizing historical census records and linking them to contemporary data. However, historical censuses have limited personally identifiable information available to match on. In this paper, I discuss the problems associated with matching older censuses to contemporary data files, and I describe the matching process used to match a small sample of the 1960 census to the Social Security Administration Numeric Identification System.
    View Full Paper PDF
  • Working Paper

    Person Matching in Historical Files using the Census Bureau's Person Validation System

    September 2014

    Working Paper Number:

    carra-2014-11

    The recent release of the 1940 Census manuscripts enables the creation of longitudinal data spanning the whole of the twentieth century. Linked historical and contemporary data would allow unprecedented analyses of the causes and consequences of health, demographic, and economic change. The Census Bureau is uniquely equipped to provide high quality linkages of person records across datasets. This paper summarizes the linkage techniques employed by the Census Bureau and discusses utilization of these techniques to append protected identification keys to the 1940 Census.
    View Full Paper PDF
  • Working Paper

    2010 American Community Survey Match Study

    July 2014

    Working Paper Number:

    carra-2014-03

    Using administrative records data from federal government agencies and commercial sources, the 2010 ACS Match Study measures administrative records coverage of 2010 ACS addresses, persons, and persons at addresses at different levels of geography as well as by demographic characteristics and response mode. The 2010 ACS Match Study represents a continuation of the research undertaken in the 2010 Census Match Study, the first national-level evaluation of administrative records data coverage. Preliminary results indicate that administrative records provide substantial coverage for addresses and persons in the 2010 ACS (92.7 and 92.1 percent respectively), and less extensive though substantial coverage, for person-address pairs (74.3 percent). In addition, some variation in address, person and/or person-address coverage is found across demographic and response mode groups. This research informs future uses of administrative records in survey and decennial census operations to address the increasing costs of data collection and declining response rates.
    View Full Paper PDF
  • Working Paper

    Estimating Record Linkage False Match Rate for the Person Identification Validation System

    July 2014

    Working Paper Number:

    carra-2014-02

    The Census Bureau Person Identification Validation System (PVS) assigns unique person identifiers to federal, commercial, census, and survey data to facilitate linkages across files. PVS uses probabilistic matching to assign a unique Census Bureau identifier for each person. This paper presents a method to measure the false match rate in PVS following the approach of Belin and Rubin (1995). The Belin and Rubin methodology requires truth data to estimate a mixture model. The parameters from the mixture model are used to obtain point estimates of the false match rate for each of the PVS search modules. The truth data requirement is satisfied by the unique access the Census Bureau has to high quality name, date of birth, address and Social Security (SSN) data. Truth data are quickly created for the Belin and Rubin model and do not involve a clerical review process. These truth data are used to create estimates for the Belin and Rubin parameters, making the approach more feasible. Both observed and modeled false match rates are computed for all search modules in federal administrative records data and commercial data.
    View Full Paper PDF
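To illustrate how mixture-model parameters can yield a point estimate of the false match rate, here is a simplified sketch. It assumes that match weights come from two normal components (true matches and false matches) with known parameters. The published Belin and Rubin method involves additional steps, such as transforming the weights and estimating the parameters, that are not reproduced here. The function and argument names are hypothetical.

```python
from statistics import NormalDist


def false_match_rate(cutoff, match_share, match_dist, nonmatch_dist):
    """Estimated share of false matches among pairs scoring above the cutoff.

    match_share: mixture proportion of true matches among all candidate pairs.
    match_dist, nonmatch_dist: NormalDist objects for the two components
    (normality is an assumption of this sketch).
    """
    # Expected mass of each mixture component that lies above the cutoff.
    true_tail = match_share * (1.0 - match_dist.cdf(cutoff))
    false_tail = (1.0 - match_share) * (1.0 - nonmatch_dist.cdf(cutoff))
    return false_tail / (true_tail + false_tail)
```

For example, `false_match_rate(0.0, 0.5, NormalDist(1, 1), NormalDist(-1, 1))` evaluates the rate at a cutoff midway between two equally weighted components. With equal variances, raising the cutoff lowers the estimated rate.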
  • Working Paper

    The Person Identification Validation System (PVS): Applying the Center for Administrative Records Research and Applications' (CARRA) Record Linkage Software

    July 2014

    Working Paper Number:

    carra-2014-01

    The Census Bureau's Person Identification Validation System (PVS) assigns unique person identifiers to federal, commercial, census, and survey data to facilitate linkages across and within files. PVS uses probabilistic matching to assign a unique Census Bureau identifier for each person. The PVS matches incoming files to reference files created with data from the Social Security Administration (SSA) Numerical Identification file, and SSA data with addresses obtained from federal files. This paper describes the PVS methodology from editing input data to creating the final file.
    View Full Paper PDF
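The abstract above says that PVS uses probabilistic matching but does not give the scoring details. As a generic illustration only, the sketch below scores a candidate record pair in the Fellegi-Sunter style, a common form of probabilistic matching. The field names and the m and u probabilities are hypothetical and are not taken from PVS.

```python
import math


def match_weight(agreements, m_probs, u_probs):
    """Sum field-level agreement and disagreement weights for one record pair.

    agreements: dict mapping field name to True (fields agree) or False.
    m_probs: P(field agrees | records refer to the same person).
    u_probs: P(field agrees | records refer to different people).
    """
    total = 0.0
    for field, agrees in agreements.items():
        m, u = m_probs[field], u_probs[field]
        if agrees:
            # Agreement on a field raises the score.
            total += math.log2(m / u)
        else:
            # Disagreement lowers it.
            total += math.log2((1.0 - m) / (1.0 - u))
    return total
```

In this kind of scheme, pairs whose total weight exceeds a chosen cutoff are treated as matches, and the cutoff trades off missed matches against false matches.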
  • Working Paper

    Comparison of Survey, Federal, and Commercial Address Data Quality

    June 2014

    Authors: Quentin Brummet

    Working Paper Number:

    carra-2014-06

    This report summarizes matching of survey, commercial, and administrative records housing units to the Census Bureau Master Address File (MAF). We document overall MAF match rates in each data set and evaluate differences in match rates across a variety of housing characteristics. Results show that over 90 percent of records in survey data from the American Housing Survey (AHS) match to the MAF. Commercial data from CoreLogic matches at much lower rates, in part due to missing address information and poor match rates for multi-unit buildings. MAF match rates for administrative records from the Department of Housing and Urban Development are also high, and open the possibility of using this information in surveys such as the AHS.
    View Full Paper PDF
  • Working Paper

    The Nature of the Bias When Studying Only Linkable Person Records: Evidence from the American Community Survey

    April 2014

    Working Paper Number:

    carra-2014-08

    Record linkage across survey and administrative records sources can greatly enrich data and improve their quality. The linkage can reduce respondent burden and nonresponse follow-up costs. This is particularly important in an era of declining survey response rates and tight budgets. Record linkage also creates statistical bias, however. The U.S. Census Bureau links person records through its Person Identification Validation System (PVS), assigning each record a Protected Identification Key (PIK). It is not possible to reliably assign a PIK to every record, either due to insufficient identifying information or because the information does not uniquely match any of the administrative records used in the person validation process. Non-random ability to assign a PIK can potentially inject bias into statistics using linked data. This paper studies the nature of this bias using the 2009 and 2010 American Community Survey (ACS). The ACS is well-suited for this analysis, as it contains a rich set of person characteristics that can describe the bias. We estimate probit models for whether a record is assigned a PIK. The results suggest that young children, minorities, residents of group quarters, immigrants, recent movers, low-income individuals, and non-employed individuals are less likely to receive a PIK using 2009 ACS. Changes to the PVS process in 2010 significantly addressed the young children deficit, attenuated the other biases, and increased the validated records share from 88.1 to 92.6 percent (person-weighted).
    View Full Paper PDF
  • Working Paper

    A FIRST STEP TOWARDS A GERMAN SYNLBD: CONSTRUCTING A GERMAN LONGITUDINAL BUSINESS DATABASE

    February 2014

    Working Paper Number:

    CES-14-13

    One major criticism against the use of synthetic data has been that the efforts necessary to generate useful synthetic data are so intense that many statistical agencies cannot afford them. We argue many lessons in this evolving field have been learned in the early years of synthetic data generation, and can be used in the development of new synthetic data products, considerably reducing the required investments. The final goal of the project described in this paper will be to evaluate whether synthetic data algorithms developed in the U.S. to generate a synthetic version of the Longitudinal Business Database (LBD) can easily be transferred to generate a similar data product for other countries. We construct a German data product with information comparable to the LBD - the German Longitudinal Business Database (GLBD) - that is generated from different administrative sources at the Institute for Employment Research, Germany. In a future step, the algorithms developed for the synthesis of the LBD will be applied to the GLBD. Extensive evaluations will illustrate whether the algorithms provide useful synthetic data without further adjustment. The ultimate goal of the project is to provide access to multiple synthetic datasets similar to the SynLBD at Cornell to enable comparative studies between countries. The Synthetic GLBD is a first step towards that goal.
    View Full Paper PDF