CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'record'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

Internal Revenue Service - 24

Social Security Administration - 22

Protected Identification Key - 22

American Community Survey - 21

Center for Economic Studies - 18

Service Annual Survey - 17

Social Security Number - 15

Person Validation System - 14

National Science Foundation - 13

2010 Census - 13

Census Bureau Disclosure Review Board - 12

North American Industry Classification System - 12

Personally Identifiable Information - 11

Research Data Center - 11

Person Identification Validation System - 10

Social Security - 10

Longitudinal Business Database - 10

Indian Health Service - 9

Master Address File - 9

Standard Industrial Classification - 9

Administrative Records - 9

Center for Administrative Records Research and Applications - 9

Longitudinal Employer Household Dynamics - 8

Current Population Survey - 8

County Business Patterns - 8

Employer Identification Numbers - 8

Business Register - 8

Decennial Census - 7

Department of Housing and Urban Development - 7

Indian Housing Information Center - 7

Housing and Urban Development - 7

Some Other Race - 7

Bureau of Labor Statistics - 7

Economic Census - 7

Federal Statistical Research Data Center - 7

Disclosure Review Board - 6

Quarterly Workforce Indicators - 6

Individual Taxpayer Identification Numbers - 6

Survey of Income and Program Participation - 6

Business Dynamics Statistics - 6

SSA Numident - 6

National Opinion Research Center - 6

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 5

Computer Assisted Personal Interview - 5

CATI - 5

Quarterly Census of Employment and Wages - 5

Census Bureau Person Identification Validation System - 5

Census Numident - 5

Census Bureau Business Register - 5

Annual Survey of Manufactures - 5

MAFID - 5

Cornell University - 5

Medicaid Services - 5

Postal Service - 4

Census Bureau Master Address File - 4

Standard Statistical Establishment List - 4

Company Organization Survey - 4

Centers for Medicare - 4

Chicago Census Research Data Center - 4

Unemployment Insurance - 4

Center for Administrative Records Research - 4

Census of Manufactures - 4

PIKed - 4

Sloan Foundation - 3

Supplemental Nutrition Assistance Program - 3

Census Edited File - 3

Census Household Composition Key - 3

University of Chicago - 3

National Center for Health Statistics - 3

Office of Management and Budget - 3

1940 Census - 3

Department of Economics - 3

Alfred P Sloan Foundation - 3

University of Michigan - 3

COVID-19 - 3

Metropolitan Statistical Area - 3

Longitudinal Research Database - 3

Minnesota Population Center - 3

Local Employment Dynamics - 3

Duke University - 3

data - 36

survey - 26

datasets - 24

respondent - 21

microdata - 20

census bureau - 20

census data - 17

matching - 17

data census - 16

agency - 15

database - 15

report - 13

population - 12

statistical - 12

imputation - 11

records census - 10

irs - 10

census records - 10

linkage - 10

matched - 10

identifier - 9

disclosure - 8

federal - 8

use census - 8

census use - 8

census research - 8

ethnicity - 7

hispanic - 7

estimating - 7

coverage - 7

ssa - 7

department - 7

quarterly - 7

information - 7

confidentiality - 6

privacy - 6

publicly - 6

employed - 6

census survey - 6

citizen - 6

business data - 6

aggregate - 6

sector - 6

firms census - 6

census file - 6

1040 - 5

enrollment - 5

employee - 5

filing - 5

census employment - 5

census linked - 5

residence - 5

payroll - 5

enterprise - 5

longitudinal - 5

analysis - 5

associate - 5

survey data - 5

public - 4

minority - 4

ethnic - 4

job - 4

incorporated - 4

employment data - 4

workforce - 4

race - 4

sampling - 4

discrepancy - 4

race census - 4

linked census - 4

resident - 4

census responses - 4

census 2020 - 4

sample - 4

reporting - 4

employ - 4

employment statistics - 4

researcher - 4

research - 4

2010 census - 4

statistical disclosure - 4

model - 4

industrial - 4

statistical agencies - 4

surveys censuses - 3

tenure - 3

assessed - 3

native - 3

state - 3

migration - 3

migrant - 3

medicare - 3

medicaid - 3

recession - 3

establishments data - 3

manufacturing - 3

statistician - 3

financial - 3

demography - 3

inference - 3

ancestry - 3

econometric - 3

earnings - 3

Viewing papers 21 through 30 of 51


  • Working Paper

    Who Files for Personal Bankruptcy in the United States?

    January 2017

    Authors: Jonathan Fisher

    Working Paper Number:

    CES-17-54

    Who files for bankruptcy in the United States is not well understood. Previous research relied on small samples from national surveys or a small number of states from administrative records. I use over 10 million administrative bankruptcy records linked to the 2000 Decennial Census and the 2001-2009 American Community Surveys to understand who files for personal bankruptcy. Bankruptcy filers are middle income, more likely to be divorced, more likely to be black, more likely to have a terminal high school degree or some college, and more likely to be middle-aged. Bankruptcy filers are more likely to be employed than the U.S. as a whole, and they are more likely to be employed 50-52 weeks. The bankruptcy population is aging faster than the U.S. population as a whole. Lastly, using the pseudo-panels I study what happens in the years around bankruptcy. Individuals are likely to get divorced in the years before bankruptcy and then remarry. Income falls before bankruptcy and then rises after bankruptcy.
    View Full Paper PDF
  • Working Paper

    A Comparison of Training Modules for Administrative Records Use in Nonresponse Followup Operations: The 2010 Census and the American Community Survey

    January 2017

    Working Paper Number:

    CES-17-47

    While modeling work in preparation for the 2020 Census has shown that administrative records can be predictive of Nonresponse Followup (NRFU) enumeration outcomes, there is scope to examine the robustness of the models by using more recent training data. The models deployed for workload removal from the 2015 and 2016 Census Tests were based on associations of the 2010 Census with administrative records. Training the same models with more recent data from the American Community Survey (ACS) can identify any changes in parameter associations over time that might reduce the accuracy of model predictions. Furthermore, more recent training data would allow for the incorporation of new administrative record sources not available in 2010. However, differences in ACS methodology and the smaller sample size may limit its applicability. This paper replicates earlier results and examines model predictions based on the ACS in comparison with NRFU outcomes. The evaluation consists of a comparison of predicted counts and household compositions with actual 2015 NRFU outcomes. The main findings are an overall validation of the methodology using independent data.
    View Full Paper PDF
  • Working Paper

    File Matching with Faulty Continuous Matching Variables

    January 2017

    Working Paper Number:

    CES-17-45

    We present LFCMV, a Bayesian file linking methodology designed to link records using continuous matching variables in situations where we do not expect values of these matching variables to agree exactly across matched pairs. The method involves a linking model for the distance between the matching variables of records in one file and the matching variables of their linked records in the second. This linking model is conditional on a vector indicating the links. We specify a mixture model for the distance component of the linking model, as this latent structure allows the distance between matching variables in linked pairs to vary across types of linked pairs. Finally, we specify a model for the linking vector. We describe the Gibbs sampling algorithm for sampling from the posterior distribution of this linkage model and use artificial data to illustrate model performance. We also introduce a linking application using public survey information and data from the U.S. Census of Manufactures and use LFCMV to link the records.
    View Full Paper PDF
  • Working Paper

    Playing with Matches: An Assessment of Accuracy in Linked Historical Data

    June 2016

    Working Paper Number:

    carra-2016-05

    This paper evaluates linkage quality achieved by various record linkage techniques used in historical demography. I create benchmark, or truth, data by linking the 2005 Current Population Survey Annual Social and Economic Supplement to the Social Security Administration's Numeric Identification System by Social Security Number. By comparing simulated linkages to the benchmark data, I examine the value added (in terms of number and quality of links) from incorporating text-string comparators, adjusting age, and using a probabilistic matching algorithm. I find that text-string comparators and probabilistic approaches are useful for increasing the linkage rate, but use of text-string comparators may decrease accuracy in some cases. Overall, probabilistic matching offers the best balance between linkage rates and accuracy.
    View Full Paper PDF
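The comparison described in the abstract above (exact name matching versus a text-string comparator, scored against benchmark links) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the record fields, the `age_tolerance` rule standing in for "adjusting age", the 0.85 threshold, and the use of Python's `difflib` similarity ratio rather than any comparator the paper actually uses. It is a deterministic threshold rule, not the probabilistic matching algorithm the paper evaluates.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity in [0, 1] between two names (difflib ratio, case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(file_a, file_b, threshold=0.85, age_tolerance=1):
    """Link each record in file_a to its most similar record in file_b.

    threshold=1.0 reproduces exact name matching; lower values allow
    spelling variation. Both parameters are hypothetical choices.
    """
    links = {}
    for rec_a in file_a:
        best_id, best_score = None, threshold
        for rec_b in file_b:
            # Skip candidates whose ages differ by more than the tolerance.
            if abs(rec_a["age"] - rec_b["age"]) > age_tolerance:
                continue
            score = name_similarity(rec_a["name"], rec_b["name"])
            if score >= best_score:
                best_id, best_score = rec_b["id"], score
        if best_id is not None:
            links[rec_a["id"]] = best_id
    return links


def evaluate(links, truth):
    """Return (linkage rate, accuracy) of links against benchmark truth links."""
    n_links = len(links)
    n_correct = sum(1 for a, b in links.items() if truth.get(a) == b)
    linkage_rate = n_links / len(truth) if truth else 0.0
    accuracy = n_correct / n_links if n_links else 0.0
    return linkage_rate, accuracy
```

Running `link_records` once with `threshold=1.0` and once with a lower threshold, then passing each result to `evaluate`, gives the kind of linkage-rate versus accuracy trade-off the abstract discusses.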
  • Working Paper

    Using Partially Synthetic Microdata to Protect Sensitive Cells in Business Statistics

    February 2016

    Working Paper Number:

    CES-16-10

    We describe and analyze a method that blends records from both observed and synthetic microdata into public-use tabulations on establishment statistics. The resulting tables use synthetic data only in potentially sensitive cells. We describe different algorithms, and present preliminary results when applied to the Census Bureau's Business Dynamics Statistics and Synthetic Longitudinal Business Database, highlighting accuracy and protection afforded by the method when compared to existing public-use tabulations (with suppressions).
    View Full Paper PDF
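The idea in the abstract above, publishing observed values except in potentially sensitive cells, where synthetic values are substituted, might be sketched as follows. The sensitivity rule (`min_count`, a minimum number of establishments per cell), the record fields, and the function names are all hypothetical; the paper's own algorithms and disclosure rules are not reproduced here.

```python
from collections import defaultdict


def tabulate(records):
    """Aggregate establishment records into cells: count and total employment."""
    cells = defaultdict(lambda: {"n": 0, "employment": 0})
    for r in records:
        cell = cells[r["cell"]]
        cell["n"] += 1
        cell["employment"] += r["employment"]
    return dict(cells)


def blend_tables(observed, synthetic, min_count=3):
    """Build a published table that uses synthetic totals only in sensitive cells.

    A cell is treated as sensitive when it has fewer than min_count observed
    establishments (an assumed rule). Sensitive cells with no synthetic
    counterpart are suppressed (employment is None).
    """
    obs = tabulate(observed)
    syn = tabulate(synthetic)
    published = {}
    for key, cell in obs.items():
        if cell["n"] >= min_count:
            published[key] = {"employment": cell["employment"], "source": "observed"}
        elif key in syn:
            published[key] = {"employment": syn[key]["employment"], "source": "synthetic"}
        else:
            published[key] = {"employment": None, "source": "suppressed"}
    return published
```

Recording a `source` flag per cell is a design choice of this sketch; it makes it easy to count how many cells would otherwise have been suppressed.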
  • Working Paper

    Measuring Cross-Country Differences in Misallocation

    January 2016

    Working Paper Number:

    CES-16-50R

    We describe differences between the commonly used version of the U.S. Census of Manufactures available at the RDCs and what establishments themselves report. The originally reported data has substantially more dispersion in measured establishment productivity. Measured allocative efficiency is substantially higher in the cleaned data than the raw data: 4x higher in 2002, 20x in 2007, and 80x in 2012. Many of the important editing strategies at the Census, including industry analysts' manual edits and edits using tax records, are infeasible in non-U.S. datasets. We describe a new Bayesian approach for editing and imputation that can be used across contexts.
    View Full Paper PDF
  • Working Paper

    When Race and Hispanic Origin Reporting are Discrepant Across Administrative Records and Third Party Sources: Exploring Methods to Assign Responses

    December 2015

    Working Paper Number:

    carra-2015-08

    The U.S. Census Bureau is researching uses of administrative records and third party data in survey and decennial census operations. One potential use of administrative records is to utilize these data when race and Hispanic origin responses are missing. When federal and third party administrative records are compiled, race and Hispanic origin responses are not always the same for an individual across sources. We explore different methods to assign one race and one Hispanic response when these responses are discrepant. We also describe the characteristics of individuals with matching, non-matching, and missing race and Hispanic origin data by demographic, household, and contextual variables. We find that minorities, especially Hispanics, are more likely to have non-matching Hispanic origin and race responses in administrative records and third party data compared to the 2010 Census. Minority groups and individuals ages 0-17 are more likely to have missing race or Hispanic origin data in administrative records and third party data. Larger households tend to have more missing race data in administrative records and third party data than smaller households.
    View Full Paper PDF
  • Working Paper

    Assessing Coverage and Quality of the 2007 Prototype Census Kidlink Database

    September 2015

    Working Paper Number:

    carra-2015-07

    The Census Bureau is conducting research to expand the use of administrative records data in censuses and surveys to decrease respondent burden and reduce costs while improving data quality. Much of this research (e.g., Rastogi and O'Hara (2012), Luque and Bhaskar (2014)) hinges on the ability to integrate multiple data sources by linking individuals across files. One of the Census Bureau's record linkage methodologies for data integration is the Person Identification Validation System or PVS. PVS assigns anonymous and unique IDs (Protected Identification Keys or PIKs) that serve as linkage keys across files. Prior research showed that integrating 'known associates' information into PVS's reference files could potentially enhance PVS's PIK assignment rates. The term 'known associates' refers to people that are likely to be associated with each other because of a known common link (such as family relationships or people sharing a common address), and thus, to be observed together in different files. One of the results from this prior research was the creation of the 2007 Census Kidlink file, a child-level file linking a child's Social Security Number (SSN) record to the SSN of those identified as the child's parents. In this paper, we examine to what extent the 2007 Census Kidlink methodology was able to link parents' SSNs to children's SSN records, and also evaluate the quality of those links. We find that in approximately 80 percent of cases, at least one parent was linked to the child's record. Younger children and noncitizens have a higher percentage of cases where neither parent could be linked to the child. Using 2007 tax data as a benchmark, our quality evaluation results indicate that in at least 90 percent of the cases, the parent-child link agreed with those found in the tax data. Based on our findings, we propose improvements to the 2007 Kidlink methodology to increase child-parent links, and discuss how the creation of the file could be operationalized moving forward.
    View Full Paper PDF
  • Working Paper

    Matching Addresses between Household Surveys and Commercial Data

    July 2015

    Authors: Quentin Brummet

    Working Paper Number:

    carra-2015-04

    Matching third-party data sources to household surveys can benefit household surveys in a number of ways, but the utility of these new data sources depends critically on our ability to link units between data sets. To understand this better, this report discusses potential modifications to the existing match process that could improve our matches. While many changes to the matching procedure produce marginal improvements in match rates, substantial increases in match rates can only be achieved by relaxing the definition of a successful match. In the end, the results show that the most important factor determining the success of matching procedures is the quality and composition of the data sets being matched.
    View Full Paper PDF
  • Working Paper

    Coverage and Agreement of Administrative Records and 2010 American Community Survey Demographic Data

    November 2014

    Working Paper Number:

    carra-2014-14

    The U.S. Census Bureau is researching possible uses of administrative records in decennial census and survey operations. The 2010 Census Match Study and American Community Survey (ACS) Match Study represent recent efforts by the Census Bureau to evaluate the extent to which administrative records provide data on persons and addresses in the 2010 Census and 2010 ACS. The 2010 Census Match Study also examines demographic response data collected in administrative records. Building on this analysis, we match data from the 2010 ACS to federal administrative records and third party data as well as to previous census data and examine administrative records coverage and agreement of ACS age, sex, race, and Hispanic origin responses. We find high levels of coverage and agreement for sex and age responses and variable coverage and agreement across race and Hispanic origin groups. These results are similar to findings from the 2010 Census Match Study.
    View Full Paper PDF