CREAT: Census Research Exploration and Analysis Tool

Papers Containing Tag(s): 'Personally Identifiable Information'


Frequently Occurring Concepts within this Search

Protected Identification Key - 33

American Community Survey - 30

Internal Revenue Service - 27

Social Security Number - 27

Social Security Administration - 24

Person Validation System - 24

Census Bureau Disclosure Review Board - 22

Current Population Survey - 20

Person Identification Validation System - 17

Social Security - 15

2010 Census - 13

Individual Taxpayer Identification Numbers - 11

Supplemental Nutrition Assistance Program - 10

Disclosure Review Board - 10

Some Other Race - 10

Department of Housing and Urban Development - 9

Temporary Assistance for Needy Families - 9

Decennial Census - 9

Census Numident - 9

Master Address File - 9

Administrative Records - 9

Center for Administrative Records Research and Applications - 9

1940 Census - 8

Housing and Urban Development - 7

W-2 - 7

Census Household Composition Key - 7

Computer Assisted Personal Interview - 7

SSA Numident - 7

Center for Economic Studies - 6

Earned Income Tax Credit - 6

Census Edited File - 6

Longitudinal Employer Household Dynamics - 6

Office of Management and Budget - 6

Census Bureau Master Address File - 6

Survey of Income and Program Participation - 5

Adjusted Gross Income - 5

PIKed - 5

MAFID - 5

Federal Statistical Research Data Center - 5

Indian Health Service - 5

Ordinary Least Squares - 5

Bureau of Labor Statistics - 5

Employer Identification Numbers - 5

Business Register - 5

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 4

Indian Housing Information Center - 4

CATI - 4

COVID-19 - 4

NUMIDENT - 4

Citizenship and Immigration Services - 4

Data Management System - 4

Longitudinal Business Database - 4

Department of Health and Human Services - 4

Center for Administrative Records Research - 4

North American Industry Classification System - 4

DOB - 4

National Opinion Research Center - 4

Federal Reserve Bank - 3

Federal Tax Information - 3

National Academy of Sciences - 3

National Center for Health Statistics - 3

Integrated Public Use Microdata Series - 3

Current Population Survey Annual Social and Economic Supplement - 3

Census Bureau Person Identification Validation System - 3

Postal Service - 3

Master Beneficiary Record - 3

Customs and Border Protection - 3

Department of Justice - 3

Cornell Institute for Social and Economic Research - 3

American Housing Survey - 3

Medicaid Services - 3

Social Science Research Institute - 3

Service Annual Survey - 3

Minnesota Population Center - 3



  • Working Paper

    Public-Use vs. Restricted-Use: An Analysis Using the American Community Survey

    January 2017

    Working Paper Number:

    CES-17-12

    Statistical agencies frequently publish microdata that have been altered to protect confidentiality. Such data retain utility for many types of broad analyses but can yield biased or insufficiently precise results in others. Research access to de-identified versions of the restricted-use data with little or no alteration is often possible, albeit costly and time-consuming. We investigate the advantages and disadvantages of public-use and restricted-use data from the American Community Survey (ACS) in constructing a wage index. The public-use data used were Public Use Microdata Samples, while the restricted-use data were accessed via a Federal Statistical Research Data Center. We discuss the advantages and disadvantages of each data source and compare estimated CWIs and standard errors at the state and labor market levels.
  • Working Paper

    Playing with Matches: An Assessment of Accuracy in Linked Historical Data

    June 2016

    Working Paper Number:

    carra-2016-05

    This paper evaluates linkage quality achieved by various record linkage techniques used in historical demography. I create benchmark, or truth, data by linking the 2005 Current Population Survey Annual Social and Economic Supplement to the Social Security Administration's Numeric Identification System by Social Security Number. By comparing simulated linkages to the benchmark data, I examine the value added (in terms of number and quality of links) from incorporating text-string comparators, adjusting age, and using a probabilistic matching algorithm. I find that text-string comparators and probabilistic approaches are useful for increasing the linkage rate, but use of text-string comparators may decrease accuracy in some cases. Overall, probabilistic matching offers the best balance between linkage rates and accuracy.
  • Working Paper

    Assessing Coverage and Quality of the 2007 Prototype Census Kidlink Database

    September 2015

    Working Paper Number:

    carra-2015-07

    The Census Bureau is conducting research to expand the use of administrative records data in censuses and surveys to decrease respondent burden and reduce costs while improving data quality. Much of this research (e.g., Rastogi and O'Hara (2012), Luque and Bhaskar (2014)) hinges on the ability to integrate multiple data sources by linking individuals across files. One of the Census Bureau's record linkage methodologies for data integration is the Person Identification Validation System or PVS. PVS assigns anonymous and unique IDs (Protected Identification Keys or PIKs) that serve as linkage keys across files. Prior research showed that integrating 'known associates' information into PVS's reference files could potentially enhance PVS's PIK assignment rates. The term 'known associates' refers to people who are likely to be associated with each other because of a known common link (such as family relationships or people sharing a common address), and thus, to be observed together in different files. One of the results from this prior research was the creation of the 2007 Census Kidlink file, a child-level file linking a child's Social Security Number (SSN) record to the SSN of those identified as the child's parents. In this paper, we examine to what extent the 2007 Census Kidlink methodology was able to link parents' SSNs to children's SSN records, and also evaluate the quality of those links. We find that in approximately 80 percent of cases, at least one parent was linked to the child's record. Younger children and noncitizens have a higher percentage of cases where neither parent could be linked to the child. Using 2007 tax data as a benchmark, our quality evaluation results indicate that in at least 90 percent of the cases, the parent-child link agreed with those found in the tax data. Based on our findings, we propose improvements to the 2007 Kidlink methodology to increase child-parent links, and discuss how the creation of the file could be operationalized moving forward.
  • Working Paper

    The EITC over the business cycle: Who benefits?

    December 2014

    Authors: Maggie R. Jones

    Working Paper Number:

    carra-2014-15

    In this paper, I examine the impact of the Great Recession on Earned Income Tax Credit (EITC) eligibility. Because the EITC is structurally tied to earnings, the direction of this impact is not immediately obvious. Families who experience complete job loss for an entire tax year lose eligibility, while those experiencing underemployment (part-year employment, a reduction in hours, or spousal unemployment in married households) may become eligible. Determining the direction and magnitude of the impact is important for a number of reasons. The EITC has become the largest cash-transfer program in the U.S., and many low-earning families rely on it as a means of support in tough times. The program has largely been viewed as a replacement for welfare, enticing former welfare recipients into the labor force. However, the effectiveness of the EITC during a period of very high unemployment has not been assessed. To answer these questions, I first use the Current Population Survey (CPS) matched to Internal Revenue Service data from tax years 2005 to 2010 to assess patterns of employment and eligibility over the Great Recession for different labor-force groups. Results indicate that overall, EITC eligibility increased over the recession, but only among groups that were cushioned from total household earnings loss by marriage. I also use the 2006 CPS matched to tax data from 2005 through 2011 to examine changes in eligibility experienced by individuals over time. In assessing three competing causes of eligibility loss, I find that less-educated, unmarried women experienced a greater hazard of eligibility loss due to a yearlong lack of earnings compared with other labor-market groups. I discuss the implications of these findings on the view of the EITC as a safety-net program.
  • Working Paper

    Coverage and Agreement of Administrative Records and 2010 American Community Survey Demographic Data

    November 2014

    Working Paper Number:

    carra-2014-14

    The U.S. Census Bureau is researching possible uses of administrative records in decennial census and survey operations. The 2010 Census Match Study and American Community Survey (ACS) Match Study represent recent efforts by the Census Bureau to evaluate the extent to which administrative records provide data on persons and addresses in the 2010 Census and 2010 ACS. The 2010 Census Match Study also examines demographic response data collected in administrative records. Building on this analysis, we match data from the 2010 ACS to federal administrative records and third party data as well as to previous census data and examine administrative records coverage and agreement of ACS age, sex, race, and Hispanic origin responses. We find high levels of coverage and agreement for sex and age responses and variable coverage and agreement across race and Hispanic origin groups. These results are similar to findings from the 2010 Census Match Study.
  • Working Paper

    Do Doubled-up Families Minimize Household-level Tax Burden?

    September 2014

    Working Paper Number:

    carra-2014-13

    This paper examines a method of tax avoidance not previously studied: the sorting of dependent children among related filers who have 'doubled up' in a household for economic reasons. Using the Current Population Survey Annual Social and Economic Supplement (CPS ASEC) linked with 1040 data from the Internal Revenue Service (IRS), we examine households with children and at least two adult tax filers to determine whether the household minimizes income tax burden, and thus maximizes refunds, by optimally claiming dependents. We examine specifically the relationship between the Earned Income Tax Credit (EITC) and the sorting of dependent children among filers in households. We find the following: The propensity to sort increases as the number of filers who are potentially eligible for the EITC increases; sorting probability increases as the optimal household EITC amount increases; and among households with at least one EITC-eligible filer, the propensity to sort increases as the difference between modeled household EITC amount and the optimal amount increases. We also exploit the 2009 change in EITC benefit for families with three or more children, finding that the propensity to sort to exactly three children increased among EITC-eligible filers after the rule change. The results of this analysis improve our understanding of filing behavior, particularly how households form filing units and pool resources, and have implications for poverty measurement in complex households. This presentation was given at the CARRA Seminar, July 16, 2014.
  • Working Paper

    Creating Linked Historical Data: An Assessment of the Census Bureau's Ability to Assign Protected Identification Keys to the 1960 Census

    September 2014

    Working Paper Number:

    carra-2014-12

    In order to study social phenomena over the course of the 20th century, the Census Bureau is investigating the feasibility of digitizing historical census records and linking them to contemporary data. However, historical censuses have limited personally identifiable information available to match on. In this paper, I discuss the problems associated with matching older censuses to contemporary data files, and I describe the matching process used to match a small sample of the 1960 census to the Social Security Administration Numeric Identification System.
  • Working Paper

    Person Matching in Historical Files using the Census Bureau's Person Validation System

    September 2014

    Working Paper Number:

    carra-2014-11

    The recent release of the 1940 Census manuscripts enables the creation of longitudinal data spanning the whole of the twentieth century. Linked historical and contemporary data would allow unprecedented analyses of the causes and consequences of health, demographic, and economic change. The Census Bureau is uniquely equipped to provide high quality linkages of person records across datasets. This paper summarizes the linkage techniques employed by the Census Bureau and discusses utilization of these techniques to append protected identification keys to the 1940 Census.
  • Working Paper

    Within and Across County Variation in SNAP Misreporting: Evidence from Linked ACS and Administrative Records

    July 2014

    Working Paper Number:

    carra-2014-05

    This paper examines sub-state spatial and temporal variation in misreporting of participation in the Supplemental Nutrition Assistance Program (SNAP) using several years of the American Community Survey linked to SNAP administrative records from New York (2008-2010) and Texas (2006-2009). I calculate county false-negative (FN) and false-positive (FP) rates for each year of observation and find that, within a given state and year, there is substantial heterogeneity in FN rates across counties. In addition, I find evidence that FN rates (but not FP rates) persist over time within counties. This persistence in FN rates is strongest among more populous counties, suggesting that when noise from sampling variation is not an issue, some counties have consistently high FN rates while others have consistently low FN rates. This finding is important for understanding how misreporting might bias estimates of sub-state SNAP participation rates, changes in those participation rates, and effects of program participation. This presentation was given at the CARRA Seminar, June 27, 2013.
  • Working Paper

    2010 American Community Survey Match Study

    July 2014

    Working Paper Number:

    carra-2014-03

    Using administrative records data from federal government agencies and commercial sources, the 2010 ACS Match Study measures administrative records coverage of 2010 ACS addresses, persons, and persons at addresses at different levels of geography as well as by demographic characteristics and response mode. The 2010 ACS Match Study represents a continuation of the research undertaken in the 2010 Census Match Study, the first national-level evaluation of administrative records data coverage. Preliminary results indicate that administrative records provide substantial coverage for addresses and persons in the 2010 ACS (92.7 and 92.1 percent, respectively), and less extensive, though still substantial, coverage for person-address pairs (74.3 percent). In addition, some variation in address, person, and person-address coverage is found across demographic and response mode groups. This research informs future uses of administrative records in survey and decennial census operations to address the increasing costs of data collection and declining response rates.