CREAT: Census Research Exploration and Analysis Tool

Papers Containing Tag(s): 'Personally Identifiable Information'


Frequently Occurring Concepts within this Search

Protected Identification Key - 31

American Community Survey - 27

Social Security Number - 26

Social Security Administration - 24

Internal Revenue Service - 24

Person Validation System - 23

Census Bureau Disclosure Review Board - 18

Current Population Survey - 18

Person Identification Validation System - 16

Social Security - 14

2020 Census - 13

Disclosure Review Board - 10

Individual Taxpayer Identification Numbers - 10

Administrative Records - 10

Center for Administrative Records Research and Applications - 10

Department of Housing and Urban Development - 9

Temporary Assistance for Needy Families - 9

Decennial Census - 9

Supplemental Nutrition Assistance Program - 9

Census Numident - 9

Some Other Race - 9

Master Address File - 9

SSA Numident - 8

Housing and Urban Development - 7

W-2 - 7

Census Household Composition Key - 7

Computer Assisted Personal Interview - 7

1990 Census - 7

Longitudinal Employer Household Dynamics - 6

Census Bureau Master Address File - 6

Earned Income Tax Credit - 5

Census Edited File - 5

PIKed - 5

Office of Management and Budget - 5

Indian Health Service - 5

Ordinary Least Squares - 5

Bureau of Labor Statistics - 5

Employer Identification Number - 5

Business Register - 5

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 4

Indian Housing Information Center - 4

Adjusted Gross Income - 4

COVID-19 - 4

Center for Economic Studies - 4

Federal Statistical Research Data Center - 4

MAFID - 4

NUMIDENT - 4

Survey of Income and Program Participation - 4

Data Management System - 4

Longitudinal Business Database - 4

Department of Health and Human Services - 4

Center for Administrative Records Research - 4

North American Industry Classification System - 4

National Opinion Research Center - 4

National Center for Health Statistics - 3

Integrated Public Use Microdata Series - 3

Current Population Survey Annual Social and Economic Supplement - 3

Census Bureau Person Identification Validation System - 3

Postal Service - 3

Master Beneficiary Record - 3

Customs and Border Protection - 3

Department of Justice - 3

Cornell Institute for Social and Economic Research - 3

American Housing Survey - 3

Medicaid Services - 3

Social Science Research Institute - 3

Service Annual Survey - 3

Minnesota Population Center - 3

Viewing papers 1 through 10 of 37


  • Working Paper

    From Marcy to Madison Square? The Effects of Growing Up in Public Housing on Early Adulthood Outcomes

    November 2024

    Working Paper Number:

    CES-24-67

    This paper studies the effects of growing up in public housing in New York City on children's long-run outcomes. Using linked administrative data, we exploit variation in the age children move into public housing to estimate the effects of spending an additional year of childhood in public housing on a range of economic and social outcomes in early adulthood. We find that childhood exposure to public housing improves labor market outcomes and reduces participation in federal safety net programs, particularly for children from the most disadvantaged families. Additionally, we find there is some heterogeneity in impacts across public housing developments. Developments located in neighborhoods with relatively fewer renters and higher household incomes are better for children overall. Our estimate of the marginal value of public funds suggests that for every $1 the government spends per child on public housing, children receive $1.40 in benefits, including $2.30 for children from the most disadvantaged families.
  • Working Paper

    Comparison of Child Reporting in the American Community Survey and Federal Income Tax Returns Based on California Birth Records

    September 2024

    Authors: Gloria Aldana

    Working Paper Number:

    CES-24-55

    This paper takes advantage of administrative records from California, a state with a large child population and a significant historical undercount of children in Census Bureau data, dependent information in the Internal Revenue Service (IRS) Form 1040 records, and the American Community Survey to characterize undercounted children and compare child reporting. While IRS Form 1040 records offer potential utility for adjusting child undercounting in Census Bureau surveys, this analysis finds overlapping reporting issues among various demographic and economic groups. Specifically, older children, those of Non-Hispanic Black mothers and Hispanic mothers, children or parents with lower English proficiency, children whose mothers did not complete high school, and families with lower income-to-poverty ratios were less frequently reported in IRS 1040 records than other groups. Therefore, using IRS 1040 dependent records may have limitations for accurately representing populations with characteristics associated with the undercount of children in surveys.
  • Working Paper

    Internal Migration in the U.S. During the COVID-19 Pandemic

    September 2024

    Working Paper Number:

    CES-24-50

    Survey and administrative internal migration data disagree on whether the COVID-19 pandemic increased or decreased mobility in the U.S. Moreover, though scholars have theorized and documented migration in response to environmental hazards and economic shocks, the novel conditions posed by a global pandemic make it difficult to hypothesize whether and how American migration might change as a result. We link individual-level data from the United States Postal Service's National Change of Address (NCOA) registry to American Community Survey (ACS) and Current Population Survey Annual Social and Economic Supplement (CPS-ASEC) responses and other administrative records to document changes in the level, geography, and composition of migrant flows between 2019 and 2021. We find a 2% increase in address changes between 2019 and 2020, representing an additional 603,000 moves, driven primarily by young adults, earners at the extremes of the income distribution, and individuals (as opposed to families) moving over longer distances. Though the number of address changes returned to pre-pandemic levels in 2021, the pandemic-era geographic and compositional shifts in favor of longer distance moves away from the Pacific and Mid-Atlantic regions toward the South and in favor of younger, individual movers persisted. We also show that at least part of the disconnect between survey, media, and administrative/third-party migration data sources stems from the apparent misreporting of address changes on Census Bureau surveys. Among ACS and CPS-ASEC householders linked to NCOA data and filing a permanent change of address in their 1-year survey response reference period, only around 68% of ACS and 49% of CPS-ASEC householders also reported living in a different residence one year ago in their survey response.
  • Working Paper

    Gradient Boosting to Address Statistical Problems Arising from Non-Linkage of Census Bureau Datasets

    June 2024

    Working Paper Number:

    CES-24-27

    This article introduces the twangRDC package, which contains functions to address non-linkage in US Census Bureau datasets. The Census Bureau's Person Identification Validation System facilitates data linkage by assigning unique person identifiers to federal, third party, decennial census, and survey data. Not all records in these datasets can be linked to the reference file and as such not all records will be assigned an identifier. This article is a tutorial for using the twangRDC package to generate nonresponse weights to account for non-linkage of person records across US Census Bureau datasets.
  • Working Paper

    The 2010 Census Confidentiality Protections Failed, Here's How and Why

    December 2023

    Working Paper Number:

    CES-23-63

    Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act.
  • Working Paper

    Producing U.S. Population Statistics Using Multiple Administrative Sources

    November 2023

    Working Paper Number:

    CES-23-58

    We identify several challenges encountered when constructing U.S. administrative record-based (AR-based) population estimates for 2020. Though the AR estimates are higher than the 2020 Census at the national level, they are over 15 percent lower in 5 percent of counties, suggesting that locational accuracy can be improved. Other challenges include how to achieve comprehensive coverage, maintain consistent coverage across time, filter out nonresidents and people not alive on the reference date, uncover missing links across person and address records, and predict demographic characteristics when multiple ones are reported or when they are missing. We discuss several ways of addressing these issues, e.g., building in redundancy with more sources, linking children to their parents' addresses, and conducting additional record linkage for people without Social Security Numbers and for addresses not initially linked to the Census Bureau's Master Address File. We discuss modeling to predict lower levels of geography for people lacking those geocodes, the probability that a person is a U.S. resident on the reference date, the probability that an address is the person's residence on the reference date, and the probability a person is in each demographic characteristic category. Regression results illustrate how many of these challenges and solutions affect the AR county population estimates.
  • Working Paper

    Coverage of Children in the American Community Survey Based on California Birth Records

    September 2023

    Authors: Gloria Aldana

    Working Paper Number:

    CES-23-46

    The U.S. Census Bureau's American Community Survey (ACS) collects information on individuals and households. The ACS provides survey-based estimates of children drawn from a sample of the U.S. population. However, survey responses may not match administrative records, such as birth records. Birth records should provide a complete account of all births, along with child-parent relationships and demographic characteristics. California is a state that has both a large population of children and a high undercount for young children. This paper uses California as a case study to examine differences between reported and unreported children in the ACS based on state birth records. Child reporting rates were lower for more recent data years, younger children, for Black and Hispanic mothers, and for more complex households. Child reporting rates were higher for more educated mothers and for households above the poverty line. Using mother's race and Hispanic ethnicity from the birth records combined with poverty indices from the ACS, this analysis also finds that child reporting does not uniformly vary with poverty status across all race and ethnicity groups. This research builds support for the utility of state birth records in analyzing the undercount of children.
  • Working Paper

    Noncitizen Coverage and Its Effects on U.S. Population Statistics

    August 2023

    Working Paper Number:

    CES-23-42

    We produce population estimates with the same reference date, April 1, 2020, as the 2020 Census of Population and Housing by combining 31 types of administrative record (AR) and third-party sources, including several new to the Census Bureau with a focus on noncitizens. Our AR census national population estimate is higher than other Census Bureau official estimates: 1.8% greater than the 2020 Demographic Analysis high estimate, 3.0% more than the 2020 Census count, and 3.6% higher than the vintage-2020 Population Estimates Program estimate. Our analysis suggests that inclusion of more noncitizens, especially those with unknown legal status, explains the higher AR census estimate. About 19.8% of AR census noncitizens have addresses that cannot be linked to an address in the 2020 Census collection universe, compared to 5.7% of citizens, raising the possibility that the 2020 Census did not collect data for a significant fraction of noncitizens residing in the United States under the residency criteria used for the census. We show differences in estimates by age, sex, Hispanic origin, geography, and socioeconomic characteristics symptomatic of the differences in noncitizen coverage.
  • Working Paper

    Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report

    April 2023

    Working Paper Number:

    CES-23-21

    This report develops a method using administrative records (AR) to fill in responses for nonresponding American Community Survey (ACS) housing units rather than adjusting survey weights to account for selection of a subset of nonresponding housing units for follow-up interviews and for nonresponse bias. The method also inserts AR and modeling in place of edits and imputations for ACS survey citizenship item nonresponses. We produce Citizen Voting-Age Population (CVAP) tabulations using this enhanced CVAP method and compare them to published estimates. The enhanced CVAP method produces a 0.74 percentage point lower citizen share, and it is 3.05 percentage points lower for voting-age Hispanics. The latter result can be partly explained by omissions of voting-age Hispanic noncitizens with unknown legal status from ACS household responses. Weight adjustments may be less effective at addressing nonresponse bias under those conditions.
  • Working Paper

    Self-Employment Income Reporting on Surveys

    April 2023

    Working Paper Number:

    CES-23-19

    We examine the relation between administrative income data and survey reports for self-employed and wage-earning respondents from 2000 to 2015. The self-employed report 40 percent more wages and self-employment income in the survey than in tax administrative records; this estimate nets out differences between these two sources that are also shared by wage-earners. We provide evidence that differential reporting incentives are an important explanation of the larger self-employed gap by exploiting a well-known artifact: self-employed respondents exhibit substantial bunching at the first EITC kink in their administrative records. We do not observe the same behavior in their survey responses even after accounting for survey measurement concerns.
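
Illustrative note on CES-24-27 (non-linkage weights): the abstract describes generating nonresponse weights so that records that received an identifier can stand in for those that did not. The sketch below shows only the general idea of inverse-probability weighting and is not the twangRDC API. twangRDC is an R package that estimates linkage propensities with gradient boosting; this sketch swaps in simple cell-level linkage rates instead. The record fields (`age_group`, `linked`) and the function name `linkage_weights` are hypothetical.

```python
from collections import defaultdict


def linkage_weights(records, cell_key):
    """Return one weight per record: 1 / (cell linkage rate) if linked, else 0.

    records  : list of dicts with a boolean "linked" field (hypothetical schema)
    cell_key : function mapping a record to its weighting cell
    """
    totals = defaultdict(int)   # records per cell
    linked = defaultdict(int)   # linked records per cell
    for rec in records:
        cell = cell_key(rec)
        totals[cell] += 1
        if rec["linked"]:
            linked[cell] += 1

    weights = []
    for rec in records:
        cell = cell_key(rec)
        if rec["linked"]:
            # Inverse of the estimated probability of being linked in this cell.
            weights.append(totals[cell] / linked[cell])
        else:
            # Unlinked records drop out of the weighted analysis.
            weights.append(0.0)
    return weights


# Toy data: half of the "young" records link, all of the "old" records link.
records = [
    {"age_group": "young", "linked": True},
    {"age_group": "young", "linked": False},
    {"age_group": "old", "linked": True},
    {"age_group": "old", "linked": True},
]
weights = linkage_weights(records, lambda rec: rec["age_group"])
```

In this toy case the weights of the linked records sum to the total number of records, so the linked subset is reweighted to represent the full file within each cell. A model-based approach such as gradient boosting replaces the cell rates with propensities estimated from many covariates at once.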