CREAT: Census Research Exploration and Analysis Tool

Papers Containing Tag(s): 'Person Identification Validation System'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

Protected Identification Key - 42

Person Validation System - 39

Internal Revenue Service - 32

American Community Survey - 29

Social Security Number - 27

Social Security Administration - 24

Census Bureau Disclosure Review Board - 23

Current Population Survey - 22

Social Security - 18

Personally Identifiable Information - 17

Center for Administrative Records Research and Applications - 16

Census Numident - 14

2010 Census - 14

Disclosure Review Board - 12

Supplemental Nutrition Assistance Program - 11

Individual Taxpayer Identification Numbers - 11

Department of Housing and Urban Development - 10

Housing and Urban Development - 10

Office of Management and Budget - 10

Earned Income Tax Credit - 9

Decennial Census - 9

1940 Census - 9

Indian Housing Information Center - 8

Medicaid Services - 8

Some Other Race - 8

Ordinary Least Squares - 8

SSA Numident - 8

Master Address File - 7

Census Edited File - 7

Longitudinal Employer Household Dynamics - 7

Indian Health Service - 7

Centers for Medicare - 7

Temporary Assistance for Needy Families - 7

National Opinion Research Center - 7

Census Household Composition Key - 6

Survey of Income and Program Participation - 6

Service Annual Survey - 6

Adjusted Gross Income - 5

Data Management System - 5

Computer Assisted Personal Interview - 5

New York University - 5

National Bureau of Economic Research - 5

W-2 - 5

Administrative Records - 5

Center for Administrative Records Research - 5

Department of Education - 4

Census Bureau Master Address File - 4

Bureau of Labor Statistics - 4

Federal Statistical Research Data Center - 4

Journal of Economic Literature - 4

Detailed Earnings Records - 4

Disability Insurance - 4

Social and Economic Supplement - 4

ASEC - 4

Social Science Research Institute - 4

University of Chicago - 4

Federal Poverty Level - 4

Center for Economic Studies - 4

MAFID - 4

Department of Homeland Security - 4

Postal Service - 4

Cornell Institute for Social and Economic Research - 4

American Housing Survey - 4

Longitudinal Business Database - 4

Employer Identification Numbers - 4

Department of Health and Human Services - 4

PIKed - 4

Census 2000 - 4

Current Population Survey Annual Social and Economic Supplement - 3

National Institute on Aging - 3

Centers for Disease Control and Prevention - 3

Department of Labor - 3

Department of Justice - 3

National Center for Health Statistics - 3

Business Register - 3

Census Bureau Person Identification Validation System - 3

Viewing papers 1 through 10 of 45


  • Working Paper

    Peer Income Exposure Across the Income Distribution

    February 2025

    Working Paper Number: CES-25-16

    Children from families across the income distribution attend public schools, making schools and classrooms potential sites for interaction between more- and less-affluent children. However, limited information exists regarding the extent of economic integration in these contexts. We merge educational administrative data from Oregon with measures of family income derived from IRS records to document student exposure to economically diverse school and classroom peers. Our findings indicate that affluent children in public schools are relatively isolated from their less affluent peers, while low- and middle-income students experience relatively even peer income distributions. Students from families in the top percentile of the income distribution attend schools where 20 percent of their peers, on average, come from the top five income percentiles. A large majority of the differences in peer exposure that we observe arise from the sorting of students across schools; sorting across classrooms within schools plays a substantially smaller role.
  • Working Paper

    Potential Bias When Using Administrative Data to Measure the Family Income of School-Aged Children

    January 2025

    Working Paper Number: CES-25-03

    Researchers and practitioners increasingly rely on administrative data sources to measure family income. However, administrative data sources are often incomplete in their coverage of the population, giving rise to potential bias in family income measures, particularly if coverage deficiencies are not well understood. We focus on the school-aged child population, due to its particular import to research and policy, and because of the unique challenges of linking children to family income information. We find that two of the most significant administrative sources of family income information that permit linking of children and parents (IRS Form 1040 and SNAP participation records) usefully complement each other, potentially reducing coverage bias when used together. In a case study considering how best to measure economic disadvantage rates in the public school student population, we demonstrate the sensitivity of family income statistics to assumptions about individuals who do not appear in administrative data sources.
  • Working Paper

    The Census Historical Environmental Impacts Frame

    October 2024

    Working Paper Number: CES-24-66

    The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
  • Working Paper

    Comparison of Child Reporting in the American Community Survey and Federal Income Tax Returns Based on California Birth Records

    September 2024

    Authors: Gloria G. Aldana

    Working Paper Number: CES-24-55

    This paper takes advantage of administrative records from California, a state with a large child population and a significant historical undercount of children in Census Bureau data, dependent information in the Internal Revenue Service (IRS) Form 1040 records, and the American Community Survey to characterize undercounted children and compare child reporting. While IRS Form 1040 records offer potential utility for adjusting child undercounting in Census Bureau surveys, this analysis finds overlapping reporting issues among various demographic and economic groups. Specifically, older children, those of Non-Hispanic Black mothers and Hispanic mothers, children or parents with lower English proficiency, children whose mothers did not complete high school, and families with lower income-to-poverty ratio were less frequently reported in IRS 1040 records than other groups. Therefore, using IRS 1040 dependent records may have limitations for accurately representing populations with characteristics associated with the undercount of children in surveys.
  • Working Paper

    Household Wealth and Entrepreneurial Career Choices: Evidence from Climate Disasters

    July 2024

    Authors: Xiao Cen

    Working Paper Number: CES-24-39

    This study investigates how household wealth affects the human capital of startups, based on U.S. Census individual-level employment data, deed records, and geographic information system (GIS) data. Using floods as a wealth shock, a regression discontinuity analysis shows inundated residents are 7% less likely to work in startups relative to their neighbors outside the flood boundary, within a 0.1-mile-wide band. The effect is more pronounced for homeowners, consistent with the wealth effect. The career distortion leads to a significant long-run income loss, highlighting the importance of self-insurance for human capital allocation.
  • Working Paper

    Measuring Income of the Aged in Household Surveys: Evidence from Linked Administrative Records

    June 2024

    Working Paper Number: CES-24-32

    Research has shown that household survey estimates of retirement income (defined benefit pensions and defined contribution account withdrawals) suffer from substantial underreporting which biases downward measures of financial well-being among the aged. Using data from both the redesigned 2016 Current Population Survey Annual Social and Economic Supplement (CPS ASEC) and the Health and Retirement Study (HRS), each matched with administrative records, we examine to what extent underreporting of retirement income affects key statistics such as reliance on Social Security benefits and poverty among the aged. We find that underreporting of retirement income is still prevalent in the CPS ASEC. While the HRS does a better job than the CPS ASEC in terms of capturing retirement income, it still falls considerably short compared to administrative records. Consequently, the relative importance of Social Security income remains overstated in household surveys: 53 percent of elderly beneficiaries in the CPS ASEC and 49 percent in the HRS rely on Social Security for the majority of their incomes compared to 42 percent in the linked administrative data. The poverty rate for those aged 65 and over is also overstated: 8.8 percent in the CPS ASEC and 7.4 percent in the HRS compared to 6.4 percent in the linked administrative data. Our results illustrate the effects of using alternative data sources in producing key statistics from the Social Security Administration's Income of the Aged publication.
  • Working Paper

    Revisiting Methods to Assign Responses when Race and Hispanic Origin Reporting are Discrepant Across Administrative Records and Third Party Sources

    May 2024

    Authors: James M. Noon

    Working Paper Number: CES-24-26

    The Best Race and Ethnicity Administrative Records Composite file ("Best Race file") is a composite file which combines Census, federal, and Third Party Data (TPD) sources and applies business rules to assign race and ethnicity values to person records. The first version of the Best Race administrative records composite was constructed in 2015 and subsequently updated each year to include more recent vintages, when available, of the data sources originally included in the composite file. Where updates were available for data sources, the most recent information for persons was retained, and the business rules were reapplied to assign a single race and single Hispanic origin value to each person record. The majority of person records on the Best Race file have consistent race and ethnicity information across data sources. Where there are discrepancies in responses across data sources, we apply a series of business rules to assign a single race and ethnicity to each record. To improve the quality of the Best Race administrative records composite, we have begun revising the business rules which were developed several years ago. This paper discusses the original business rules as well as the implemented changes and their impact on the composite file.
  • Working Paper

    Mobility, Opportunity, and Volatility Statistics (MOVS): Infrastructure Files and Public Use Data

    April 2024

    Working Paper Number: CES-24-23

    Federal statistical agencies and policymakers have identified a need for integrated systems of household and personal income statistics. This interest marks a recognition that aggregated measures of income, such as GDP or average income growth, tell an incomplete story that may conceal large gaps in well-being between different types of individuals and families. Until recently, longitudinal income data that are rich enough to calculate detailed income statistics and include demographic characteristics, such as race and ethnicity, have not been available. The Mobility, Opportunity, and Volatility Statistics project (MOVS) fills this gap in comprehensive income statistics. Using linked demographic and tax records on the population of U.S. working-age adults, the MOVS project defines households and calculates household income, applying an equivalence scale to create a personal income concept, and then traces the progress of individuals' incomes over time. We then output a set of intermediate statistics by race-ethnicity group, sex, year, base-year state of residence, and base-year income decile. We select the intermediate statistics most useful in developing more complex intragenerational income mobility measures, such as transition matrices, income growth curves, and variance-based volatility statistics. We provide these intermediate statistics as part of a publicly released data tool with downloadable flat files and accompanying documentation. This paper describes the data build process and the output files, including a brief analysis highlighting the structure and content of our main statistics.
  • Working Paper

    The Long-Term Effects of Income for At-Risk Infants: Evidence from Supplemental Security Income

    March 2024

    Working Paper Number: CES-24-10

    This paper examines whether a generous cash intervention early in life can "undo" some of the long-term disadvantage associated with poor health at birth. We use new linkages between several large-scale administrative datasets to examine the short-, medium-, and long-term effects of providing low-income families with low birthweight infants support through the Supplemental Security Income (SSI) program. This program uses a birthweight cutoff at 1200 grams to determine eligibility. We find that families of infants born just below this cutoff experience a large increase in cash benefits totaling about 27% of family income in the first three years of the infant's life. These cash benefits persist at lower amounts through age 10. Eligible infants also experience a small but statistically significant increase in Medicaid enrollment during childhood. We examine whether this support affects health care use and mortality in infancy, educational performance in high school, post-secondary school attendance and college degree attainment, and earnings, public assistance use, and mortality in young adulthood for all infants born in California to low-income families whose birthweight puts them near the cutoff. We also examine whether these payments had spillover effects onto the older siblings of these infants who may have also benefited from the increase in family resources. Despite the comprehensive nature of this early life intervention, we detect no improvements in any of the study outcomes, nor do we find improvements among the older siblings of these infants. These null effects persist across several subgroups and alternative model specifications, and, for some outcomes, our estimates are precise enough to rule out published estimates of the effect of early life cash transfers in other settings.
  • Working Paper

    A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census: Full Technical Report

    December 2023

    Working Paper Number: CES-23-63R

    For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level: individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy.
    Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act. You are reading the full technical report. For the summary paper see https://doi.org/10.1162/99608f92.4a1ebf70.