CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'income data'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Viewing papers 1 through 10 of 15


  • Working Paper

    Potential Bias When Using Administrative Data to Measure the Family Income of School-Aged Children

    January 2025

    Working Paper Number: CES-25-03

    Researchers and practitioners increasingly rely on administrative data sources to measure family income. However, administrative data sources are often incomplete in their coverage of the population, giving rise to potential bias in family income measures, particularly if coverage deficiencies are not well understood. We focus on the school-aged child population, due to its particular import to research and policy, and because of the unique challenges of linking children to family income information. We find that two of the most significant administrative sources of family income information that permit linking of children and parents (IRS Form 1040 and SNAP participation records) usefully complement each other, potentially reducing coverage bias when used together. In a case study considering how best to measure economic disadvantage rates in the public school student population, we demonstrate the sensitivity of family income statistics to assumptions about individuals who do not appear in administrative data sources.
  • Working Paper

    Incorporating Administrative Data in Survey Weights for the 2018-2022 Survey of Income and Program Participation

    October 2024

    Working Paper Number: CES-24-58

    Response rates to the Survey of Income and Program Participation (SIPP) have declined over time, raising the potential for nonresponse bias in survey estimates. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we modify various parts of the SIPP weighting algorithm to incorporate such data. We create these new weights for the 2018 through 2022 SIPP panels and examine how the new weights affect survey estimates. Our results show that before weighting adjustments, SIPP respondents in these panels have higher socioeconomic status than the general population. Existing weighting procedures reduce many of these differences. Comparing SIPP estimates between the production weights and the administrative data-based weights yields changes that are not uniform across the joint income and program participation distribution. Unlike other Census Bureau household surveys, there is no large increase in nonresponse bias in SIPP due to the COVID-19 pandemic. In summary, the magnitude and sign of nonresponse bias in SIPP are complicated, and the existing weighting procedures may change the sign of nonresponse bias for households with certain incomes and program benefit statuses.
  • Working Paper

    Mobility, Opportunity, and Volatility Statistics (MOVS): Infrastructure Files and Public Use Data

    April 2024

    Working Paper Number: CES-24-23

    Federal statistical agencies and policymakers have identified a need for integrated systems of household and personal income statistics. This interest marks a recognition that aggregated measures of income, such as GDP or average income growth, tell an incomplete story that may conceal large gaps in well-being between different types of individuals and families. Until recently, longitudinal income data that are rich enough to calculate detailed income statistics and include demographic characteristics, such as race and ethnicity, have not been available. The Mobility, Opportunity, and Volatility Statistics project (MOVS) fills this gap in comprehensive income statistics. Using linked demographic and tax records on the population of U.S. working-age adults, the MOVS project defines households and calculates household income, applying an equivalence scale to create a personal income concept, and then traces the progress of individuals' incomes over time. We then output a set of intermediate statistics by race-ethnicity group, sex, year, base-year state of residence, and base-year income decile. We select the intermediate statistics most useful in developing more complex intragenerational income mobility measures, such as transition matrices, income growth curves, and variance-based volatility statistics. We provide these intermediate statistics as part of a publicly released data tool with downloadable flat files and accompanying documentation. This paper describes the data build process and the output files, including a brief analysis highlighting the structure and content of our main statistics.
  • Working Paper

    Incorporating Administrative Data in Survey Weights for the Basic Monthly Current Population Survey

    January 2024

    Working Paper Number: CES-24-02

    Response rates to the Current Population Survey (CPS) have declined over time, raising the potential for nonresponse bias in key population statistics. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we take two approaches. First, we use administrative data to build a non-parametric nonresponse adjustment step while leaving the calibration to population estimates unchanged. Second, we use administratively linked data in the calibration process, matching income data from the Internal Revenue Service and state agencies, demographic data from the Social Security Administration and the decennial census, and industry data from the Census Bureau's Business Register to both responding and nonresponding households. We use the matched data in the household nonresponse adjustment of the CPS weighting algorithm, which changes the weights of respondents to account for differential nonresponse rates among subpopulations. After running the experimental weighting algorithm, we compare estimates of the unemployment rate and labor force participation rate between the experimental weights and the production weights. Before March 2020, estimates of the labor force participation rate using the experimental weights are 0.2 percentage points higher than the original estimates, with minimal effect on the unemployment rate. After March 2020, the new labor force participation rates are similar, but the unemployment rate is about 0.2 percentage points higher in some months during the height of COVID-related interviewing restrictions. These results suggest that if there is any nonresponse bias present in the CPS, the magnitude is comparable to the typical margin of error of the unemployment rate estimate. Additionally, the results are overall similar across demographic groups and states, as well as when using alternative weighting methodology. Finally, we discuss how our estimates compare to those from earlier papers that calculate estimates of bias in key CPS labor force statistics. This paper is for research purposes only. No changes to production are being implemented at this time.
  • Working Paper

    Self-Employment Income Reporting on Surveys

    April 2023

    Working Paper Number: CES-23-19

    We examine the relation between administrative income data and survey reports for self-employed and wage-earning respondents from 2000 to 2015. The self-employed report 40 percent more wages and self-employment income in the survey than in tax administrative records; this estimate nets out differences between these two sources that are also shared by wage-earners. We provide evidence that differential reporting incentives are an important explanation of the larger self-employed gap by exploiting a well-known artifact: self-employed respondents exhibit substantial bunching at the first EITC kink in their administrative records. We do not observe the same behavior in their survey responses even after accounting for survey measurement concerns.
  • Working Paper

    National Experimental Wellbeing Statistics - Version 1

    February 2023

    Working Paper Number: CES-23-04

    This is the U.S. Census Bureau's first release of the National Experimental Wellbeing Statistics (NEWS) project. The NEWS project aims to produce the best possible estimates of income and poverty given all available survey and administrative data. We link survey, decennial census, administrative, and third-party data to address measurement error in income and poverty statistics. We estimate improved (pre-tax money) income and poverty statistics for 2018 by addressing several possible sources of bias documented in prior research. We address biases from 1) unit nonresponse through improved weights, 2) missing income information in both survey and administrative data through improved imputation, and 3) misreporting by combining or replacing survey responses with administrative information. Reducing survey error substantially affects key measures of well-being: We estimate median household income is 6.3 percent higher than in survey estimates, and poverty is 1.1 percentage points lower. These changes are driven by subpopulations for which survey error is particularly relevant. For householders aged 65 and over, median household income is 27.3 percent higher and poverty is 3.3 percentage points lower than in survey estimates. We do not find a significant impact on median household income for householders under 65 or on child poverty. Finally, we discuss plans for future releases: addressing other potential sources of bias, releasing additional years of statistics, extending the income concepts measured, and including smaller geographies such as state and county.
  • Working Paper

    Introducing the Medical Expenditure Panel Survey-Insurance Component with Administrative Records (MEPS-ICAR): Description, Data Construction Methodology, and Quality Assessment

    August 2022

    Working Paper Number: CES-22-29

    This report introduces a new dataset, the Medical Expenditure Panel Survey-Insurance Component with Administrative Records (MEPS-ICAR), consisting of MEPS-IC survey data on establishments and their health insurance benefits packages linked to Decennial Census data and administrative tax records on MEPS-IC establishments' workforces. These data include new measures of the characteristics of MEPS-IC establishments' parent firms, employee turnover, the full distribution of MEPS-IC workers' personal and family incomes, the geographic locations where those workers live, and improved workforce demographic detail. Next, this report details the methods used for producing the MEPS-ICAR. Broadly, the linking process begins by matching establishments' parent firms to their workforces using identifiers appearing in tax records. The linking process concludes by matching establishments to their own workforces by identifying the subset of their parent firm's workforce that best matches the expected size, total payroll, and residential geographic distribution of the establishment's workforce. Finally, this report presents statistics characterizing the match rate and the MEPS-ICAR data itself. Key results include that match rates are consistently high (exceeding 90%) across nearly all data subgroups and that the matched data exhibit a reasonable distribution of employment, payroll, and worker commute distances relative to expectations and external benchmarks. Notably, employment measures derived from tax records, but not used in the match itself, correspond with high fidelity to the employment levels that establishments report in the MEPS-IC. Cumulatively, the construction of the MEPS-ICAR significantly expands the capabilities of the MEPS-IC and presents many opportunities for analysts.
  • Working Paper

    Investigating the Use of Administrative Records in the Consumer Expenditure Survey

    March 2018

    Working Paper Number: carra-2018-01

    In this paper, we investigate the potential of applying administrative records income data to the Consumer Expenditure (CE) survey to inform measurement error properties of CE estimates, supplement respondent-collected data, and estimate the representativeness of the CE survey by income level. We match individual responses to Consumer Expenditure Quarterly Interview Survey data collected from July 2013 through December 2014 to IRS administrative data in order to analyze CE questions on wages, social security payroll deductions, self-employment income receipt and retirement income. We find that while wage amounts are largely in alignment between the CE and administrative records in the middle of the wage distribution, there is evidence that wages are over-reported to the CE at the bottom of the wage distribution and under-reported at the top of the wage distribution. We find mixed evidence for alignment between the CE and administrative records on questions covering payroll deductions and self-employment income receipt, but find substantial divergence between CE responses and administrative records when examining retirement income. In addition to the analysis using person-based linkages, we also match responding and non-responding CE sample units to the universe of IRS 1040 tax returns by address to examine non-response bias. We find that non-responding households are substantially richer than responding households, and that very high income households are less likely to respond to the CE.
  • Working Paper

    Capturing More Than Poverty: School Free and Reduced-Price Lunch Data and Household Income

    December 2017

    Working Paper Number: carra-2017-09

    Educational researchers often use National School Lunch Program (NSLP) data as a proxy for student poverty. Under NSLP policy, students whose household income is less than 130 percent of the poverty line qualify for free lunch and students whose household income is between 130 percent and 185 percent of the poverty line qualify for reduced-price lunch. Linking school administrative records for all 8th graders in a California public school district to household-level IRS income tax data, we examine how well NSLP data capture student disadvantage. We find both that there is substantial disadvantage in household income not captured by NSLP category data, and that NSLP categories capture disadvantage on test scores above and beyond household income.
  • Working Paper

    Measuring Inequality Using Censored Data: A Multiple Imputation Approach

    April 2009

    Working Paper Number: CES-09-05

    To measure income inequality with right-censored (topcoded) data, we propose multiple imputation for censored observations using draws from Generalized Beta of the Second Kind distributions to provide partially synthetic datasets analyzed using complete data methods. Estimation and inference use Reiter's (Survey Methodology 2003) formulae. Using Current Population Survey (CPS) internal data, we find few statistically significant differences in income inequality for pairs of years between 1995 and 2004. We also show that using CPS public use data with cell mean imputations may lead to incorrect inferences about inequality differences. Multiply-imputed public use data provide an intermediate solution.