Potential Bias When Using Administrative Data to Measure the Family Income of School-Aged Children
January 2025
Working Paper Number:
CES-25-03
Abstract
Document Tags and Keywords
Keywords
Keywords are automatically generated using KeyBERT, a keyword extraction tool that uses BERT embeddings to select contextually relevant keywords.
By analyzing the content of working papers, KeyBERT identifies terms and phrases that capture the essence of the
text, highlighting the most significant topics and trends. This approach not only enhances searchability but
also provides connections that go beyond potentially domain-specific author-defined keywords.
data,
survey,
disclosure,
respondent,
information,
disadvantaged,
income individuals,
population,
tax,
enrollment,
poverty,
census bureau,
irs,
coverage,
filing,
parent,
dependent,
family,
family income,
household income,
taxpayer,
income data,
income households,
enrolled,
income children,
1040
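The embedding-based ranking described above can be sketched as follows. This is a minimal illustration, not the site's pipeline: KeyBERT uses BERT embeddings, whereas the word-count vectors here are a stand-in assumption so the sketch runs with only the standard library. The names `STOP_WORDS`, `tokenize`, `cosine`, and `extract_keywords` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for the illustration.
STOP_WORDS = {"the", "of", "to", "and", "a", "in", "is", "for", "on", "with", "from"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def cosine(c1, c2):
    """Cosine similarity between two word-count vectors stored as Counters."""
    dot = sum(c1[t] * c2[t] for t in c1)  # a Counter returns 0 for missing keys
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def extract_keywords(text, max_ngram=2, top_n=5):
    """Rank candidate n-grams by similarity to the whole-document vector.

    Word counts stand in for BERT embeddings (assumption). N-grams are built
    after stop-word removal, so a candidate may bridge a removed word.
    """
    tokens = tokenize(text)
    doc_vec = Counter(tokens)
    candidates = set()
    for n in range(1, max_ngram + 1):
        for i in range(len(tokens) - n + 1):
            candidates.add(" ".join(tokens[i:i + n]))
    scored = [(c, cosine(Counter(c.split()), doc_vec)) for c in candidates]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]
```

Calling `extract_keywords(abstract_text, top_n=10)` returns `(phrase, score)` pairs. With real embeddings, the same ranking step would favor phrases that are semantically close to the document rather than merely frequent in it.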
Tags
Tags are automatically generated using a pretrained language model from spaCy, which performs
several tasks, including entity tagging.
The model labels words and phrases by entity type,
including "organizations." By filtering for frequent words and phrases labeled as "organizations", papers are
identified as containing references to specific institutions, datasets, and other organizations.
Internal Revenue Service,
Department of Education,
American Community Survey,
Social Security Number,
Protected Identification Key,
Earned Income Tax Credit,
Census Bureau Disclosure Review Board,
Person Validation System,
Supplemental Nutrition Assistance Program,
Person Identification Validation System,
Personally Identifiable Information
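The frequency filter described for tags can be sketched as below. This assumes entity tagging has already produced `(text, label)` pairs; the label name `"ORG"` follows spaCy's convention for organizations, while the function name `frequent_organizations` and the `min_count` threshold are hypothetical choices for illustration.

```python
from collections import Counter


def frequent_organizations(entities, min_count=2):
    """Return organization names that occur at least `min_count` times.

    `entities` is an iterable of (text, label) pairs, assumed to come from
    an upstream entity tagger. Results are ordered from most to least frequent.
    """
    counts = Counter(text for text, label in entities if label == "ORG")
    return [name for name, n in counts.most_common() if n >= min_count]


# Hypothetical tagger output for a short passage.
example_entities = [
    ("Internal Revenue Service", "ORG"),
    ("Oregon", "GPE"),
    ("Internal Revenue Service", "ORG"),
    ("Department of Education", "ORG"),
    ("Census Bureau", "ORG"),
    ("Census Bureau", "ORG"),
    ("Census Bureau", "ORG"),
]
print(frequent_organizations(example_entities))
```

A threshold of this kind keeps recurring institutions and datasets while dropping one-off mentions. It also explains why non-institutional phrases such as "Social Security Number" can appear as tags when the tagger labels them as organizations.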
Similar Working Papers
Similarity between working papers is determined by an unsupervised neural
network model known as Doc2Vec.
Doc2Vec is a model that represents entire documents as fixed-length vectors, allowing for the
capture of semantic meaning in a way that relates to the context of words within the document. The model learns to
associate a unique vector with each document while simultaneously learning word vectors, enabling tasks such as
document classification, clustering, and similarity detection by preserving the order and structure of words. The
document vectors are compared using cosine similarity/distance to determine the most similar working papers.
Papers identified with 🔥 are in the top 20% of similarity.
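The comparison step can be sketched as follows, assuming document vectors have already been learned. The toy two-dimensional vectors, the function names, and the reading of "top 20%" as the highest-scoring fifth of the candidate papers are all assumptions made for illustration; the site does not specify how the threshold is computed.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(target_id, vectors, top_n=10):
    """Return up to `top_n` (paper_id, score) pairs, most similar first."""
    target = vectors[target_id]
    scored = [
        (pid, cosine_similarity(target, vec))
        for pid, vec in vectors.items()
        if pid != target_id
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


def flag_top_fraction(scored, fraction=0.2):
    """Return the ids whose scores fall in the top `fraction` of `scored`."""
    if not scored:
        return set()
    k = max(1, math.ceil(len(scored) * fraction))
    ordered = sorted(scored, key=lambda item: item[1], reverse=True)
    return {pid for pid, _ in ordered[:k]}


# Hypothetical document vectors; real Doc2Vec vectors have many more dimensions.
toy_vectors = {
    "A": [1.0, 0.0],
    "B": [0.9, 0.1],
    "C": [0.0, 1.0],
    "D": [-1.0, 0.0],
    "E": [0.5, 0.5],
    "F": [0.7, 0.3],
}
ranking = rank_similar("A", toy_vectors)
print(ranking)
print(flag_top_fraction(ranking))
```

Cosine similarity depends only on the angle between vectors, so two papers of very different length can still score as close if their vectors point in a similar direction.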
The 10 most similar working papers to the working paper 'Potential Bias When Using Administrative Data to Measure the Family Income of School-Aged Children' are listed below in order of similarity.
-
Working Paper: Measuring School Economic Disadvantage 🔥
November 2022
Working Paper Number:
CES-22-50R
Many educational policies hinge on the valid measurement of student economic disadvantage at the school level. Measures based on free and reduced-price lunch enrollment are used widely. However, recent research raises questions about their reliability, particularly following the introduction of universal free lunch in certain schools and districts. Using unique data linking the universe of students in Oregon public schools to IRS tax records and other data housed at the U.S. Census Bureau, we provide the first examination of how well different measures capture school economic disadvantage. We find that, in Oregon, direct certification provides the best widely-available measure, both over time and across the distribution of school economic disadvantage. By contrast, neighborhood-based measures consistently perform relatively poorly.
-
Working Paper: Capturing More Than Poverty: School Free and Reduced-Price Lunch Data and Household Income 🔥
December 2017
Working Paper Number:
carra-2017-09
Educational researchers often use National School Lunch Program (NSLP) data as a proxy for student poverty. Under NSLP policy, students whose household income is less than 130 percent of the poverty line qualify for free lunch and students whose household income is between 130 percent and 185 percent of the poverty line qualify for reduced-price lunch. Linking school administrative records for all 8th graders in a California public school district to household-level IRS income tax data, we examine how well NSLP data capture student disadvantage. We find both that there is substantial disadvantage in household income not captured by NSLP category data, and that NSLP categories capture disadvantage on test scores above and beyond household income.
-
Working Paper: Peer Income Exposure Across the Income Distribution 🔥
February 2025
Working Paper Number:
CES-25-16
Children from families across the income distribution attend public schools, making schools and classrooms potential sites for interaction between more- and less-affluent children. However, limited information exists regarding the extent of economic integration in these contexts. We merge educational administrative data from Oregon with measures of family income derived from IRS records to document student exposure to economically diverse school and classroom peers. Our findings indicate that affluent children in public schools are relatively isolated from their less affluent peers, while low- and middle-income students experience relatively even peer income distributions. Students from families in the top percentile of the income distribution attend schools where 20 percent of their peers, on average, come from the top five income percentiles. A large majority of the differences in peer exposure that we observe arise from the sorting of students across schools; sorting across classrooms within schools plays a substantially smaller role.
-
Working Paper: There is Such Thing as a Free Lunch: School Meals, Stigma, and Student Discipline 🔥
July 2022
Working Paper Number:
CES-22-23R
The Community Eligibility Provision (CEP) allows high-poverty schools to offer free meals to all students regardless of household income. Conceptualizing universal meal provision as a strategy to alleviate stigma associated with school meals, we hypothesize that CEP implementation reduces the incidence of suspensions, particularly for students from low-income backgrounds and minoritized students. We link educational records for students enrolled in Oregon public schools between 2010 and 2017 with administrative data describing their families' household income and social safety net program participation. Difference-in-differences analyses indicate that CEP has protective effects on the probability of suspension for students in participating schools, particularly for students from low-income families, students who received free or reduced-price meals prior to CEP implementation, and Hispanic students.
-
Working Paper: School Discipline and Racial Disparities in Early Adulthood 🔥
June 2021
Working Paper Number:
CES-21-14
Despite interest in the role of school discipline in the creation of racial inequality, previous research has been unable to identify how students who receive suspensions in school differ from unsuspended classmates on key young adult outcomes. We utilize novel data to document the links between high school discipline and important young adult outcomes related to criminal justice contact, social safety net program participation, post-secondary education, and the labor market. We show that the link between school discipline and young adult outcomes tends to be stronger for Black students than for White students, and that inequality in exposure to school discipline accounts for approximately 30 percent of the Black-White disparities in young adult criminal justice outcomes and SNAP receipt.
-
Working Paper: School-Based Disability Identification Varies by Student Family Income
December 2025
Working Paper Number:
CES-25-74
Currently, 18 percent of K-12 students in the United States receive additional supports through the identification of a disability. Socioeconomic status is viewed as central to understanding who gets identified as having a disability, yet limited large-scale evidence examines how disability identification varies for students from different income backgrounds. Using unique data linking information on Oregon students and their family income, we document pronounced income-based differences in how students are categorized for two school-based disability supports: special education services and Section 504 plans. We find that a quarter of students in the lowest income percentile receive supports through special education, compared with less than seven percent of students in the top income percentile. This pattern may partially reflect differences in underlying disability-related needs caused by poverty. However, we find the opposite pattern for 504 plans, where students in the top income percentiles are two times more likely to receive 504 plan supports. We further document substantial variation in these income-based differences by disability category, by race/ethnicity, and by grade level. Together, these patterns suggest that disability-related needs alone cannot account for the income-based differences that we observe and highlight the complex ways that income shapes the school and family processes that lead to variability in disability classification and services.
-
Working Paper: The Measurement of Medicaid Coverage in the SIPP: Evidence from California, 1990-1996
September 2002
Working Paper Number:
CES-02-21
This paper studies the accuracy of reported Medicaid coverage in the Survey of Income and Program Participation (SIPP) using a unique data set formed by matching SIPP survey responses to administrative records from the State of California. Overall, we estimate that the SIPP underestimates Medicaid coverage in the California population by about 10 percent. Among SIPP respondents who can be matched to administrative records, we estimate that the probability someone reports Medicaid coverage in a month when they are actually covered is around 85 percent. The corresponding probability for low-income children is even higher, at least 90 percent. These estimates suggest that the SIPP provides reasonably accurate coverage reports for those who are actually in the Medicaid system. On the other hand, our estimate of the false positive rate (the rate of reported coverage for those who are not covered in the administrative records) is relatively high: 2.5 percent for the sample as a whole, and up to 20 percent for poor children. Some of this is due to errors in the recording of Social Security numbers in the administrative system, rather than to problems in the SIPP.
-
Working Paper: Gifted Identification Across the Distribution of Family Income
December 2025
Working Paper Number:
CES-25-73
Currently, 6.1 percent of K-12 students in the United States receive gifted education. Using education and IRS data that provide information on students and their family income, we show pronounced differences in who schools identify as gifted across the distribution of family income. Under 4 percent of students in the lowest income percentile are identified as gifted, compared with 20 percent of those in the top income percentile. Income-based differences persist after accounting for student test scores and exist across students of different sexes and racial/ethnic groups, underscoring the importance of family resources for gifted identification in schools.
-
Working Paper: Where Are Your Parents? Exploring Potential Bias in Administrative Records on Children
March 2024
Working Paper Number:
CES-24-18
This paper examines potential bias in the Census Household Composition Key's (CHCK) probabilistic parent-child linkages. By linking CHCK data to the American Community Survey (ACS), we reveal disparities in parent-child linkages among specific demographic groups and find that characteristics of children that can and cannot be linked to the CHCK vary considerably from the larger population. In particular, we find that children from low-income, less educated households and of Hispanic origin are less likely to be linked to a mother or a father in the CHCK. We also highlight some data considerations when using the CHCK.
-
Working Paper: The Antipoverty Impact of the EITC: New Estimates from Survey and Administrative Tax Records
April 2019
Working Paper Number:
CES-19-14R
We reassess the antipoverty effects of the EITC using unique data linking the CPS Annual Social and Economic Supplement to IRS data for the same individuals spanning years 2005-2016. We compare EITC benefits from standard simulators to administrative EITC payments and find that significantly more actual EITC payments flow to childless tax units than predicted, and to those whose family income places them above official poverty thresholds. However, actual EITC payments appear to be target efficient at the tax unit level. In 2016, about 3.1 million persons were lifted out of poverty by the EITC, substantially less than prior estimates.