CREAT: Census Research Exploration and Analysis Tool

A METHOD OF CORRECTING FOR MISREPORTING APPLIED TO THE FOOD STAMP PROGRAM

May 2013

Written by: Nikolas Mittag

Working Paper Number:

CES-13-28

Abstract

Survey misreporting is known to be pervasive and bias common statistical analyses. In this paper, I first use administrative data on SNAP receipt and amounts linked to American Community Survey data from New York State to show that survey data can misrepresent the program in important ways. For example, more than 1.4 billion dollars received are not reported in New York State alone. 46 percent of dollars received by house- holds with annual income above the poverty line are not reported in the survey data, while only 19 percent are missing below the poverty line. Standard corrections for measurement error cannot remove these biases. I then develop a method to obtain consistent estimates by combining parameter estimates from the linked data with publicly available data. This conditional density method recovers the correct estimates using public use data only, which solves the problem that access to linked administrative data is usually restricted. I examine the degree to which this approach can be used to extrapolate across time and geography, in order to solve the problem that validation data is often based on a convenience sample. I present evidence from within New York State that the extent of heterogeneity is small enough to make extrapolation work well across both time and geography. Extrapolation to the entire U.S. yields substantive differences to survey data and reduces deviations from official aggregates by a factor of 4 to 9 compared to survey aggregates.

Document Tags and Keywords

Keywords Keywords are automatically generated using KeyBERT, a powerful and innovative keyword extraction tool that utilizes BERT embeddings to ensure high-quality and contextually relevant keywords.

By analyzing the content of working papers, KeyBERT identifies terms and phrases that capture the essence of the text, highlighting the most significant topics and trends. This approach not only enhances searchability but provides connections that go beyond potentially domain-specific author-defined keywords.
:
estimation, econometric, estimating, data, survey data, survey, respondent, estimator, recession, unobserved, poverty, income survey, survey income, assessing, assessed

Tags Tags are automatically generated using a pretrained language model from spaCy, which excels at several tasks, including entity tagging.

The model is able to label words and phrases by part-of-speech, including "organizations." By filtering for frequent words and phrases labeled as "organizations", papers are identified to contain references to specific institutions, datasets, and other organizations.
:
Metropolitan Statistical Area, Bureau of Economic Analysis, University of Chicago, Current Population Survey, Department of Agriculture, Survey of Income and Program Participation, American Community Survey, Protected Identification Key, National Opinion Research Center, Public Use Micro Sample, Disclosure Review Board

Similar Working Papers Similarity between working papers are determined by an unsupervised neural network model know as Doc2Vec.

Doc2Vec is a model that represents entire documents as fixed-length vectors, allowing for the capture of semantic meaning in a way that relates to the context of words within the document. The model learns to associate a unique vector with each document while simultaneously learning word vectors, enabling tasks such as document classification, clustering, and similarity detection by preserving the order and structure of words. The document vectors are compared using cosine similarity/distance to determine the most similar working papers. Papers identified with 🔥 are in the top 20% of similarity.

The 10 most similar working papers to the working paper 'A METHOD OF CORRECTING FOR MISREPORTING APPLIED TO THE FOOD STAMP PROGRAM' are listed below in order of similarity.