CREAT: Census Research Exploration and Analysis Tool

Coverage and Agreement of Administrative Records and 2010 American Community Survey Demographic Data

November 2014

Working Paper Number: carra-2014-14

Abstract

The U.S. Census Bureau is researching possible uses of administrative records in decennial census and survey operations. The 2010 Census Match Study and American Community Survey (ACS) Match Study represent recent efforts by the Census Bureau to evaluate the extent to which administrative records provide data on persons and addresses in the 2010 Census and 2010 ACS. The 2010 Census Match Study also examines demographic response data collected in administrative records. Building on this analysis, we match data from the 2010 ACS to federal administrative records and third party data as well as to previous census data and examine administrative records coverage and agreement of ACS age, sex, race, and Hispanic origin responses. We find high levels of coverage and agreement for sex and age responses and variable coverage and agreement across race and Hispanic origin groups. These results are similar to findings from the 2010 Census Match Study.

Document Tags and Keywords

Keywords are automatically generated using KeyBERT, a keyword extraction tool that uses BERT embeddings to identify contextually relevant keywords.

By analyzing the content of working papers, KeyBERT identifies terms and phrases that capture the essence of the text, highlighting the most significant topics and trends. This approach not only enhances searchability but also provides connections that go beyond potentially domain-specific author-defined keywords.

Keywords: data, data census, survey, agency, respondent, ethnicity, ethnic, hispanic, surveys censuses, department, record, federal, matching, population, race, enrollment, census bureau, census file, records census, coverage, ssa, censuses surveys, census survey, datasets, assessing, 2010 census, census records, census responses, census 2020

Tags are automatically generated using a pretrained language model from spaCy, which handles several tasks, including entity tagging.

The model labels words and phrases by part of speech and by named-entity type, including "organizations." Filtering for frequent words and phrases labeled as "organizations" identifies the specific institutions, datasets, and other organizations that a paper references.

Tags: Social Security Administration, Administrative Records, Office of Management and Budget, Housing and Urban Development, Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews, Social Security, Department of Housing and Urban Development, American Community Survey, Social Security Number, Protected Identification Key, Computer Assisted Personal Interview, Medicaid Services, Census 2000, Temporary Assistance for Needy Families, 2010 Census, Indian Health Service, Person Validation System, Indian Housing Information Center, Personally Identifiable Information, Some Other Race
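
The frequency filter described for tags can be illustrated with a minimal sketch. It assumes the entity tagger has already produced (text, label) pairs and that organizations carry the label "ORG", which is the label spaCy's pretrained English models use. The function name `frequent_organizations` and the `min_count` threshold are hypothetical; the site does not state its actual cutoff.

```python
from collections import Counter


def frequent_organizations(entities, min_count=2):
    """Return organization names seen at least `min_count` times.

    `entities` is an iterable of (text, label) pairs, such as the output
    of a named-entity tagger. The threshold is an assumption made for
    illustration, not a documented setting.
    """
    # Count only spans the tagger labeled as organizations.
    counts = Counter(text for text, label in entities if label == "ORG")
    # most_common() orders by descending count.
    return [text for text, n in counts.most_common() if n >= min_count]


# Hypothetical tagger output for one paper.
sample_entities = [
    ("Social Security Administration", "ORG"),
    ("2010", "DATE"),
    ("Census Bureau", "ORG"),
    ("Social Security Administration", "ORG"),
    ("Census Bureau", "ORG"),
    ("Social Security Administration", "ORG"),
    ("Indian Health Service", "ORG"),
]
tags = frequent_organizations(sample_entities)
```

Here `tags` holds only the organizations mentioned at least twice, most frequent first. A filter like this may also explain why some tags above are not organizations in the usual sense (for example "Some Other Race"): it keeps whatever the tagger labeled as an organization.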

Similar Working Papers

Similarity between working papers is determined by an unsupervised neural network model known as Doc2Vec.

Doc2Vec is a model that represents entire documents as fixed-length vectors, allowing for the capture of semantic meaning in a way that relates to the context of words within the document. The model learns to associate a unique vector with each document while simultaneously learning word vectors, enabling tasks such as document classification, clustering, and similarity detection by preserving the order and structure of words. The document vectors are compared using cosine similarity/distance to determine the most similar working papers. Papers identified with 🔥 are in the top 20% of similarity.
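
The comparison step can be sketched in a few lines of Python. This is an illustration only: it assumes the document vectors have already been learned, and the helper names (`cosine_similarity`, `rank_similar`) and the reading of "top 20%" as the top fifth of all compared papers are assumptions, not details given by the site.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def rank_similar(target, candidates, top_n=10, flag_fraction=0.2):
    """Return the top_n (name, score, flagged) tuples, most similar first.

    `candidates` maps paper names to document vectors. `flagged` marks
    papers whose score falls in the top `flag_fraction` of all candidates
    (an assumed interpretation of the "top 20%" marker).
    """
    scored = sorted(
        ((name, cosine_similarity(target, vec)) for name, vec in candidates.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    n_flagged = math.ceil(len(scored) * flag_fraction)
    return [
        (name, score, rank < n_flagged)
        for rank, (name, score) in enumerate(scored[:top_n])
    ]


# Toy two-dimensional vectors standing in for learned document vectors.
papers = {
    "a": [1.0, 0.0],
    "b": [1.0, 1.0],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
    "e": [2.0, 1.0],
}
ranking = rank_similar([1.0, 0.0], papers, top_n=3)
```

Cosine similarity depends only on the angle between vectors, so a long paper and a short paper with similar content can still score as close neighbors.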

The 10 most similar working papers to the working paper 'Coverage and Agreement of Administrative Records and 2010 American Community Survey Demographic Data' are listed below in order of similarity.