CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'census research'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

National Science Foundation - 20

Research Data Center - 17

American Community Survey - 14

Social Security Administration - 12

Chicago Census Research Data Center - 12

Cornell University - 11

Current Population Survey - 11

Special Sworn Status - 10

Internal Revenue Service - 10

Center for Economic Studies - 8

Social Security Number - 8

Protected Identification Key - 8

Service Annual Survey - 7

Disclosure Review Board - 7

Survey of Income and Program Participation - 7

Decennial Census - 7

Bureau of Labor Statistics - 7

Longitudinal Employer Household Dynamics - 6

Social Security - 6

Census Bureau Disclosure Review Board - 5

Federal Statistical Research Data Center - 5

Alfred P Sloan Foundation - 5

Master Address File - 5

National Bureau of Economic Research - 5

Longitudinal Business Database - 5

American Economic Association - 5

Standard Industrial Classification - 5

2010 Census - 5

Metropolitan Statistical Area - 5

American Statistical Association - 5

Ordinary Least Squares - 4

Employer Identification Numbers - 4

Quarterly Workforce Indicators - 4

Quarterly Census of Employment and Wages - 4

University of Michigan - 4

Business Register - 4

Standard Statistical Establishment List - 4

North American Industry Classification System - 4

Housing and Urban Development - 4

Administrative Records - 4

Person Validation System - 4

1940 Census - 4

Minnesota Population Center - 4

National Institutes of Health - 3

Geographic Information Systems - 3

Longitudinal Research Database - 3

Individual Characteristics File - 3

Core Based Statistical Area - 3

Business Register Bridge - 3

SSA Numident - 3

Department of Housing and Urban Development - 3

Indian Health Service - 3

Bureau of Economic Analysis - 3

Census 2000 - 3

Person Identification Validation System - 3

Center for Administrative Records Research and Applications - 3

Personally Identifiable Information - 3

National Opinion Research Center - 3

PIKed - 3

Yale University - 3

Viewing papers 1 through 10 of 36


  • Working Paper

    Using Small-Area Estimation (SAE) to Estimate Prevalence of Child Health Outcomes at the Census Regional-, State-, and County-Levels

    November 2022

    Working Paper Number: CES-22-48

    In this study, we implement small-area estimation to assess the prevalence of child health outcomes at the county, state, and regional levels, using national survey data.
  • Working Paper

    Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning

    November 2021

    Working Paper Number: CES-21-35

    This paper considers the problem of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across establishments is highly skewed. To address these difficulties, this paper develops a probabilistic record linkage methodology that combines machine learning (ML) with multiple imputation (MI). This ML-MI methodology is applied to link survey respondents in the Health and Retirement Study to their workplaces in the Census Business Register. The linked data reveal new evidence that non-sampling errors in household survey data are correlated with respondents' workplace characteristics.
  • Working Paper

    Validating Abstract Representations of Spatial Population Data while considering Disclosure Avoidance

    February 2020

    Authors: James Gaboardi

    Working Paper Number: CES-20-05

    This paper furthers a research agenda for modeling populations along spatial networks and expands upon an empirical analysis to a full U.S. county (Gaboardi, 2019, Ch. 1,2). Specific foci are the necessity of, and methods for, validating and benchmarking spatial data when conducting social science research with aggregated and ambiguous population representations. In order to promote the validation of publicly-available data, access to highly-restricted census microdata was requested, and granted, in order to determine the levels of accuracy and error associated with a network-based population modeling framework. Primary findings reinforce the utility of a novel network allocation method, populated polygons to networks (pp2n), in terms of accuracy, computational complexity, and real runtime (Gaboardi, 2019, Ch. 2). Also, a pseudo-benchmark dataset's performance against the true census microdata shows promise in modeling populations along networks.
  • Working Paper

    Optimal Probabilistic Record Linkage: Best Practice for Linking Employers in Survey and Administrative Data

    March 2019

    Working Paper Number: CES-19-08

    This paper illustrates an application of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across firms is highly asymmetric. To address these difficulties, this paper uses a supervised machine learning model to probabilistically link survey respondents in the Health and Retirement Study (HRS) with employers and establishments in the Census Business Register (BR) to create a new data source which we call the CenHRS. Multiple imputation is used to propagate uncertainty from the linkage step into subsequent analyses of the linked data. The linked data reveal new evidence that survey respondents' misreporting and selective nonresponse about employer characteristics are systematically correlated with wages.
  • Working Paper

    Disclosure Avoidance Techniques Used for the 1970 through 2010 Decennial Censuses of Population and Housing

    November 2018

    Authors: Laura McKenna

    Working Paper Number: CES-18-47

    The U.S. Census Bureau conducts the decennial censuses under Title 13 of the U.S. Code with the Section 9 mandate to not 'use the information furnished under the provisions of this title for any purpose other than the statistical purposes for which it is supplied; or make any publication whereby the data furnished by any particular establishment or individual under this title can be identified; or permit anyone other than the sworn officers and employees of the Department or bureau or agency thereof to examine the individual reports (13 U.S.C. § 9 (2007)).' The Census Bureau applies disclosure avoidance techniques to its publicly released statistical products in order to protect the confidentiality of its respondents and their data.
  • Working Paper

    LEHD Infrastructure S2014 files in the FSRDC

    September 2018

    Authors: Lars Vilhuber

    Working Paper Number: CES-18-27R

    The Longitudinal Employer-Household Dynamics (LEHD) Program at the U.S. Census Bureau, with the support of several national research agencies, maintains a set of infrastructure files using administrative data provided by state agencies, enhanced with information from other administrative data sources, demographic and economic (business) surveys and censuses. The LEHD Infrastructure Files provide a detailed and comprehensive picture of workers, employers, and their interaction in the U.S. economy. This document describes the structure and content of the 2014 Snapshot of the LEHD Infrastructure files as they are made available in the Census Bureau's secure and restricted-access Research Data Center network. The document attempts to provide a comprehensive description of all researcher-accessible files, of their creation, and of any modifications made to the files to facilitate researcher access.
  • Working Paper

    The Use of Administrative Records and the American Community Survey to Study the Characteristics of Undercounted Young Children in the 2010 Census

    May 2018

    Working Paper Number: carra-2018-05

    Children under age five are historically one of the most difficult segments of the population to enumerate in the U.S. decennial census. The persistent undercount of young children is highest among Hispanics and racial minorities. In this study, we link 2010 Census data to administrative records from government and third-party data sources, such as Medicaid enrollment data and tenant rental assistance program records from the Department of Housing and Urban Development, to identify differences between children reported and not reported in the 2010 Census. In addition, we link children in administrative records to the American Community Survey to identify various characteristics of households with children under age five who may have been missed in the last census. This research contributes to what is known about the demographic, socioeconomic, and household characteristics of young children undercounted by the census. Our research also informs the potential benefits of using administrative records and surveys to supplement the U.S. Census Bureau child population enumeration efforts in future decennial censuses.
  • Working Paper

    Who are the people in my neighborhood? The 'contextual fallacy' of measuring individual context with census geographies

    February 2018

    Working Paper Number: CES-18-11

    Scholars deploy census-based measures of neighborhood context throughout the social sciences and epidemiology. Decades of research confirm that variation in how individuals are aggregated into geographic units to create variables that control for social, economic or political contexts can dramatically alter analyses. While most researchers are aware of the problem, they have lacked the tools to determine its magnitude in the literature and in their own projects. By using confidential access to the complete 2010 U.S. Decennial Census, we are able to construct, for all persons in the US, individual-specific contexts, which we group according to the Census-assigned block, block group, and tract. We compare these individual-specific measures to the published statistics at each scale, and we then determine the magnitude of variation in context for an individual with respect to the published measures using a simple statistic, the standard deviation of individual context (SDIC). For three key measures (percent Black, percent Hispanic, and Entropy, a measure of ethno-racial diversity), we find that block-level Census statistics frequently do not capture the actual context of individuals within them. More problematic, we uncover systematic spatial patterns in the contextual variables at all three scales. Finally, we show that within-unit variation is greater in some parts of the country than in others. We publish county-level estimates of the SDIC statistics that enable scholars to assess whether mis-specification in context variables is likely to alter analytic findings when measured at any of the three common Census units.
  • Working Paper

    Has Falling Crime Invited Gentrification?

    January 2017

    Working Paper Number: CES-17-27

    Over the past two decades, crime has fallen dramatically in cities in the United States. We explore whether, in the face of falling central city crime rates, households with more resources and options were more likely to move into central cities overall and more particularly into low income and/or majority minority central city neighborhoods. We use confidential, geocoded versions of the 1990 and 2000 Decennial Census and the 2010, 2011, and 2012 American Community Survey to track moves to different neighborhoods in 244 Core Based Statistical Areas (CBSAs) and their largest central cities. Our dataset includes over four million household moves across the three time periods. We focus on three household types typically considered gentrifiers: high-income, college-educated, and white households. We find that declines in city crime are associated with increases in the probability that high-income and college-educated households choose to move into central city neighborhoods, including low-income and majority minority central city neighborhoods. Moreover, we find little evidence that households with lower incomes and without college degrees are more likely to move to cities when violent crime falls. These results hold during the 1990s as well as the 2000s and for the 100 largest metropolitan areas, where crime declines were greatest. There is weaker evidence that white households are disproportionately drawn to cities as crime falls in the 100 largest metropolitan areas from 2000 to 2010.
  • Working Paper

    Disconnected Geography: A Spatial Analysis of Disconnected Youth in the United States

    January 2016

    Working Paper Number: CES-16-37

    Since the Great Recession, US policy and advocacy groups have sought to better understand its effect on a group of especially vulnerable young adults who are not enrolled in school or training programs and not participating in the labor market, so-called 'disconnected youth.' This article distinguishes between disconnected youth and unemployed youth and examines the spatial clustering of these two groups across counties in the US. The focus is to ascertain whether there are differences in underlying contextual factors among groups of counties that are mutually exclusive and spatially disparate (non-adjacent), comprising two types of spatial clusters: high rates of disconnected youth and high rates of unemployed youth. Using restricted, household-level census data inside the Census Research Data Center (RDC) under special permission by the US Census Bureau, we were able to define these two groups using detailed household questionnaires that are not available to researchers outside the RDC. The geospatial patterns in the two types of clusters suggest that places with high concentrations of disconnected youth are distinctly different in terms of underlying characteristics from places with high concentrations of unemployed youth. These differences include, among other things, arrests for synthetic drug production, enclaves of poor in rural areas, persistent poverty in areas, educational attainment in the populace, children in poverty, persons without health insurance, the social capital index, and elders who receive disability benefits. This article provides some preliminary evidence regarding the social forces underlying the two types of observed geospatial clusters and discusses how they differ.