CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'data census'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

Internal Revenue Service - 30

American Community Survey - 28

Social Security Administration - 25

Current Population Survey - 24

Center for Economic Studies - 23

National Science Foundation - 20

Bureau of Labor Statistics - 19

Protected Identification Key - 19

Census Bureau Disclosure Review Board - 18

Social Security Number - 18

Service Annual Survey - 18

Longitudinal Employer Household Dynamics - 18

North American Industry Classification System - 17

Research Data Center - 17

Employer Identification Numbers - 16

Master Address File - 16

Decennial Census - 16

Business Register - 16

Disclosure Review Board - 15

Longitudinal Business Database - 14

Census Bureau Business Register - 14

Survey of Income and Program Participation - 14

Standard Industrial Classification - 14

Cornell University - 14

Annual Survey of Manufactures - 13

Federal Statistical Research Data Center - 13

Social Security - 13

2010 Census - 12

Person Validation System - 12

Standard Statistical Establishment List - 11

Economic Census - 11

Housing and Urban Development - 10

Quarterly Workforce Indicators - 10

Quarterly Census of Employment and Wages - 9

Alfred P Sloan Foundation - 9

National Opinion Research Center - 8

Department of Housing and Urban Development - 8

Metropolitan Statistical Area - 8

American Housing Survey - 8

Supplemental Nutrition Assistance Program - 7

Census Numident - 7

Person Identification Validation System - 7

Computer Assisted Personal Interview - 7

Business Dynamics Statistics - 7

Census of Manufactures - 6

Individual Taxpayer Identification Numbers - 6

Administrative Records - 6

Longitudinal Research Database - 6

Indian Health Service - 6

Unemployment Insurance - 6

Local Employment Dynamics - 6

Census 2000 - 6

Center for Administrative Records Research and Applications - 6

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 5

Medicaid Services - 5

MAFID - 5

Census Bureau Person Identification Validation System - 5

SSA Numident - 5

Geographic Information Systems - 5

Business Employment Dynamics - 5

Federal Reserve Bank - 5

Federal Tax Information - 5

American Statistical Association - 5

Bureau of Economic Analysis - 5

Permanent Plant Number - 5

Agency for Healthcare Research and Quality - 5

Health and Retirement Study - 4

Total Factor Productivity - 4

Department of Labor - 4

Federal Reserve System - 4

National Institute on Aging - 4

County Business Patterns - 4

Company Organization Survey - 4

Cornell Institute for Social and Economic Research - 4

PIKed - 4

Indian Housing Information Center - 4

Personally Identifiable Information - 4

National Bureau of Economic Research - 4

University of Chicago - 4

Postal Service - 4

Probability Density Function - 4

American Economic Association - 4

Business Master File - 4

Employment History File - 4

Employer Characteristics File - 4

Individual Characteristics File - 4

Core Based Statistical Area - 4

Business Register Bridge - 4

Successor Predecessor File - 4

Chicago Census Research Data Center - 4

Census Bureau Longitudinal Business Database - 4

CATI - 4

Some Other Race - 4

Establishment Micro Properties - 4

Department of Agriculture - 3

Centers for Medicare - 3

1940 Census - 3

Census Bureau Master Address File - 3

W-2 - 3

Temporary Assistance for Needy Families - 3

Accommodation and Food Services - 3

Social Science Research Institute - 3

MAF-ARF - 3

Ordinary Least Squares - 3

Characteristics of Business Owners - 3

Retail Trade - 3

Small Business Administration - 3

Department of Homeland Security - 3

Special Sworn Status - 3

Sloan Foundation - 3

Wholesale Trade - 3

University of Maryland - 3

Bureau of Labor - 3

Journal of Labor Economics - 3

Composite Person Record - 3

North American Industry Classi - 3

Duke University - 3

Office of Management and Budget - 3

CDF - 3

Cumulative Density Function - 3

Medical Expenditure Panel Survey - 3

Financial, Insurance and Real Estate Industries - 3

census bureau - 36

survey - 35

census data - 33

respondent - 32

data - 31

population - 27

agency - 20

statistical - 19

microdata - 19

report - 18

use census - 16

datasets - 16

record - 16

estimating - 14

census survey - 14

census research - 13

research census - 13

employed - 12

workforce - 10

resident - 10

economic census - 10

payroll - 9

census employment - 9

employee - 9

database - 9

statistician - 9

aggregate - 9

employ - 8

labor - 8

coverage - 8

censuses surveys - 8

researcher - 8

disclosure - 8

assessed - 7

information census - 7

recession - 7

quarterly - 7

longitudinal - 7

sector - 7

study - 7

hispanic - 6

sampling - 6

assessing - 6

expenditure - 6

linked census - 6

census years - 6

residential - 6

provided census - 6

estimation - 6

confidentiality - 6

2010 census - 6

econometric - 6

work census - 6

yearly - 6

ethnicity - 6

census file - 6

race census - 6

matching - 6

trend - 5

earnings - 5

household surveys - 5

disparity - 5

minority - 5

citizen - 5

survey data - 5

privacy - 5

neighborhood - 5

census records - 5

imputation - 5

census business - 5

metropolitan - 5

employment data - 5

business data - 5

records census - 5

employment statistics - 5

research - 5

race - 5

average - 4

sample - 4

labor statistics - 4

ssa - 4

prevalence - 4

population survey - 4

estimator - 4

housing - 4

linkage - 4

enterprise - 4

census use - 4

macroeconomic - 4

geography - 4

geographic - 4

surveys censuses - 4

reporting - 4

information - 4

publicly - 4

department - 4

worker - 4

employer household - 4

employee data - 4

ethnic - 4

census responses - 4

analysis - 4

aggregation - 4

identifier - 4

federal - 4

aging - 4

economist - 4

revenue - 3

percentile - 3

occupation - 3

survey households - 3

medicaid - 3

impact - 3

amenity - 3

census linked - 3

survey income - 3

incorporated - 3

businesses census - 3

salary - 3

workforce indicators - 3

geographically - 3

establishment - 3

public - 3

workplace - 3

employment dynamics - 3

clerical - 3

worker demographics - 3

longitudinal employer - 3

white - 3

racial - 3

irs - 3

bias - 3

enrollment - 3

job - 3

Viewing papers 1 through 10 of 58


  • Working Paper

    Optimal Stratified Sampling for Probability-Based Online Panels

    September 2025

    Working Paper Number:

    CES-25-69

    Online probability-based panels have emerged as a cost-efficient means of conducting surveys in the 21st century. While there have been various recent advancements in sampling techniques for online panels, several critical aspects of sampling theory for online panels are lacking. Much of current sampling theory dates from the middle of the 20th century, when response rates were high and online panels did not exist. This paper presents a mathematical model of stratified sampling for online panels that takes into account historical response rates and survey costs. Through some simplifying assumptions, the model shows that the optimal sample allocation for online panels can largely resemble the solution for a cross-sectional survey. To apply the model, I use the Census Household Panel to show how this method could improve the average precision of key estimates. Holding fielding costs constant, the new sample rates improve the average precision of estimates between 1.47 and 17.25 percent, depending on the importance weight given to an overall population mean compared to mean estimates for racial and ethnic subgroups.
    View Full Paper PDF
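As a rough illustration of the kind of cost-aware allocation described in the abstract for CES-25-69, the sketch below applies the textbook cost-optimal stratified allocation, with the cost of one completed response taken to be the invitation cost divided by a historical response rate. This is not the paper's model: the cost formula is an assumption, and the stratum sizes, standard deviations, costs, and response rates are hypothetical.

```python
import math

def allocate_completes(strata, budget):
    """Textbook cost-optimal allocation: completed responses m_h are
    proportional to N_h * S_h / sqrt(unit cost), scaled to use the budget."""
    # Assumption: cost of one completed response = invitation cost / response rate.
    unit_cost = {h: s["cost"] / s["resp_rate"] for h, s in strata.items()}
    score = {h: s["N"] * s["S"] / math.sqrt(unit_cost[h]) for h, s in strata.items()}
    # Budget constraint: sum_h m_h * unit_cost_h = budget, with m_h = k * score_h.
    k = budget / sum(score[h] * unit_cost[h] for h in strata)
    return {h: k * score[h] for h in strata}

# Hypothetical strata: population size N, outcome std. dev. S,
# cost per invitation, and historical response rate.
strata = {
    "A": {"N": 6000, "S": 1.0, "cost": 2.0, "resp_rate": 0.50},
    "B": {"N": 4000, "S": 1.5, "cost": 2.0, "resp_rate": 0.25},
}
completes = allocate_completes(strata, budget=10_000)
# Invitations needed to expect that many completes in each stratum.
invites = {h: completes[h] / strata[h]["resp_rate"] for h in strata}
```

With equal response rates and costs across strata, the rule reduces to the classical Neyman allocation for a cross-sectional survey.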
  • Working Paper

    Job Tasks, Worker Skills, and Productivity

    September 2025

    Working Paper Number:

    CES-25-63

    We present new empirical evidence suggesting that we can better understand productivity dispersion across businesses by accounting for differences in how tasks, skills, and occupations are organized. This aligns with growing attention to the task content of production. We link establishment-level data from the Bureau of Labor Statistics Occupational Employment and Wage Statistics survey with productivity data from the Census Bureau's manufacturing surveys. Our analysis reveals strong relationships between establishment productivity and task, skill, and occupation inputs. These relationships are highly nonlinear and vary by industry. When we account for these patterns, we can explain a substantial share of productivity dispersion across establishments.
    View Full Paper PDF
  • Working Paper

    The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey

    February 2025

    Working Paper Number:

    CES-25-13

    The National Household Food Acquisition and Purchase Survey (FoodAPS), sponsored by the United States Department of Agriculture's (USDA) Economic Research Service (ERS) and Food and Nutrition Service (FNS), examines the food purchasing behavior of various subgroups of the U.S. population. These subgroups include participants in the Supplemental Nutrition Assistance Program (SNAP) and the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), as well as households who are eligible for but don't participate in these programs. Participants in these social protection programs constitute small proportions of the U.S. population; obtaining an adequate number of such participants in a survey would be challenging absent stratified sampling to target SNAP and WIC participating households. This document describes how the U.S. Census Bureau (which is planning to conduct future versions of the FoodAPS survey on behalf of USDA) created sampling strata to flag the FoodAPS targeted subpopulations using machine learning applications in linked survey and administrative data. We describe the data, modeling techniques, and how well the sampling flags target low-income households and households receiving WIC and SNAP benefits. We additionally situate these efforts in the nascent literature on the use of big data and machine learning for the improvement of survey efficiency.
    View Full Paper PDF
  • Working Paper

    The Census Historical Environmental Impacts Frame

    October 2024

    Working Paper Number:

    CES-24-66

    The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
    View Full Paper PDF
  • Working Paper

    Nonresponse and Coverage Bias in the Household Pulse Survey: Evidence from Administrative Data

    October 2024

    Working Paper Number:

    CES-24-60

    The Household Pulse Survey (HPS) conducted by the U.S. Census Bureau is a unique survey that provided timely data on the effects of the COVID-19 Pandemic on American households and continues to provide data on other emergent social and economic issues. Because the survey has a response rate in the single digits and only has an online response mode, there are concerns about nonresponse and coverage bias. In this paper, we match administrative data from government agencies and third-party data to HPS respondents to examine how representative they are of the U.S. population. For comparison, we create a benchmark of American Community Survey (ACS) respondents and nonrespondents and include the ACS respondents as another point of reference. Overall, we find that the HPS is less representative of the U.S. population than the ACS. However, performance varies across administrative variables, and the existing weighting adjustments appear to greatly improve the representativeness of the HPS. Additionally, we look at household characteristics by their email domain to examine the effects on coverage from limiting email messages in 2023 to addresses from the contact frame with at least 90% deliverability rates, finding no clear change in the representativeness of the HPS afterwards.
    View Full Paper PDF
  • Working Paper

    Incorporating Administrative Data in Survey Weights for the Basic Monthly Current Population Survey

    January 2024

    Working Paper Number:

    CES-24-02

    Response rates to the Current Population Survey (CPS) have declined over time, raising the potential for nonresponse bias in key population statistics. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we take two approaches. First, we use administrative data to build a non-parametric nonresponse adjustment step while leaving the calibration to population estimates unchanged. Second, we use administratively linked data in the calibration process, matching income data from the Internal Revenue Service and state agencies, demographic data from the Social Security Administration and the decennial census, and industry data from the Census Bureau's Business Register to both responding and nonresponding households. We use the matched data in the household nonresponse adjustment of the CPS weighting algorithm, which changes the weights of respondents to account for differential nonresponse rates among subpopulations. After running the experimental weighting algorithm, we compare estimates of the unemployment rate and labor force participation rate between the experimental weights and the production weights. Before March 2020, estimates of the labor force participation rate using the experimental weights are 0.2 percentage points higher than the original estimates, with minimal effect on the unemployment rate. After March 2020, the new labor force participation rates are similar, but the unemployment rate is about 0.2 percentage points higher in some months during the height of COVID-related interviewing restrictions. These results suggest that if there is any nonresponse bias present in the CPS, its magnitude is comparable to the typical margin of error of the unemployment rate estimate. Additionally, the results are broadly similar across demographic groups and states, as well as when using alternative weighting methodology. Finally, we discuss how our estimates compare to those from earlier papers that calculate estimates of bias in key CPS labor force statistics. This paper is for research purposes only. No changes to production are being implemented at this time.
    View Full Paper PDF
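The household nonresponse adjustment mentioned in the abstract for CES-24-02 can be pictured with a generic weighting-class adjustment: within each cell defined by a variable known for both respondents and nonrespondents, respondent weights are scaled up so that they also represent the cell's nonrespondents. The sketch below is only an illustration of that general idea, not the CPS production algorithm or the paper's experimental one; the records and the `cell` variable (a stand-in for an administrative income band) are hypothetical.

```python
from collections import defaultdict

def adjust_for_nonresponse(households):
    """Scale each respondent's base weight by
    (total weight in cell) / (respondent weight in cell)."""
    total = defaultdict(float)
    responding = defaultdict(float)
    for hh in households:
        total[hh["cell"]] += hh["base_weight"]
        if hh["responded"]:
            responding[hh["cell"]] += hh["base_weight"]
    adjusted = []
    for hh in households:
        if hh["responded"]:
            factor = total[hh["cell"]] / responding[hh["cell"]]
            adjusted.append({**hh, "weight": hh["base_weight"] * factor})
    return adjusted

# Hypothetical sample: the "low" cell responds at a lower rate than "high".
households = [
    {"id": 1, "cell": "low", "base_weight": 100.0, "responded": True},
    {"id": 2, "cell": "low", "base_weight": 100.0, "responded": False},
    {"id": 3, "cell": "low", "base_weight": 100.0, "responded": False},
    {"id": 4, "cell": "high", "base_weight": 100.0, "responded": True},
    {"id": 5, "cell": "high", "base_weight": 100.0, "responded": True},
    {"id": 6, "cell": "high", "base_weight": 100.0, "responded": False},
]
adjusted = adjust_for_nonresponse(households)
```

The adjusted respondent weights sum to the original weighted total, so cells with lower response rates receive larger adjustment factors.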
  • Working Paper

    Building the Prototype Census Environmental Impacts Frame

    April 2023

    Working Paper Number:

    CES-23-20

    The natural environment is central to all aspects of life, but efforts to quantify its influence have been hindered by data availability and measurement constraints. To mitigate some of these challenges, we introduce a new prototype of a microdata infrastructure: the Census Environmental Impacts Frame (EIF). The EIF provides detailed individual-level information on demographics, economic characteristics, and address-level histories (linked to spatially and temporally resolved estimates of environmental conditions for each individual) for almost every resident in the United States over the past two decades. This linked microdata infrastructure provides a unique platform for advancing our understanding about the distribution of environmental amenities and hazards, when, how, and why exposures have evolved over time, and the consequences of environmental inequality and changing environmental conditions. We describe the construction of the EIF, explore issues of coverage and data quality, document patterns and trends in individual exposure to two correlated but distinct air pollutants as an application of the EIF, and discuss implications and opportunities for future research.
    View Full Paper PDF
  • Working Paper

    Comparing the 2019 American Housing Survey to Contemporary Sources of Property Tax Records: Implications for Survey Efficiency and Quality

    June 2022

    Working Paper Number:

    CES-22-22

    Given rising nonresponse rates and concerns about respondent burden, government statistical agencies have been exploring ways to supplement household survey data collection with administrative records and other sources of third-party data. This paper evaluates the potential of property tax assessment records to improve housing surveys by comparing these records to responses from the 2019 American Housing Survey. Leveraging the U.S. Census Bureau's linkage infrastructure, we compute the fraction of AHS housing units that could be matched to a unique property parcel (coverage rate), as well as the extent to which survey and property tax data contain the same information (agreement rate). We analyze heterogeneity in coverage and agreement across states, housing characteristics, and 11 AHS items of interest to housing researchers. Our results suggest that partial replacement of AHS data with property data, targeted toward certain survey items or single-family detached homes, could reduce respondent burden without altering data quality. Further research into partial-replacement designs is needed and should proceed on an item-by-item basis. Our work can guide this research as well as those who wish to conduct independent research with property tax records that is representative of the U.S. housing stock.
    View Full Paper PDF
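The two metrics defined in the abstract for CES-22-22 can be made concrete with a small sketch: the coverage rate is the share of survey units matched to a unique parcel, and the agreement rate is the share of matched units whose survey answer equals the parcel record for a given item. Computing agreement only over matched units is an assumption here, and the records and the `year_built` item are hypothetical.

```python
def coverage_and_agreement(units, item):
    """Return (coverage rate, agreement rate) for one survey item."""
    matched = [u for u in units if u["parcel"] is not None]
    coverage = len(matched) / len(units)
    agree = sum(1 for u in matched if u["survey"][item] == u["parcel"][item])
    # Agreement is undefined when no unit could be matched.
    agreement = agree / len(matched) if matched else float("nan")
    return coverage, agreement

# Hypothetical housing units; "parcel" is None when no unique match was found.
units = [
    {"survey": {"year_built": 1985}, "parcel": {"year_built": 1985}},
    {"survey": {"year_built": 1990}, "parcel": {"year_built": 1992}},
    {"survey": {"year_built": 2004}, "parcel": {"year_built": 2004}},
    {"survey": {"year_built": 1978}, "parcel": None},
]
coverage, agreement = coverage_and_agreement(units, "year_built")
```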
  • Working Paper

    Improving Estimates of Neighborhood Change with Constant Tract Boundaries

    May 2022

    Working Paper Number:

    CES-22-16

    Social scientists routinely rely on methods of interpolation to adjust available data to their research needs. This study calls attention to the potential for substantial error in efforts to harmonize data to constant boundaries using standard approaches to areal and population interpolation. We compare estimates from a standard source (the Longitudinal Tract Data Base) to true values calculated by re-aggregating original 2000 census microdata to 2010 tract areas. We then demonstrate an alternative approach that allows the re-aggregated values to be publicly disclosed, using 'differential privacy' (DP) methods to inject random noise to protect confidentiality of the raw data. The DP estimates are considerably more accurate than the interpolated estimates. We also examine conditions under which interpolation is more susceptible to error. This study reveals cause for greater caution in the use of interpolated estimates from any source. Until and unless DP estimates can be publicly disclosed for a wide range of variables and years, research on neighborhood change should routinely examine data for signs of estimation error that may be substantial in a large share of tracts that experienced complex boundary changes.
    View Full Paper PDF
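The abstract for CES-22-16 does not say which differential privacy mechanism is used, so the sketch below shows only the most common textbook one, the Laplace mechanism, applied to counts. The tract names, counts, privacy parameter `epsilon`, and sensitivity of 1 are all hypothetical choices for illustration.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def dp_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)  # seeded only so the example is reproducible
tract_counts = {"tract_a": 1200, "tract_b": 85}  # hypothetical re-aggregated counts
noisy = {t: dp_count(c, epsilon=0.5, rng=rng) for t, c in tract_counts.items()}
```

Smaller values of `epsilon` give stronger privacy protection and noisier published values; the noise has mean zero, so it does not systematically shift the counts.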
  • Working Paper

    Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning

    November 2021

    Working Paper Number:

    CES-21-35

    This paper considers the problem of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across establishments is highly skewed. To address these difficulties, this paper develops a probabilistic record linkage methodology that combines machine learning (ML) with multiple imputation (MI). This ML-MI methodology is applied to link survey respondents in the Health and Retirement Study to their workplaces in the Census Business Register. The linked data reveal new evidence that non-sampling errors in household survey data are correlated with respondents' workplace characteristics.
    View Full Paper PDF
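To give a feel for how multiple imputation can carry linkage uncertainty into an estimate, as in the abstract for CES-21-35, the toy sketch below draws several completed link sets by sampling each respondent's workplace in proportion to candidate match probabilities, then averages an estimate across the draws. It is not the paper's ML-MI methodology: the match probabilities (which might come from any classifier), the identifiers, and the establishment sizes are all hypothetical.

```python
import random
from statistics import mean

def impute_links(candidates, m, rng):
    """Draw m completed link sets; each respondent's link is sampled
    in proportion to its candidate match probabilities."""
    draws = []
    for _ in range(m):
        links = {}
        for resp, cands in candidates.items():
            ids = [c[0] for c in cands]
            probs = [c[1] for c in cands]
            links[resp] = rng.choices(ids, weights=probs, k=1)[0]
        draws.append(links)
    return draws

# Hypothetical candidate establishments with match probabilities.
candidates = {
    "r1": [("estab_1", 0.9), ("estab_2", 0.1)],
    "r2": [("estab_2", 0.6), ("estab_3", 0.4)],
}
estab_size = {"estab_1": 500, "estab_2": 20, "estab_3": 5}

rng = random.Random(42)  # seeded only so the example is reproducible
draws = impute_links(candidates, m=10, rng=rng)
# Estimate mean workplace size in each completed data set, then combine.
per_draw = [mean(estab_size[e] for e in links.values()) for links in draws]
mi_estimate = mean(per_draw)
```

The spread of `per_draw` across imputations reflects uncertainty that a single best-match linkage would hide.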