CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'data'

The following papers contain the search terms that you selected. Tags and keywords that occur frequently within these papers are listed below, followed by the papers themselves.

Frequently Occurring Concepts within this Search

National Science Foundation - 36

Internal Revenue Service - 36

American Community Survey - 35

Center for Economic Studies - 35

Social Security Administration - 29

Service Annual Survey - 29

Research Data Center - 27

Current Population Survey - 24

Protected Identification Key - 22

Bureau of Labor Statistics - 22

Longitudinal Employer Household Dynamics - 20

North American Industry Classification System - 20

Cornell University - 20

Survey of Income and Program Participation - 19

Census Bureau Disclosure Review Board - 18

2010 Census - 18

Decennial Census - 17

Economic Census - 17

Social Security Number - 16

Person Validation System - 16

Master Address File - 16

Business Register - 16

Longitudinal Business Database - 16

Social Security - 15

Employer Identification Numbers - 14

Standard Industrial Classification - 14

Quarterly Workforce Indicators - 13

Disclosure Review Board - 12

Center for Administrative Records Research and Applications - 12

Special Sworn Status - 12

Person Identification Validation System - 11

Personally Identifiable Information - 11

Administrative Records - 11

Bureau of Economic Analysis - 11

Housing and Urban Development - 10

Census Bureau Business Register - 10

Alfred P Sloan Foundation - 10

Annual Survey of Manufactures - 10

Longitudinal Research Database - 10

National Opinion Research Center - 10

Department of Housing and Urban Development - 9

Indian Health Service - 9

National Center for Health Statistics - 9

Standard Statistical Establishment List - 9

County Business Patterns - 9

Business Dynamics Statistics - 9

Chicago Census Research Data Center - 9

MAFID - 8

SSA Numident - 8

Federal Statistical Research Data Center - 8

Computer Assisted Personal Interview - 7

Statistics Canada - 7

Quarterly Census of Employment and Wages - 7

Metropolitan Statistical Area - 7

Duke University - 7

American Statistical Association - 7

Public Use Micro Sample - 7

Census Bureau Master Address File - 6

Individual Taxpayer Identification Numbers - 6

Indian Housing Information Center - 6

Agency for Healthcare Research and Quality - 6

American Housing Survey - 6

Company Organization Survey - 6

DOB - 6

Unemployment Insurance - 6

Medicaid Services - 6

Census of Manufactures - 6

Postal Service - 6

LEHD Program - 6

Supplemental Nutrition Assistance Program - 5

Sloan Foundation - 5

Census Numident - 5

Census Bureau Person Identification Validation System - 5

Some Other Race - 5

National Institute on Aging - 5

University of Michigan - 5

Small Business Administration - 5

Office of Management and Budget - 5

Cornell Institute for Social and Economic Research - 5

PIKed - 5

University of Chicago - 5

American Economic Association - 5

Federal Reserve Bank - 5

National Bureau of Economic Research - 5

Local Employment Dynamics - 5

Permanent Plant Number - 5

Journal of Economic Literature - 5

Ordinary Least Squares - 4

1940 Census - 4

W-2 - 4

Census Edited File - 4

National Institutes of Health - 4

Health and Retirement Study - 4

National Longitudinal Survey of Youth - 4

Census of Manufacturing Firms - 4

Probability Density Function - 4

Minnesota Population Center - 4

Center for Administrative Records Research - 4

Organization for Economic Cooperation and Development - 4

Characteristics of Business Owners - 4

Total Factor Productivity - 4

Federal Insurance Contribution Act - 3

Social and Economic Supplement - 3

ASEC - 3

Adjusted Gross Income - 3

Temporary Assistance for Needy Families - 3

Geographic Information Systems - 3

Department of Economics - 3

COVID-19 - 3

National Income and Product Accounts - 3

Bureau of Labor - 3

Centers for Medicare - 3

Census Bureau Longitudinal Business Database - 3

Centers for Disease Control and Prevention - 3

Employer Characteristics File - 3

Department of Health and Human Services - 3

National Research Council - 3

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 3

CATI - 3

Census Bureau Center for Economic Studies - 3

Census 2000 - 3

Office of Personnel Management - 3

Census Bureau Business Dynamics Statistics - 3

COMPUSTAT - 3

Securities and Exchange Commission - 3

survey - 53

respondent - 44

statistical - 43

microdata - 41

datasets - 38

census bureau - 36

record - 36

agency - 35

data census - 31

census data - 26

estimating - 23

population - 23

database - 23

report - 22

disclosure - 19

analysis - 19

statistician - 17

confidentiality - 16

matching - 16

research - 16

survey data - 15

information - 15

privacy - 15

imputation - 15

researcher - 15

aggregate - 14

use census - 12

census survey - 12

census research - 12

statistical agencies - 12

payroll - 11

study - 11

estimation - 10

earnings - 10

sampling - 10

sample - 10

coverage - 10

public - 10

records census - 10

linkage - 10

employee - 10

workforce - 10

research census - 10

publicly - 9

resident - 9

identifier - 9

census records - 9

quarterly - 9

economic census - 9

business data - 9

matched - 9

economist - 9

sector - 9

2010 census - 8

assessed - 8

federal - 8

statistical disclosure - 8

employed - 8

enterprise - 8

longitudinal - 8

employment data - 8

employee data - 8

ssa - 7

census years - 7

residential - 7

residence - 7

household surveys - 7

reporting - 7

census use - 7

aggregation - 7

inference - 7

associate - 7

econometric - 7

estimator - 6

enrollment - 6

irs - 6

income data - 6

ethnicity - 6

race census - 6

census employment - 6

department - 6

work census - 6

information census - 6

recession - 6

surveys censuses - 6

percentile - 6

censuses surveys - 6

sale - 6

expenditure - 6

employ - 6

model - 6

census file - 6

industrial - 6

minority - 5

salary - 5

census linked - 5

citizen - 5

provided census - 5

race - 5

state - 5

housing - 5

assessing - 5

housing survey - 5

establishments data - 5

market - 5

analyst - 5

social - 5

worker - 5

manufacturing - 5

macroeconomic - 5

average - 4

survey income - 4

population survey - 4

census disclosure - 4

income individuals - 4

tax - 4

taxpayer - 4

geographic - 4

linked census - 4

survey households - 4

hispanic - 4

census 2020 - 4

home - 4

individuals census - 4

imputation model - 4

incorporated - 4

policymakers - 4

gdp - 4

employment statistics - 4

establishment - 4

investment - 4

labor - 4

trend - 4

earner - 3

household income - 3

1040 - 3

environmental - 3

impact - 3

disparity - 3

discrepancy - 3

racial - 3

empirical - 3

classification - 3

prevalence - 3

apartment - 3

unobserved - 3

organizational - 3

acquisition - 3

economic statistics - 3

classifying - 3

employer household - 3

imputed - 3

ancestry - 3

ethnic - 3

bias - 3

census responses - 3

worker demographics - 3

production - 3

manufacturer - 3

inventory - 3

employment dynamics - 3

workforce indicators - 3

classified - 3

measures employment - 3

employment measures - 3

firm data - 3

company - 3

Viewing papers 11 through 20 of 94


  • Working Paper

    Comparing the 2019 American Housing Survey to Contemporary Sources of Property Tax Records: Implications for Survey Efficiency and Quality

    June 2022

    Working Paper Number:

    CES-22-22

    Given rising nonresponse rates and concerns about respondent burden, government statistical agencies have been exploring ways to supplement household survey data collection with administrative records and other sources of third-party data. This paper evaluates the potential of property tax assessment records to improve housing surveys by comparing these records to responses from the 2019 American Housing Survey. Leveraging the U.S. Census Bureau's linkage infrastructure, we compute the fraction of AHS housing units that could be matched to a unique property parcel (coverage rate), as well as the extent to which survey and property tax data contain the same information (agreement rate). We analyze heterogeneity in coverage and agreement across states, housing characteristics, and 11 AHS items of interest to housing researchers. Our results suggest that partial replacement of AHS data with property data, targeted toward certain survey items or single-family detached homes, could reduce respondent burden without altering data quality. Further research into partial-replacement designs is needed and should proceed on an item-by-item basis. Our work can guide this research as well as those who wish to conduct independent research with property tax records that is representative of the U.S. housing stock.
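The two measures named in the abstract above, coverage rate and agreement rate, can be illustrated with a minimal sketch. This is not the paper's code: the function names, the identifiers, and the example values are hypothetical, "matched to a unique parcel" is assumed to mean exactly one candidate link, and agreement is assumed to mean exact equality of the two reported values.

```python
from collections import Counter


def coverage_rate(unit_ids, links):
    """Share of survey units linked to exactly one parcel.

    links is a list of (unit_id, parcel_id) candidate matches.
    Assumption: a unit with zero or several candidate parcels is not covered.
    """
    link_counts = Counter(unit for unit, _parcel in links)
    matched = sum(1 for unit in unit_ids if link_counts[unit] == 1)
    return matched / len(unit_ids)


def agreement_rate(survey_values, record_values):
    """Share of units present in both sources whose values are identical.

    Assumption: agreement means exact equality; a real comparison might
    allow a tolerance for items such as square footage.
    """
    shared = survey_values.keys() & record_values.keys()
    if not shared:
        return float("nan")
    agree = sum(1 for unit in shared if survey_values[unit] == record_values[unit])
    return agree / len(shared)


# Hypothetical example: u2 has two candidate parcels, u4 has none.
units = ["u1", "u2", "u3", "u4"]
links = [("u1", "p10"), ("u2", "p20"), ("u2", "p21"), ("u3", "p30")]
coverage = coverage_rate(units, links)  # 2 of 4 units -> 0.5

# Hypothetical "year built" item for the two covered units.
survey_year_built = {"u1": 1985, "u3": 2001}
record_year_built = {"u1": 1985, "u3": 1999}
agreement = agreement_rate(survey_year_built, record_year_built)  # 1 of 2 -> 0.5
```

Computing both rates separately for each survey item and housing type is what would allow the item-by-item comparison the abstract calls for.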
  • Working Paper

    Improving Estimates of Neighborhood Change with Constant Tract Boundaries

    May 2022

    Working Paper Number:

    CES-22-16

    Social scientists routinely rely on methods of interpolation to adjust available data to their research needs. This study calls attention to the potential for substantial error in efforts to harmonize data to constant boundaries using standard approaches to areal and population interpolation. We compare estimates from a standard source (the Longitudinal Tract Data Base) to true values calculated by re-aggregating original 2000 census microdata to 2010 tract areas. We then demonstrate an alternative approach that allows the re-aggregated values to be publicly disclosed, using 'differential privacy' (DP) methods to inject random noise to protect confidentiality of the raw data. The DP estimates are considerably more accurate than the interpolated estimates. We also examine conditions under which interpolation is more susceptible to error. This study reveals cause for greater caution in the use of interpolated estimates from any source. Until and unless DP estimates can be publicly disclosed for a wide range of variables and years, research on neighborhood change should routinely examine data for signs of estimation error that may be substantial in a large share of tracts that experienced complex boundary changes.
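As a rough illustration of the standard areal interpolation the abstract above refers to, the sketch below reallocates counts from old tracts to new tract boundaries in proportion to overlapping land area. It is not the Longitudinal Tract Data Base procedure; the function name, tract labels, and overlap shares are hypothetical. The implicit assumption that population is spread evenly within each source tract is one reason such estimates can be wrong.

```python
def areal_interpolate(source_counts, overlap_shares):
    """Reallocate source-tract counts to target tracts by area share.

    source_counts: {source_tract: count}
    overlap_shares: {(source_tract, target_tract): share of the source
                     tract's area that lies inside the target tract}
    Assumption: population is uniformly distributed within each source tract.
    """
    target_counts = {}
    for (source, target), share in overlap_shares.items():
        contribution = source_counts[source] * share
        target_counts[target] = target_counts.get(target, 0.0) + contribution
    return target_counts


# Hypothetical example: tract A is split 60/40 between new tracts X and Y,
# while tract B falls entirely inside Y.
old_counts = {"A": 1000, "B": 500}
shares = {("A", "X"): 0.6, ("A", "Y"): 0.4, ("B", "Y"): 1.0}
estimates = areal_interpolate(old_counts, shares)
# X receives about 600 and Y about 900. If most of A's residents actually
# live in the part assigned to Y, both estimates are off.
```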
  • Working Paper

    Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning

    November 2021

    Working Paper Number:

    CES-21-35

    This paper considers the problem of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across establishments is highly skewed. To address these difficulties, this paper develops a probabilistic record linkage methodology that combines machine learning (ML) with multiple imputation (MI). This ML-MI methodology is applied to link survey respondents in the Health and Retirement Study to their workplaces in the Census Business Register. The linked data reveal new evidence that non-sampling errors in household survey data are correlated with respondents' workplace characteristics.
  • Working Paper

    Developing Content for the Management and Organizational Practices Survey-Hospitals (MOPS-HP)

    September 2021

    Working Paper Number:

    CES-21-25

    Nationally representative U.S. hospital data on management practices do not exist, although past data collected in the World Management Survey (WMS) have shown such practices to be related to both clinical and financial performance. This paper describes the U.S. Census Bureau's development of content for the Management and Organizational Practices Survey-Hospitals (MOPS-HP) that is similar to data collected in the MOPS conducted for the manufacturing sector in 2010 and 2015 and in the 2009 WMS. Findings from cognitive testing interviews with 18 chief nursing officers and 13 chief financial officers at 30 different hospitals across 7 states and the District of Columbia led to using industry-tested terminology, to confirming chief nursing officers as MOPS-HP respondents and their ability to provide recall data, and to eliminating questions that tested poorly. Hospital data collected in the MOPS-HP would be the first nationally representative data on management practices, with queries on clinical key performance indicators, financial and hospital-wide patient care goals, addressing patient care problems, clinical team interactions and staffing, standardized clinical protocols, and incentives for medical record documentation. The MOPS-HP's purpose is not to collect COVID-19 pandemic information; however, data measuring hospital management practices prior to and during the COVID-19 pandemic are a byproduct of the survey's one-year recall period (2019 and 2020).
  • Working Paper

    Redesigning the Longitudinal Business Database

    May 2021

    Working Paper Number:

    CES-21-08

    In this paper we describe the U.S. Census Bureau's redesign and production implementation of the Longitudinal Business Database (LBD) first introduced by Jarmin and Miranda (2002). The LBD is used to create the Business Dynamics Statistics (BDS), tabulations describing the entry, exit, expansion, and contraction of businesses. The new LBD and BDS also incorporate information formerly provided by the Statistics of U.S. Businesses program, which produced similar year-to-year measures of employment and establishment flows. We describe in detail how the LBD is created from curation of the input administrative data, longitudinal matching, retiming of economic census-year births and deaths, creation of vintage consistent industry codes and noise factors, and the creation and cleaning of each year of LBD data. This documentation is intended to facilitate the proper use and understanding of the data by both researchers with approved projects accessing the LBD microdata and those using the BDS tabulations.
  • Working Paper

    Measuring the Impact of COVID-19 on Businesses and People: Lessons from the Census Bureau's Experience

    January 2021

    Working Paper Number:

    CES-21-02

    We provide an overview of Census Bureau activities to enhance the consistency, timeliness, and relevance of our data products in response to the COVID-19 pandemic. We highlight new data products designed to provide timely and granular information on the pandemic's impact: the Small Business Pulse Survey, weekly Business Formation Statistics, the Household Pulse Survey, and Community Resilience Estimates. We describe pandemic-related content introduced to existing surveys such as the Annual Business Survey and the Current Population Survey. We discuss adaptations to ensure the continuity and consistency of existing data products such as principal economic indicators and the American Community Survey.
  • Working Paper

    Determination of the 2020 U.S. Citizen Voting Age Population (CVAP) Using Administrative Records and Statistical Methodology Technical Report

    October 2020

    Working Paper Number:

    CES-20-33

    This report documents the efforts of the Census Bureau's Citizen Voting-Age Population (CVAP) Internal Expert Panel (IEP) and Technical Working Group (TWG) toward the use of multiple data sources to produce block-level statistics on the citizen voting-age population for use in enforcing the Voting Rights Act. It describes the administrative, survey, and census data sources used, and the four approaches developed for combining these data to produce CVAP estimates. It also discusses other aspects of the estimation process, including how records were linked across the multiple data sources, and the measures taken to protect the confidentiality of the data.
  • Working Paper

    Matching State Business Registration Records to Census Business Data

    January 2020

    Working Paper Number:

    CES-20-03

    We describe our methodology and results from matching state Business Registration Records (BRR) to Census business data. We use data from Massachusetts and California to develop methods and preliminary results that could be used to guide matching data for additional states. We obtain matches to Census business records for 45% of the Massachusetts BRR records and 40% of the California BRR records. We find higher match rates for incorporated businesses and businesses with higher startup-quality scores as assigned in Guzman and Stern (2018). Clerical reviews show that using relatively strict matching on address is important for match accuracy, while results are less sensitive to name matching strictness. Among matched BRR records, the modal timing of the first match to the BR is in the year in which the BRR record was filed. We use two sets of software to identify matches: SAS DQ Match and a machine-learning algorithm described in Cuffe and Goldschlag (2018). We find preliminary evidence that while the ML-based method yields more match results, SAS DQ tends to result in higher accuracy rates. To conclude, we provide suggestions on how to proceed with matching other states' data in light of our findings using these two states.
  • Working Paper

    Re-engineering Key National Economic Indicators

    July 2019

    Working Paper Number:

    CES-19-22

    Traditional methods of collecting data from businesses and households face increasing challenges. These include declining response rates to surveys, increasing costs of traditional modes of data collection, and the difficulty of keeping pace with rapid changes in the economy. The digitization of virtually all market transactions offers the potential for re-engineering key national economic indicators. The challenge for the statistical system is how to operate in this data-rich environment. This paper focuses on the opportunities for collecting item-level data at the source and constructing key indicators using measurement methods consistent with such a data infrastructure. Ubiquitous digitization of transactions allows price and quantity to be collected or aggregated simultaneously at the source. This new architecture for economic statistics creates challenges arising from the rapid change in items sold. The paper explores some recently proposed techniques for estimating price and quantity indices in large-scale item-level data. Although those methods display tremendous promise, substantially more research is necessary before they will be ready to serve as the basis for the official economic statistics. Finally, the paper addresses the implications of building national statistics from transactions, both for data collection and for the capabilities and organization of the statistical agencies in the 21st century.
  • Working Paper

    Releasing Earnings Distributions using Differential Privacy: Disclosure Avoidance System For Post Secondary Employment Outcomes (PSEO)

    April 2019

    Working Paper Number:

    CES-19-13

    The U.S. Census Bureau recently released data on earnings percentiles of graduates from post secondary institutions. This paper describes and evaluates the disclosure avoidance system developed for these statistics. We propose a differentially private algorithm for releasing these data based on standard differentially private building blocks: constructing a histogram of earnings and applying the Laplace mechanism to recover a differentially private CDF of earnings. We demonstrate that our algorithm can release earnings distributions with low error, and that it outperforms prior work based on the concept of smooth sensitivity from Nissim, Raskhodnikova and Smith (2007).
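The building blocks named in the abstract above (a histogram of earnings, the Laplace mechanism, and a CDF recovered from the noisy counts) can be sketched as follows. This is a generic textbook illustration, not the PSEO production system: the function names, bin edges, the value of epsilon, the add-or-remove-one-record sensitivity of 1, and the clipping step are all assumptions made for the example.

```python
import numpy as np


def dp_earnings_cdf(earnings, bin_edges, epsilon, rng=None):
    """Noisy CDF over fixed earnings bins via the Laplace mechanism.

    Assumptions: each person contributes one earnings value, so adding or
    removing a person changes one bin count by 1 (L1 sensitivity 1), and the
    bin edges are fixed in advance rather than chosen from the data.
    Values outside the bin range are dropped by np.histogram.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts, _ = np.histogram(earnings, bins=bin_edges)
    # Laplace noise with scale sensitivity / epsilon = 1 / epsilon per bin.
    noisy = counts + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    # Post-processing (no extra privacy cost): counts cannot be negative.
    noisy = np.clip(noisy, 0.0, None)
    if noisy.sum() == 0:
        noisy = np.ones_like(noisy)  # fall back to a uniform distribution
    return np.cumsum(noisy) / noisy.sum()


def dp_percentile(cdf, bin_edges, q):
    """Upper edge of the first bin whose noisy CDF reaches q (0 < q <= 1)."""
    idx = min(int(np.searchsorted(cdf, q, side="left")), len(cdf) - 1)
    return float(bin_edges[idx + 1])


# Hypothetical synthetic earnings; no real data are used here.
rng = np.random.default_rng(0)
synthetic_earnings = rng.lognormal(mean=10.5, sigma=0.6, size=5000)
edges = np.linspace(0, 200_000, 41)  # 40 bins of width 5,000
cdf = dp_earnings_cdf(synthetic_earnings, edges, epsilon=1.0, rng=rng)
median_estimate = dp_percentile(cdf, edges, 0.5)
```

Because the noise is added to bin counts rather than to the percentiles themselves, every percentile read from the noisy CDF shares a single privacy budget. Smaller cells or a smaller epsilon make the noise proportionally more visible.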