CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'census data'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

American Community Survey - 39

Internal Revenue Service - 32

Protected Identification Key - 32

Current Population Survey - 29

Social Security Number - 29

Social Security Administration - 29

Decennial Census - 26

2020 Census - 25

Center for Economic Studies - 20

Research Data Center - 19

Master Address File - 18

Census Bureau Disclosure Review Board - 18

Disclosure Review Board - 18

Bureau of Labor Statistics - 18

Longitudinal Employer Household Dynamics - 18

Person Validation System - 17

Survey of Income and Program Participation - 16

Service Annual Survey - 16

Social Security - 15

National Science Foundation - 15

Employer Identification Number - 13

Business Register - 13

Standard Statistical Establishment List - 13

Cornell University - 13

Standard Industrial Classification - 13

North American Industry Classification System - 13

Administrative Records - 12

Personally Identifiable Information - 12

Person Identification Validation System - 11

Housing and Urban Development - 11

Federal Statistical Research Data Center - 11

Economic Census - 11

1990 Census - 10

SSA Numident - 10

Metropolitan Statistical Area - 10

American Housing Survey - 10

Longitudinal Business Database - 10

Individual Taxpayer Identification Numbers - 9

Alfred P Sloan Foundation - 9

Census Numident - 8

Department of Housing and Urban Development - 8

Census Bureau Business Register - 8

Supplemental Nutrition Assistance Program - 8

MAFID - 8

Indian Health Service - 8

Some Other Race - 8

Annual Survey of Manufactures - 8

Computer Assisted Personal Interview - 7

Quarterly Census of Employment and Wages - 7

Office of Management and Budget - 7

Ordinary Least Squares - 7

Quarterly Workforce Indicators - 7

National Opinion Research Center - 7

Unemployment Insurance - 7

Cornell Institute for Social and Economic Research - 6

Medicaid Services - 6

County Business Patterns - 6

Business Dynamics Statistics - 6

American Economic Association - 6

Bureau of Economic Analysis - 6

Center for Administrative Records Research and Applications - 6

Data Management System - 5

Temporary Assistance for Needy Families - 5

Census Bureau Person Identification Validation System - 5

Indian Housing Information Center - 5

Census Edited File - 5

Special Sworn Status - 5

Business Employment Dynamics - 5

Core Based Statistical Area - 5

PIKed - 5

Census 2000 - 5

Business Master File - 5

Business Register Bridge - 5

American Statistical Association - 5

Federal Reserve Bank - 5

Financial, Insurance and Real Estate Industries - 5

W-2 - 4

Social Science Research Institute - 4

National Longitudinal Survey of Youth - 4

Statistics Canada - 4

Postal Service - 4

Department of Homeland Security - 4

Urban Institute - 4

University of Maryland - 4

Sloan Foundation - 4

Individual Characteristics File - 4

Employer Characteristics File - 4

Employment History File - 4

Local Employment Dynamics - 4

National Institute on Aging - 4

National Center for Health Statistics - 4

Securities and Exchange Commission - 4

National Bureau of Economic Research - 4

Establishment Micro Properties - 4

Agency for Healthcare Research and Quality - 4

LEHD Program - 4

Adjusted Gross Income - 3

Census Bureau Master Address File - 3

Master Beneficiary Record - 3

Disability Insurance - 3

Census Household Composition Key - 3

MAF-ARF - 3

General Education Development - 3

New England County Metropolitan - 3

Health and Retirement Study - 3

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 3

Department of Justice - 3

Centers for Medicare - 3

Yale University - 3

Department of Health and Human Services - 3

National Institutes of Health - 3

Geographic Information Systems - 3

Small Business Administration - 3

Longitudinal Research Database - 3

Harvard University - 3

Journal of Labor Economics - 3

Composite Person Record - 3

North American Industry Classi - 3

Chicago Census Research Data Center - 3

Census of Manufactures - 3

Economic Research Service - 3

Minnesota Population Center - 3

Organization for Economic Cooperation and Development - 3

Department of Labor - 3

General Accounting Office - 3

CDF - 3

Permanent Plant Number - 3

Medical Expenditure Panel Survey - 3

survey - 40

population - 34

household - 34

census bureau - 33

data census - 31

respondent - 27

data - 26

resident - 20

use census - 19

agency - 17

record - 17

microdata - 15

statistical - 15

census research - 15

census survey - 14

citizen - 14

datasets - 14

ethnicity - 14

residence - 13

economic census - 13

report - 12

residential - 12

hispanic - 12

housing - 12

census use - 12

research census - 12

estimating - 10

census records - 10

neighborhood - 10

disparity - 9

census linked - 9

minority - 9

matching - 9

database - 9

information census - 9

disadvantaged - 9

payroll - 9

census years - 8

records census - 8

immigrant - 8

socioeconomic - 8

longitudinal - 8

workforce - 8

linked census - 7

imputation - 7

irs - 7

2010 census - 7

metropolitan - 7

poverty - 7

employed - 7

employee - 7

census file - 7

coverage - 6

race - 6

assessed - 6

linkage - 6

racial - 6

enrollment - 6

census response - 6

work census - 6

censuses surveys - 6

employ - 6

employer household - 6

ethnic - 6

migration - 6

expenditure - 6

household survey - 5

sampling - 5

identifier - 5

federal - 5

race census - 5

disclosure - 5

family - 5

census employment - 5

immigration - 5

confidentiality - 5

rural - 5

enterprise - 5

statistician - 5

longitudinal employer - 5

ancestry - 5

migrant - 5

census business - 5

labor - 5

aging - 5

medicaid - 4

native - 4

bias - 4

census household - 4

percentile - 4

tax - 4

unemployed - 4

survey income - 4

analysis - 4

information - 4

prevalence - 4

state - 4

privacy - 4

quarterly - 4

recession - 4

employment data - 4

urban - 4

geographic - 4

researcher - 4

research - 4

employment dynamics - 4

employment statistics - 4

employee data - 4

revenue - 4

aggregate - 4

matched - 4

white - 4

impact - 3

environmental - 3

amenity - 3

intergenerational - 3

black - 3

estimator - 3

citizenship - 3

1040 - 3

segregation - 3

child - 3

assessing - 3

yearly - 3

business data - 3

businesses census - 3

geographically - 3

urbanization - 3

suburb - 3

district - 3

survey census - 3

community - 3

decade - 3

workplace - 3

worker - 3

clerical - 3

surveys censuses - 3

residing - 3

firms census - 3

ssa - 3

study - 3

demography - 3

migrating - 3

associate - 3

econometric - 3

Viewing papers 1 through 10 of 68


  • Working Paper

    The Census Historical Environmental Impacts Frame

    October 2024

    Working Paper Number: CES-24-66

    The Census Bureau's Environmental Impacts Frame (EIF) is a microdata infrastructure that combines individual-level information on residence, demographics, and economic characteristics with environmental amenities and hazards from 1999 through the present day. To better understand the long-run consequences and intergenerational effects of exposure to a changing environment, we expand the EIF by extending it backward to 1940. The Historical Environmental Impacts Frame (HEIF) combines the Census Bureau's historical administrative data, publicly available 1940 address information from the 1940 Decennial Census, and historical environmental data. This paper discusses the creation of the HEIF as well as the unique challenges that arise with using the Census Bureau's historical administrative data.
  • Working Paper

    Nonresponse and Coverage Bias in the Household Pulse Survey: Evidence from Administrative Data

    October 2024

    Working Paper Number: CES-24-60

    The Household Pulse Survey (HPS) conducted by the U.S. Census Bureau is a unique survey that provided timely data on the effects of the COVID-19 Pandemic on American households and continues to provide data on other emergent social and economic issues. Because the survey has a response rate in the single digits and only has an online response mode, there are concerns about nonresponse and coverage bias. In this paper, we match administrative data from government agencies and third-party data to HPS respondents to examine how representative they are of the U.S. population. For comparison, we create a benchmark of American Community Survey (ACS) respondents and nonrespondents and include the ACS respondents as another point of reference. Overall, we find that the HPS is less representative of the U.S. population than the ACS. However, performance varies across administrative variables, and the existing weighting adjustments appear to greatly improve the representativeness of the HPS. Additionally, we look at household characteristics by their email domain to examine the effects on coverage from limiting email messages in 2023 to addresses from the contact frame with at least 90% deliverability rates, finding no clear change in the representativeness of the HPS afterwards.
  • Working Paper

    Gradient Boosting to Address Statistical Problems Arising from Non-Linkage of Census Bureau Datasets

    June 2024

    Working Paper Number: CES-24-27

    This article introduces the twangRDC package, which contains functions to address non-linkage in US Census Bureau datasets. The Census Bureau's Person Identification Validation System facilitates data linkage by assigning unique person identifiers to federal, third party, decennial census, and survey data. Not all records in these datasets can be linked to the reference file, and as such not all records will be assigned an identifier. This article is a tutorial for using twangRDC to generate nonresponse weights to account for non-linkage of person records across US Census Bureau datasets.
  • Working Paper

    Revisiting Methods to Assign Responses when Race and Hispanic Origin Reporting are Discrepant Across Administrative Records and Third Party Sources

    May 2024

    Authors: James Noon

    Working Paper Number: CES-24-26

    The Best Race and Ethnicity Administrative Records Composite file ('Best Race file') is a composite file that combines Census, federal, and Third Party Data (TPD) sources and applies business rules to assign race and ethnicity values to person records. The first version of the Best Race administrative records composite was constructed in 2015 and subsequently updated each year to include more recent vintages, when available, of the data sources originally included in the composite file. Where updates were available for data sources, the most recent information for persons was retained, and the business rules were reapplied to assign a single race and single Hispanic origin value to each person record. The majority of person records on the Best Race file have consistent race and ethnicity information across data sources. Where there are discrepancies in responses across data sources, we apply a series of business rules to assign a single race and ethnicity to each record. To improve the quality of the Best Race administrative records composite, we have begun revising the business rules, which were developed several years ago. This paper discusses the original business rules as well as the implemented changes and their impact on the composite file.
  • Working Paper

    Where Are Your Parents? Exploring Potential Bias in Administrative Records on Children

    March 2024

    Working Paper Number: CES-24-18

    This paper examines potential bias in the Census Household Composition Key's (CHCK) probabilistic parent-child linkages. By linking CHCK data to the American Community Survey (ACS), we reveal disparities in parent-child linkages among specific demographic groups and find that characteristics of children that can and cannot be linked to the CHCK vary considerably from the larger population. In particular, we find that children from low-income, less educated households and of Hispanic origin are less likely to be linked to a mother or a father in the CHCK. We also highlight some data considerations when using the CHCK.
  • Working Paper

    The Changing Nature of Pollution, Income, and Environmental Inequality in the United States

    January 2024

    Working Paper Number: CES-24-04

    This paper uses administrative tax records linked to Census demographic data and high-resolution measures of fine particulate matter (PM2.5) exposure to study the evolution of the Black-White pollution exposure gap over the past 40 years. In doing so, we focus on the various ways in which income may have contributed to these changes using a statistical decomposition. We decompose the overall change in the Black-White PM2.5 exposure gap into (1) components that stem from rank-preserving compression in the overall pollution distribution and (2) changes that stem from a reordering of Black and White households within the pollution distribution. We find a significant narrowing of the Black-White PM2.5 exposure gap over this time period that is overwhelmingly driven by rank-preserving changes rather than positional changes. However, the relative positions of Black and White households at the upper end of the pollution distribution have meaningfully shifted in the most recent years.
  • Working Paper

    Incorporating Administrative Data in Survey Weights for the Basic Monthly Current Population Survey

    January 2024

    Working Paper Number: CES-24-02

    Response rates to the Current Population Survey (CPS) have declined over time, raising the potential for nonresponse bias in key population statistics. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we take two approaches. First, we use administrative data to build a non-parametric nonresponse adjustment step while leaving the calibration to population estimates unchanged. Second, we use administratively linked data in the calibration process, matching income data from the Internal Revenue Service and state agencies, demographic data from the Social Security Administration and the decennial census, and industry data from the Census Bureau's Business Register to both responding and nonresponding households. We use the matched data in the household nonresponse adjustment of the CPS weighting algorithm, which changes the weights of respondents to account for differential nonresponse rates among subpopulations. After running the experimental weighting algorithm, we compare estimates of the unemployment rate and labor force participation rate between the experimental weights and the production weights. Before March 2020, estimates of the labor force participation rates using the experimental weights are 0.2 percentage points higher than the original estimates, with minimal effect on the unemployment rate. After March 2020, the new labor force participation rates are similar, but the unemployment rate is about 0.2 percentage points higher in some months during the height of COVID-related interviewing restrictions. These results suggest that if any nonresponse bias is present in the CPS, its magnitude is comparable to the typical margin of error of the unemployment rate estimate. Additionally, the results are overall similar across demographic groups and states, as well as when using alternative weighting methodology. Finally, we discuss how our estimates compare to those from earlier papers that calculate estimates of bias in key CPS labor force statistics. This paper is for research purposes only. No changes to production are being implemented at this time.
  • Working Paper

    The 2010 Census Confidentiality Protections Failed, Here's How and Why

    December 2023

    Working Paper Number: CES-23-63

    Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act.
  • Working Paper

    When and Why Does Nonresponse Occur? Comparing the Determinants of Initial Unit Nonresponse and Panel Attrition

    September 2023

    Authors: Tiffany S. Neman

    Working Paper Number: CES-23-44

    Though unit nonresponse threatens data quality in both cross-sectional and panel surveys, little is understood about how initial nonresponse and later panel attrition may be theoretically or empirically distinct phenomena. This study advances current knowledge of the determinants of both unit nonresponse and panel attrition within the context of the U.S. Census Bureau's Survey of Income and Program Participation (SIPP) panel survey, which I link with high-quality federal administrative records, paradata, and geographic data. By exploiting the SIPP's interpenetrated sampling design and relying on cross-classified random effects modeling, this study quantifies the relative effects of sample household, interviewer, and place characteristics on baseline nonresponse and later attrition, addressing a critical gap in the literature. Given the reliance on successful record linkages between survey sample households and federal administrative data in the nonresponse research, this study also undertakes an explicitly spatial analysis of the place-based characteristics associated with successful record linkages in the U.S.
  • Working Paper

    Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report

    April 2023

    Working Paper Number: CES-23-21

    This report develops a method using administrative records (AR) to fill in responses for nonresponding American Community Survey (ACS) housing units rather than adjusting survey weights to account for selection of a subset of nonresponding housing units for follow-up interviews and for nonresponse bias. The method also inserts AR and modeling in place of edits and imputations for ACS survey citizenship item nonresponses. We produce Citizen Voting-Age Population (CVAP) tabulations using this enhanced CVAP method and compare them to published estimates. The enhanced CVAP method produces a 0.74 percentage point lower citizen share, and it is 3.05 percentage points lower for voting-age Hispanics. The latter result can be partly explained by omissions of voting-age Hispanic noncitizens with unknown legal status from ACS household responses. Weight adjustments may be less effective at addressing nonresponse bias under those conditions.
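Several of the abstracts listed above (for example CES-24-27 and CES-24-02) describe reweighting records so that the linked or responding subset still represents the full file. As a rough illustration of that general idea only, the sketch below applies a simple weighting-class adjustment: within each cell, linked records are scaled up by the inverse of the cell's weighted linkage rate. This is not the gradient boosting approach used by twangRDC, nor the CPS production algorithm; the function name `linkage_weights`, the field names, and the toy records are all hypothetical.

```python
from collections import defaultdict


def linkage_weights(records, cell_keys, linked_key="linked", weight_key="weight"):
    """Weighting-class adjustment for non-linkage (illustrative only).

    Each linked record's base weight is multiplied by
    (total weight in its cell) / (linked weight in its cell).
    Unlinked records get weight 0. A cell with no linked records
    cannot be represented, so its weight is simply lost here.
    """
    total = defaultdict(float)   # weighted count of all records per cell
    linked = defaultdict(float)  # weighted count of linked records per cell

    for rec in records:
        cell = tuple(rec[k] for k in cell_keys)
        base = rec.get(weight_key, 1.0)  # assume weight 1.0 if none given
        total[cell] += base
        if rec[linked_key]:
            linked[cell] += base

    adjusted = []
    for rec in records:
        cell = tuple(rec[k] for k in cell_keys)
        base = rec.get(weight_key, 1.0)
        if rec[linked_key] and linked[cell] > 0:
            adjusted.append(base * total[cell] / linked[cell])
        else:
            adjusted.append(0.0)
    return adjusted


# Toy data (made up): children link less often than adults.
records = [
    {"age_group": "0-17", "linked": True},
    {"age_group": "0-17", "linked": False},
    {"age_group": "18+", "linked": True},
    {"age_group": "18+", "linked": True},
    {"age_group": "18+", "linked": True},
    {"age_group": "18+", "linked": False},
]

weights = linkage_weights(records, cell_keys=["age_group"])
```

In this toy file the one linked child stands in for two children, and each linked adult stands in for four-thirds of an adult, so the adjusted weights still sum to the six records in the full file. A model-based approach such as the one described in CES-24-27 replaces the coarse cells with estimated linkage propensities.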