CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'census data'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

American Community Survey - 43

Internal Revenue Service - 35

Protected Identification Key - 35

Current Population Survey - 31

Social Security Number - 30

Social Security Administration - 30

2010 Census - 29

Decennial Census - 27

Center for Economic Studies - 25

Census Bureau Disclosure Review Board - 23

Longitudinal Employer Household Dynamics - 20

Person Validation System - 20

Master Address File - 20

Bureau of Labor Statistics - 19

Disclosure Review Board - 19

Research Data Center - 19

Service Annual Survey - 18

Survey of Income and Program Participation - 17

Social Security - 16

North American Industry Classification System - 15

Business Register - 15

Cornell University - 15

National Science Foundation - 15

Employer Identification Numbers - 14

Federal Statistical Research Data Center - 13

Personally Identifiable Information - 13

Standard Statistical Establishment List - 13

Standard Industrial Classification - 13

1940 Census - 12

Economic Census - 12

Housing and Urban Development - 12

Person Identification Validation System - 12

Administrative Records - 12

Longitudinal Business Database - 11

MAFID - 11

Metropolitan Statistical Area - 11

Office of Management and Budget - 10

Some Other Race - 10

SSA Numident - 10

American Housing Survey - 10

Department of Housing and Urban Development - 9

Supplemental Nutrition Assistance Program - 9

Census Numident - 9

Individual Taxpayer Identification Numbers - 9

Alfred P Sloan Foundation - 9

Federal Tax Information - 9

National Opinion Research Center - 8

Quarterly Workforce Indicators - 8

Quarterly Census of Employment and Wages - 8

Census Bureau Business Register - 8

Indian Health Service - 8

Annual Survey of Manufactures - 8

Census Edited File - 7

County Business Patterns - 7

Medicaid Services - 7

Computer Assisted Personal Interview - 7

American Economic Association - 7

Ordinary Least Squares - 7

DOB - 7

Unemployment Insurance - 7

Census Bureau Person Identification Validation System - 6

Core Based Statistical Area - 6

Cornell Institute for Social and Economic Research - 6

Business Dynamics Statistics - 6

Bureau of Economic Analysis - 6

Center for Administrative Records Research and Applications - 6

Securities and Exchange Commission - 5

LEHD Program - 5

Employment History File - 5

Employer Characteristics File - 5

Individual Characteristics File - 5

Local Employment Dynamics - 5

Centers for Medicare - 5

Data Management System - 5

Temporary Assistance for Needy Families - 5

Indian Housing Information Center - 5

Statistics Canada - 5

Special Sworn Status - 5

Business Employment Dynamics - 5

PIKed - 5

Census 2000 - 5

Business Master File - 5

Business Register Bridge - 5

American Statistical Association - 5

Federal Reserve Bank - 5

Financial, Insurance and Real Estate Industries - 5

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 4

Health and Retirement Study - 4

Company Organization Survey - 4

CDF - 4

Composite Person Record - 4

MAF-ARF - 4

Cumulative Density Function - 4

W-2 - 4

Social Science Research Institute - 4

National Longitudinal Survey of Youth - 4

Postal Service - 4

Department of Homeland Security - 4

Urban Institute - 4

University of Maryland - 4

Bureau of Labor - 4

Sloan Foundation - 4

Successor Predecessor File - 4

National Institute on Aging - 4

National Center for Health Statistics - 4

National Bureau of Economic Research - 4

Establishment Micro Properties - 4

Agency for Healthcare Research and Quality - 4

Office of Personnel Management - 3

Department of Agriculture - 3

Census Bureau Master Address File - 3

Adjusted Gross Income - 3

Master Beneficiary Record - 3

Disability Insurance - 3

Census Household Composition Key - 3

General Education Development - 3

New England County Metropolitan - 3

Public Use Micro Sample - 3

CATI - 3

Department of Justice - 3

Citizenship and Immigration Services - 3

Yale University - 3

Department of Health and Human Services - 3

National Institutes of Health - 3

Geographic Information Systems - 3

Small Business Administration - 3

Longitudinal Research Database - 3

Harvard University - 3

Journal of Labor Economics - 3

North American Industry Classi - 3

Chicago Census Research Data Center - 3

Census of Manufactures - 3

Economic Research Service - 3

Minnesota Population Center - 3

Organization for Economic Cooperation and Development - 3

Department of Labor - 3

General Accounting Office - 3

Permanent Plant Number - 3

Medical Expenditure Panel Survey - 3

survey - 44

population - 38

census bureau - 38

data census - 33

respondent - 32

data - 26

use census - 22

resident - 20

record - 18

agency - 18

statistical - 17

ethnicity - 15

census survey - 15

datasets - 15

microdata - 15

citizen - 15

census research - 15

hispanic - 14

report - 14

economic census - 14

research census - 13

residence - 13

residential - 12

housing - 12

census use - 12

information census - 11

minority - 11

disparity - 11

neighborhood - 11

database - 10

payroll - 10

estimating - 10

census records - 10

assessed - 9

workforce - 9

2010 census - 9

census linked - 9

matching - 9

disadvantaged - 9

employee - 8

census responses - 8

employed - 8

irs - 8

poverty - 8

metropolitan - 8

census years - 8

records census - 8

immigrant - 8

socioeconomic - 8

longitudinal - 8

sampling - 7

work census - 7

employ - 7

censuses surveys - 7

linked census - 7

imputation - 7

census file - 7

enterprise - 6

disclosure - 6

identifier - 6

percentile - 6

census employment - 6

provided census - 6

household surveys - 6

coverage - 6

race - 6

linkage - 6

racial - 6

enrollment - 6

race census - 6

employer household - 6

ethnic - 6

migration - 6

expenditure - 6

employment data - 5

employment statistics - 5

employee data - 5

medicaid - 5

prevalence - 5

urban - 5

geographic - 5

federal - 5

family - 5

immigration - 5

confidentiality - 5

rural - 5

statistician - 5

longitudinal employer - 5

ancestry - 5

migrant - 5

census business - 5

labor - 5

aging - 5

assessing - 4

decade - 4

ssa - 4

survey households - 4

urbanization - 4

district - 4

native - 4

bias - 4

census household - 4

tax - 4

unemployed - 4

survey income - 4

analysis - 4

information - 4

state - 4

privacy - 4

individuals census - 4

quarterly - 4

recession - 4

researcher - 4

research - 4

employment dynamics - 4

revenue - 4

aggregate - 4

matched - 4

white - 4

average - 3

trend - 3

sample - 3

incorporated - 3

department - 3

census disclosure - 3

census 2020 - 3

eligible - 3

population survey - 3

country - 3

city - 3

geography - 3

urbanized - 3

impact - 3

environmental - 3

amenity - 3

intergenerational - 3

grandparent - 3

black - 3

estimator - 3

citizenship - 3

1040 - 3

segregation - 3

child - 3

yearly - 3

business data - 3

businesses census - 3

geographically - 3

suburb - 3

community - 3

workplace - 3

worker - 3

clerical - 3

surveys censuses - 3

residing - 3

firms census - 3

study - 3

demography - 3

migrating - 3

suburbanization - 3

associate - 3

econometric - 3

Viewing papers 11 through 20 of 75


  • Working Paper

    Where Are Your Parents? Exploring Potential Bias in Administrative Records on Children

    March 2024

    Working Paper Number:

    CES-24-18

    This paper examines potential bias in the Census Household Composition Key's (CHCK) probabilistic parent-child linkages. By linking CHCK data to the American Community Survey (ACS), we reveal disparities in parent-child linkages among specific demographic groups and find that characteristics of children that can and cannot be linked to the CHCK vary considerably from the larger population. In particular, we find that children from low-income, less educated households and of Hispanic origin are less likely to be linked to a mother or a father in the CHCK. We also highlight some data considerations when using the CHCK.
    View Full Paper PDF
  • Working Paper

    The Changing Nature of Pollution, Income, and Environmental Inequality in the United States

    January 2024

    Working Paper Number:

    CES-24-04

    This paper uses administrative tax records linked to Census demographic data and high-resolution measures of fine particulate matter (PM2.5) exposure to study the evolution of the Black-White pollution exposure gap over the past 40 years. In doing so, we focus on the various ways in which income may have contributed to these changes using a statistical decomposition. We decompose the overall change in the Black-White PM2.5 exposure gap into (1) components that stem from rank-preserving compression in the overall pollution distribution and (2) changes that stem from a reordering of Black and White households within the pollution distribution. We find a significant narrowing of the Black-White PM2.5 exposure gap over this time period that is overwhelmingly driven by rank-preserving changes rather than positional changes. However, the relative positions of Black and White households at the upper end of the pollution distribution have meaningfully shifted in the most recent years.
    View Full Paper PDF
  • Working Paper

    Incorporating Administrative Data in Survey Weights for the Basic Monthly Current Population Survey

    January 2024

    Working Paper Number:

    CES-24-02

    Response rates to the Current Population Survey (CPS) have declined over time, raising the potential for nonresponse bias in key population statistics. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we take two approaches. First, we use administrative data to build a non-parametric nonresponse adjustment step while leaving the calibration to population estimates unchanged. Second, we use administratively linked data in the calibration process, matching income data from the Internal Revenue Service and state agencies, demographic data from the Social Security Administration and the decennial census, and industry data from the Census Bureau's Business Register to both responding and nonresponding households. We use the matched data in the household nonresponse adjustment of the CPS weighting algorithm, which changes the weights of respondents to account for differential nonresponse rates among subpopulations. After running the experimental weighting algorithm, we compare estimates of the unemployment rate and labor force participation rate between the experimental weights and the production weights. Before March 2020, estimates of the labor force participation rates using the experimental weights are 0.2 percentage points higher than the original estimates, with minimal effect on the unemployment rate. After March 2020, the new labor force participation rates are similar, but the unemployment rate is about 0.2 percentage points higher in some months during the height of COVID-related interviewing restrictions. These results suggest that if there is any nonresponse bias present in the CPS, the magnitude is comparable to the typical margin of error of the unemployment rate estimate. Additionally, the results are overall similar across demographic groups and states, as well as when using alternative weighting methodology. Finally, we discuss how our estimates compare to those from earlier papers that calculate estimates of bias in key CPS labor force statistics. This paper is for research purposes only. No changes to production are being implemented at this time.
    View Full Paper PDF
  • Working Paper

    A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census: Full Technical Report

    December 2023

    Working Paper Number:

    CES-23-63R

    For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level, individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy. Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act. You are reading the full technical report. For the summary paper see https://doi.org/10.1162/99608f92.4a1ebf70.
    View Full Paper PDF
  • Working Paper

    The 2010 Census Confidentiality Protections Failed, Here's How and Why

    December 2023

    Working Paper Number:

    CES-23-63

    Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act.
    View Full Paper PDF
  • Working Paper

    When and Why Does Nonresponse Occur? Comparing the Determinants of Initial Unit Nonresponse and Panel Attrition

    September 2023

    Authors: Tiffany S. Neman

    Working Paper Number:

    CES-23-44

    Though unit nonresponse threatens data quality in both cross-sectional and panel surveys, little is understood about how initial nonresponse and later panel attrition may be theoretically or empirically distinct phenomena. This study advances current knowledge of the determinants of both unit nonresponse and panel attrition within the context of the U.S. Census Bureau's Survey of Income and Program Participation (SIPP) panel survey, which I link with high-quality federal administrative records, paradata, and geographic data. By exploiting the SIPP's interpenetrated sampling design and relying on cross-classified random effects modeling, this study quantifies the relative effects of sample household, interviewer, and place characteristics on baseline nonresponse and later attrition, addressing a critical gap in the literature. Given the reliance on successful record linkages between survey sample households and federal administrative data in the nonresponse research, this study also undertakes an explicitly spatial analysis of the place-based characteristics associated with successful record linkages in the U.S.
    View Full Paper PDF
  • Working Paper

    Estimating the U.S. Citizen Voting-Age Population (CVAP) Using Blended Survey Data, Administrative Record Data, and Modeling: Technical Report

    April 2023

    Working Paper Number:

    CES-23-21

    This report develops a method using administrative records (AR) to fill in responses for nonresponding American Community Survey (ACS) housing units rather than adjusting survey weights to account for selection of a subset of nonresponding housing units for follow-up interviews and for nonresponse bias. The method also inserts AR and modeling in place of edits and imputations for ACS survey citizenship item nonresponses. We produce Citizen Voting-Age Population (CVAP) tabulations using this enhanced CVAP method and compare them to published estimates. The enhanced CVAP method produces a 0.74 percentage point lower citizen share, and it is 3.05 percentage points lower for voting-age Hispanics. The latter result can be partly explained by omissions of voting-age Hispanic noncitizens with unknown legal status from ACS household responses. Weight adjustments may be less effective at addressing nonresponse bias under those conditions.
    View Full Paper PDF
  • Working Paper

    Building the Prototype Census Environmental Impacts Frame

    April 2023

    Working Paper Number:

    CES-23-20

    The natural environment is central to all aspects of life, but efforts to quantify its influence have been hindered by data availability and measurement constraints. To mitigate some of these challenges, we introduce a new prototype of a microdata infrastructure: the Census Environmental Impacts Frame (EIF). The EIF provides detailed individual-level information on demographics, economic characteristics, and address-level histories (linked to spatially and temporally resolved estimates of environmental conditions for each individual) for almost every resident in the United States over the past two decades. This linked microdata infrastructure provides a unique platform for advancing our understanding about the distribution of environmental amenities and hazards, when, how, and why exposures have evolved over time, and the consequences of environmental inequality and changing environmental conditions. We describe the construction of the EIF, explore issues of coverage and data quality, document patterns and trends in individual exposure to two correlated but distinct air pollutants as an application of the EIF, and discuss implications and opportunities for future research.
    View Full Paper PDF
  • Working Paper

    The Long-run Effects of the 1930s Redlining Maps on Children

    December 2022

    Working Paper Number:

    CES-22-56

    We estimate the long-run effects of the 1930s Home Owners Loan Corporation (HOLC) redlining maps by linking children in the full count 1940 Census to 1) the universe of IRS tax data in 1974 and 1979 and 2) the long form 2000 Census. We use two identification strategies to estimate the potential long-run effects of differential access to credit along HOLC boundaries. The first strategy compares cross-boundary differences along HOLC boundaries to a comparison group of boundaries that had statistically similar pre-existing differences as the actual boundaries. A second approach only uses boundaries that were least likely to have been chosen by the HOLC based on our statistical model. We find that children living on the lower-graded side of HOLC boundaries had significantly lower levels of educational attainment, reduced income in adulthood, and lived in neighborhoods during adulthood characterized by lower educational attainment, higher poverty rates, and higher rates of single-headed households.
    View Full Paper PDF
  • Working Paper

    Using Small-Area Estimation (SAE) to Estimate Prevalence of Child Health Outcomes at the Census Regional-, State-, and County-Levels

    November 2022

    Working Paper Number:

    CES-22-48

    In this study, we implement small-area estimation to assess the prevalence of child health outcomes at the county, state, and regional levels, using national survey data.
    View Full Paper PDF