CREAT: Census Research Exploration and Analysis Tool

Papers Containing Tag(s): '2010 Census'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

American Community Survey - 64

Protected Identification Key - 44

Internal Revenue Service - 37

Decennial Census - 34

Census Bureau Disclosure Review Board - 33

Current Population Survey - 33

Social Security Number - 31

Social Security Administration - 30

Center for Economic Studies - 23

Person Validation System - 23

Master Address File - 22

Social Security - 21

Department of Housing and Urban Development - 21

North American Industry Classification System - 20

Longitudinal Employer Household Dynamics - 20

Office of Management and Budget - 20

Bureau of Labor Statistics - 19

Some Other Race - 17

Disclosure Review Board - 16

Ordinary Least Squares - 16

Administrative Records - 16

1940 Census - 15

National Science Foundation - 15

Housing and Urban Development - 15

Person Identification Validation System - 14

Survey of Income and Program Participation - 13

Metropolitan Statistical Area - 13

Personally Identifiable Information - 13

Longitudinal Business Database - 12

Census 2000 - 12

Quarterly Census of Employment and Wages - 11

Federal Statistical Research Data Center - 11

SSA Numident - 11

Individual Taxpayer Identification Numbers - 11

Indian Health Service - 11

Service Annual Survey - 11

MAFID - 10

Chicago Census Research Data Center - 10

Supplemental Nutrition Assistance Program - 10

Bureau of Economic Analysis - 10

Census of Manufactures - 10

Center for Administrative Records Research and Applications - 10

Cornell University - 9

Census Edited File - 9

Economic Census - 9

National Bureau of Economic Research - 9

W-2 - 9

Employer Identification Numbers - 9

Computer Assisted Personal Interview - 9

Postal Service - 9

Research Data Center - 9

Business Register - 8

Medicaid Services - 8

Census Numident - 8

Census Bureau Business Register - 8

Temporary Assistance for Needy Families - 8

Census Household Composition Key - 8

Indian Housing Information Center - 8

Quarterly Workforce Indicators - 7

Census Bureau Person Identification Validation System - 7

American Housing Survey - 7

Unemployment Insurance - 7

Census Bureau Master Address File - 7

Consolidated Metropolitan Statistical Areas - 7

United States Census Bureau - 6

County Business Patterns - 6

MAF-ARF - 6

Centers for Medicare - 6

NUMIDENT - 6

Survey of Business Owners - 6

Annual Survey of Entrepreneurs - 6

LEHD Program - 5

Composite Person Record - 5

University of Chicago - 5

University of Maryland - 5

Opportunity Atlas - 5

Master Beneficiary Record - 5

Pew Research Center - 5

American Economic Association - 5

National Academy of Sciences - 5

Department of Justice - 5

Geographic Information Systems - 5

Special Sworn Status - 5

Federal Reserve Bank - 5

Annual Survey of Manufactures - 5

Standard Industrial Classification - 5

Office of Personnel Management - 4

Federal Poverty Level - 4

Disability Insurance - 4

Social Science Research Institute - 4

Supreme Court - 4

Computer Assisted Telephone Interviews and Computer Assisted Personal Interviews - 4

Adjusted Gross Income - 4

Public Use Micro Sample - 4

Citizenship and Immigration Services - 4

Survey of Consumer Finances - 4

National Opinion Research Center - 4

Center for Administrative Records Research - 4

Company Organization Survey - 3

Local Employment Dynamics - 3

Agriculture, Forestry - 3

Core Based Statistical Area - 3

United Nations - 3

Department of Economics - 3

National Center for Health Statistics - 3

University of California Los Angeles - 3

Department of Agriculture - 3

Sloan Foundation - 3

Department of Commerce - 3

General Education Development - 3

Data Management System - 3

Statistics Canada - 3

Generalized Method of Moments - 3

Customs and Border Protection - 3

Department of Homeland Security - 3

Cornell Institute for Social and Economic Research - 3

Federal Reserve System - 3

Integrated Public Use Microdata Series - 3

World Bank - 3

Alfred P Sloan Foundation - 3

Small Business Administration - 3

PIKed - 3

Total Factor Productivity - 3

Cobb-Douglas - 3

Current Employment Statistics - 3

Technical Services - 3

Harvard University - 3

CATI - 3

2SLS - 3

Census Bureau Center for Economic Studies - 3

University of Minnesota - 3

Minnesota Population Center - 3

American Economic Review - 3

Longitudinal Research Database - 3

Herfindahl-Hirschman - 3

population - 40

ethnicity - 31

hispanic - 31

respondent - 30

census data - 29

survey - 27

minority - 22

immigrant - 22

resident - 21

residence - 19

citizen - 19

ethnic - 19

census bureau - 18

data - 18

race - 18

racial - 17

census responses - 16

employed - 16

workforce - 15

neighborhood - 15

use census - 14

metropolitan - 14

immigration - 14

housing - 13

residential - 13

record - 13

white - 13

employ - 12

disparity - 12

2010 census - 12

data census - 12

migrant - 12

black - 12

statistical - 11

agency - 11

datasets - 11

segregation - 11

census records - 11

recession - 11

poverty - 10

census survey - 10

socioeconomic - 10

native - 10

labor - 10

assessed - 9

mexican - 9

latino - 9

imputation - 9

enrollment - 9

census use - 9

disadvantaged - 8

ancestry - 8

estimating - 8

migration - 8

employment data - 7

census employment - 7

census 2020 - 7

race census - 7

discrimination - 7

microdata - 7

percentile - 6

work census - 6

payroll - 6

employment statistics - 6

irs - 6

ssa - 6

medicaid - 6

expenditure - 6

census household - 6

citizenship - 6

matching - 6

analysis - 6

segregated - 6

employee - 6

records census - 6

worker - 6

production - 6

manufacturing - 6

econometric - 6

report - 5

research census - 5

provided census - 5

employed census - 5

rural - 5

geography - 5

suburb - 5

geographic - 5

intergenerational - 5

family - 5

sampling - 5

household surveys - 5

census linked - 5

heterogeneity - 5

1040 - 5

federal - 5

unemployed - 5

welfare - 5

state - 5

reside - 5

statistician - 5

enterprise - 5

entrepreneur - 5

job - 5

industrial - 5

census research - 5

censuses surveys - 4

survey households - 4

eligible - 4

coverage - 4

bias - 4

database - 4

asian - 4

indian - 4

disclosure - 4

estimator - 4

neighbor - 4

privacy - 4

residing - 4

department - 4

lending - 4

bank - 4

home - 4

venture - 4

entrepreneurship - 4

proprietor - 4

financial - 4

hiring - 4

endogeneity - 4

growth - 4

gdp - 4

proprietorship - 4

sale - 4

export - 4

wholesale - 4

economist - 4

sector - 4

commute - 4

demand - 4

census file - 4

clerical - 4

decade - 3

census disclosure - 3

urban - 3

urbanization - 3

city - 3

relocation - 3

rent - 3

renter - 3

generation - 3

adoption - 3

eligibility - 3

population survey - 3

propensity - 3

survey income - 3

earnings - 3

immigrated - 3

discriminatory - 3

linkage - 3

quarterly - 3

environmental - 3

borrower - 3

loan - 3

lender - 3

saving - 3

homeowner - 3

regression - 3

midwest - 3

finance - 3

capital - 3

hire - 3

assimilation - 3

geographically - 3

salary - 3

unemployment rates - 3

regress - 3

concentration - 3

manufacturer - 3

surveys censuses - 3

policy - 3

regional - 3

firms census - 3

tax - 3

revenue - 3

impact - 3

aggregate - 3

schooling - 3

country - 3

mobility - 3

occupation - 3

interracial - 3

layoff - 3

hurricane - 3

industrialized - 3

Viewing papers 11 through 20 of 99


  • Working Paper

    Revisiting Methods to Assign Responses when Race and Hispanic Origin Reporting are Discrepant Across Administrative Records and Third Party Sources

    May 2024

    Authors: James M. Noon

    Working Paper Number:

    CES-24-26

    The Best Race and Ethnicity Administrative Records Composite file ('Best Race file') is a composite file which combines Census, federal, and Third Party Data (TPD) sources and applies business rules to assign race and ethnicity values to person records. The first version of the Best Race administrative records composite was constructed in 2015 and subsequently updated each year to include more recent vintages, when available, of the data sources originally included in the composite file. Where updates were available for data sources, the most recent information for persons was retained, and the business rules were reapplied to assign a single race and single Hispanic origin value to each person record. The majority of person records on the Best Race file have consistent race and ethnicity information across data sources. Where there are discrepancies in responses across data sources, we apply a series of business rules to assign a single race and ethnicity to each record. To improve the quality of the Best Race administrative records composite, we have begun revising the business rules which were developed several years ago. This paper discusses the original business rules as well as the implemented changes and their impact on the composite file.
  • Working Paper

    Where Are Your Parents? Exploring Potential Bias in Administrative Records on Children

    March 2024

    Working Paper Number:

    CES-24-18

    This paper examines potential bias in the Census Household Composition Key's (CHCK) probabilistic parent-child linkages. By linking CHCK data to the American Community Survey (ACS), we reveal disparities in parent-child linkages among specific demographic groups and find that characteristics of children that can and cannot be linked to the CHCK vary considerably from the larger population. In particular, we find that children from low-income, less educated households and of Hispanic origin are less likely to be linked to a mother or a father in the CHCK. We also highlight some data considerations when using the CHCK.
  • Working Paper

    Examining Racial Identity Responses Among People with Middle Eastern and North African Ancestry in the American Community Survey

    March 2024

    Working Paper Number:

    CES-24-14

    People with Middle Eastern and North African (MENA) backgrounds living in the United States are defined and classified as White by current Federal standards for race and ethnicity, yet many MENA people do not identify as White in surveys, such as those conducted by the U.S. Census Bureau. Instead, they often select 'Some Other Race', if it is provided, and write in MENA responses such as Arab, Iranian, or Middle Eastern. In processing survey data for public release, the Census Bureau classifies these responses as White in accordance with Federal guidance set by the U.S. Office of Management and Budget. Research that uses these edited public data relies on limited information on MENA people's racial identification. To address this limitation, we obtained unedited race responses in the nationally representative American Community Survey from 2005-2019 to better understand how people of MENA ancestry report their race. We also use these data to compare the demographic, cultural, socioeconomic, and contextual characteristics of MENA individuals who identify as White versus those who do not identify as White. We find that one in four MENA people do not select White alone as their racial identity, despite official guidance that defines 'White' as people having origins in any of the original peoples of Europe, the Middle East, or North Africa. A variety of individual and contextual factors are associated with this choice, and some of these factors operate differently for U.S.-born and foreign-born MENA people living in the United States.
  • Working Paper

    Incorporating Administrative Data in Survey Weights for the Basic Monthly Current Population Survey

    January 2024

    Working Paper Number:

    CES-24-02

    Response rates to the Current Population Survey (CPS) have declined over time, raising the potential for nonresponse bias in key population statistics. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we take two approaches. First, we use administrative data to build a non-parametric nonresponse adjustment step while leaving the calibration to population estimates unchanged. Second, we use administratively linked data in the calibration process, matching income data from the Internal Revenue Service and state agencies, demographic data from the Social Security Administration and the decennial census, and industry data from the Census Bureau's Business Register to both responding and nonresponding households. We use the matched data in the household nonresponse adjustment of the CPS weighting algorithm, which changes the weights of respondents to account for differential nonresponse rates among subpopulations. After running the experimental weighting algorithm, we compare estimates of the unemployment rate and labor force participation rate between the experimental weights and the production weights. Before March 2020, estimates of the labor force participation rates using the experimental weights are 0.2 percentage points higher than the original estimates, with minimal effect on the unemployment rate. After March 2020, the new labor force participation rates are similar, but the unemployment rate is about 0.2 percentage points higher in some months during the height of COVID-related interviewing restrictions. These results suggest that if there is any nonresponse bias present in the CPS, the magnitude is comparable to the typical margin of error of the unemployment rate estimate. Additionally, the results are overall similar across demographic groups and states, as well as using alternative weighting methodology. Finally, we discuss how our estimates compare to those from earlier papers that calculate estimates of bias in key CPS labor force statistics. This paper is for research purposes only. No changes to production are being implemented at this time.
  • Working Paper

    A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census: Full Technical Report

    December 2023

    Working Paper Number:

    CES-23-63R

    For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level: individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy. Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act. You are reading the full technical report. For the summary paper see https://doi.org/10.1162/99608f92.4a1ebf70.
  • Working Paper

    The 2010 Census Confidentiality Protections Failed, Here's How and Why

    December 2023

    Working Paper Number:

    CES-23-63

    Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act.
  • Working Paper

    Where to Build Affordable Housing? Evaluating the Tradeoffs of Location

    December 2023

    Working Paper Number:

    CES-23-62R

    How does the location of affordable housing affect tenant welfare, the distribution of assistance, and broader societal objectives such as racial integration? Using administrative data on tenants of units funded by the Low-Income Housing Tax Credit (LIHTC), we first show that characteristics such as race and proxies for need vary widely across neighborhoods. Despite fixed eligibility requirements, LIHTC developments in more opportunity-rich neighborhoods house tenants who are higher income, more educated, and far less likely to be Black. To quantify the welfare implications, we build a residential choice model in which households choose from both market-rate and affordable housing options, where the latter must be rationed. While building affordable housing in higher-opportunity neighborhoods costs more, it also increases household welfare and reduces city-wide segregation. The gains in household welfare, however, accrue to more moderate-need, non-Black/Hispanic households at the expense of other households. This change in the distribution of assistance is primarily due to a 'crowding out' effect: households that only apply for assistance in higher-opportunity neighborhoods crowd out those willing to apply regardless of location. Finally, other policy levers, such as lowering the income limits used for means-testing, have only limited effects relative to the choice of location.
  • Working Paper

    Producing U.S. Population Statistics Using Multiple Administrative Sources

    November 2023

    Working Paper Number:

    CES-23-58

    We identify several challenges encountered when constructing U.S. administrative record-based (AR-based) population estimates for 2020. Though the AR estimates are higher than the 2020 Census at the national level, they are over 15 percent lower in 5 percent of counties, suggesting that locational accuracy can be improved. Other challenges include how to achieve comprehensive coverage, maintain consistent coverage across time, filter out nonresidents and people not alive on the reference date, uncover missing links across person and address records, and predict demographic characteristics when multiple ones are reported or when they are missing. We discuss several ways of addressing these issues, e.g., building in redundancy with more sources, linking children to their parents' addresses, and conducting additional record linkage for people without Social Security Numbers and for addresses not initially linked to the Census Bureau's Master Address File. We discuss modeling to predict lower levels of geography for people lacking those geocodes, the probability that a person is a U.S. resident on the reference date, the probability that an address is the person's residence on the reference date, and the probability a person is in each demographic characteristic category. Regression results illustrate how many of these challenges and solutions affect the AR county population estimates.
  • Working Paper

    An In-Depth Examination of Requirements for Disclosure Risk Assessment

    October 2023

    Working Paper Number:

    CES-23-49

    The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should be used to compare methodologies to identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near-term, the counterfactual approach appears best-suited for privacy-utility analysis.
  • Working Paper

    Noncitizen Coverage and Its Effects on U.S. Population Statistics

    August 2023

    Working Paper Number:

    CES-23-42

    We produce population estimates with the same reference date, April 1, 2020, as the 2020 Census of Population and Housing by combining 31 types of administrative record (AR) and third-party sources, including several new to the Census Bureau with a focus on noncitizens. Our AR census national population estimate is higher than other Census Bureau official estimates: 1.8% greater than the 2020 Demographic Analysis high estimate, 3.0% more than the 2020 Census count, and 3.6% higher than the vintage-2020 Population Estimates Program estimate. Our analysis suggests that inclusion of more noncitizens, especially those with unknown legal status, explains the higher AR census estimate. About 19.8% of AR census noncitizens have addresses that cannot be linked to an address in the 2020 Census collection universe, compared to 5.7% of citizens, raising the possibility that the 2020 Census did not collect data for a significant fraction of noncitizens residing in the United States under the residency criteria used for the census. We show differences in estimates by age, sex, Hispanic origin, geography, and socioeconomic characteristics symptomatic of the differences in noncitizen coverage.