CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keywords(s): 'race census'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.
Click here to search again

Frequently Occurring Concepts within this Search

Viewing papers 1 through 10 of 13


  • Working Paper

    Estimating the Potential Impact of Combined Race and Ethnicity Reporting on Long-Term Earnings Statistics

    September 2024

    Working Paper Number:

    CES-24-48

    We use place of birth information from the Social Security Administration linked to earnings data from the Longitudinal Employer-Household Dynamics Program and detailed race and ethnicity data from the 2010 Census to study how long-term earnings differentials vary by place of birth for different self-identified race and ethnicity categories. We focus on foreign-born persons from countries that are heavily Hispanic and from countries in the Middle East and North Africa (MENA). We find substantial heterogeneity of long-term earnings differentials within country of birth, some of which will be difficult to detect when the reporting format changes from the current two-question version to the new single-question version because they depend on self-identifications that place the individual in two distinct categories within the single-question format, specifically, Hispanic and White or Black, and MENA and White or Black. We also study the USA-born children of these same immigrants. Long-term earnings differences for the 2nd generation also vary as a function of self-identified ethnicity and race in ways that changing to the single-question format could affect.
    View Full Paper PDF
  • Working Paper

    Revisiting Methods to Assign Responses when Race and Hispanic Origin Reporting are Discrepant Across Administrative Records and Third Party Sources

    May 2024

    Authors: James M. Noon

    Working Paper Number:

    CES-24-26

    The Best Race and Ethnicity Administrative Records Composite file ('Best Race file') is an composite file which combines Census, federal, and Third Party Data (TPD) sources and applies business rules to assign race and ethnicity values to person records. The first version of the Best Race administrative records composite was first constructed in 2015 and subsequently updated each year to include more recent vintages, when available, of the data sources originally included in the composite file. Where updates were available for data sources, the most recent information for persons was retained, and the business rules were reapplied to assign a single race and single Hispanic origin value to each person record. The majority of person records on the Best Race file have consistent race and ethnicity information across data sources. Where there are discrepancies in responses across data sources, we apply a series of business rules to assign a single race and ethnicity to each record. To improve the quality of the Best Race administrative records composite, we have begun revising the business rules which were developed several years ago. This paper discusses the original business rules as well as the implemented changes and their impact on the composite file.
    View Full Paper PDF
  • Working Paper

    Who are the people in my neighborhood? The 'contextual fallacy' of measuring individual context with census geographies

    February 2018

    Working Paper Number:

    CES-18-11

    Scholars deploy census-based measures of neighborhood context throughout the social sciences and epidemiology. Decades of research confirm that variation in how individuals are aggregated into geographic units to create variables that control for social, economic or political contexts can dramatically alter analyses. While most researchers are aware of the problem, they have lacked the tools to determine its magnitude in the literature and in their own projects. By using confidential access to the complete 2010 U.S. Decennial Census, we are able to construct'for all persons in the US'individual-specific contexts, which we group according to the Census-assigned block, block group, and tract. We compare these individual-specific measures to the published statistics at each scale, and we then determine the magnitude of variation in context for an individual with respect to the published measures using a simple statistic, the standard deviation of individual context (SDIC). For three key measures (percent Black, percent Hispanic, and Entropy'a measure of ethno-racial diversity), we find that block-level Census statistics frequently do not capture the actual context of individuals within them. More problematic, we uncover systematic spatial patterns in the contextual variables at all three scales. Finally, we show that within-unit variation is greater in some parts of the country than in others. We publish county-level estimates of the SDIC statistics that enable scholars to assess whether mis-specification in context variables is likely to alter analytic findings when measured at any of the three common Census units.
    View Full Paper PDF
  • Working Paper

    Decennial Census Return Rates: The Role of Social Capital

    January 2017

    Working Paper Number:

    CES-17-39

    This paper explores how useful information about social and civic engagement (social capital) might be to the U.S. Census Bureau in their efforts to improve predictions of mail return rates for the Decennial Census (DC) at the census tract level. Through construction of Hard-to-count (HRC) scores and multivariate analysis, we find that if information about social capital were available, predictions of response rates would be marginally improved.
    View Full Paper PDF
  • Working Paper

    Assessing Coverage and Quality of the 2007 Prototype Census Kidlink Database

    September 2015

    Working Paper Number:

    carra-2015-07

    The Census Bureau is conducting research to expand the use of administrative records data in censuses and surveys to decrease respondent burden and reduce costs while improving data quality. Much of this research (e.g., Rastogi and O''Hara (2012), Luque and Bhaskar (2014)) hinges on the ability to integrate multiple data sources by linking individuals across files. One of the Census Bureau's record linkage methodologies for data integration is the Person Identification Validation System or PVS. PVS assigns anonymous and unique IDs (Protected Identification Keys or PIKs) that serve as linkage keys across files. Prior research showed that integrating 'known associates' information into PVS's reference files could potentially enhance PVS's PIK assignment rates. The term 'known associates' refers to people that are likely to be associated with each other because of a known common link (such as family relationships or people sharing a common address), and thus, to be observed together in different files. One of the results from this prior research was the creation of the 2007 Census Kidlink file, a child-level file linking a child's Social Security Number (SSN) record to the SSN of those identified as the child's parents. In this paper, we examine to what extent the 2007 Census Kidlink methodology was able to link parents SSNs to children SSN records, and also evaluate the quality of those links. We find that in approximately 80 percent of cases, at least one parent was linked to the child's record. Younger children and noncitizens have a higher percentage of cases where neither parent could be linked to the child. Using 2007 tax data as a benchmark, our quality evaluation results indicate that in at least 90 percent of the cases, the parent-child link agreed with those found in the tax data. Based on our findings, we propose improvements to the 2007 Kidlink methodology to increase child-parent links, and discuss how the creation of the file could be operationalized moving forward.
    View Full Paper PDF
  • Working Paper

    Evaluating Race and Hispanic Origin Responses of Medicaid Participants Using Census Data

    April 2015

    Working Paper Number:

    carra-2015-01

    Health and health care disparities associated with race or Hispanic origin are complex and continue to challenge researchers and policy makers. With the intention of improving the measurement and monitoring of these disparities, provisions of the Patient Protection and Affordable Care Act (ACA) of 2010 require states to collect, report and analyze data on demographic characteristics of applicants and participants in Medicaid and other federally supported programs. By linking Medicaid records to 2010 Census, American Community Survey, and Census 2000, this new large-scale study examines and documents the extent to which pre-ACA Medicaid administrative records match self-reported race and Hispanic origin in Census data. Linked records allow comparisons between individuals with matching and non-matching race and Hispanic origin data across several demographic, socioeconomic and neighborhood characteristics, such as age, gender, language proficiency, education and Census tract variables. Identification of the groups most likely to have non-matching and missing race and Hispanic origin data in Medicaid relative to Census data can inform strategies to improve the quality of demographic data collected from Medicaid populations.
    View Full Paper PDF
  • Working Paper

    Exploring Administrative Records Use for Race and Hispanic Origin Item Non-Response

    December 2014

    Working Paper Number:

    carra-2014-16

    Race and Hispanic origin data are required to produce official statistics in the United States. Data collected through the American Community Survey and decennial census address missing data through traditional imputation methods, often relying on information from neighbors. These methods work well if neighbors share similar characteristics, however, the shape and patterns of neighborhoods in the United States are changing. Administrative records may provide more accurate data compared to traditional imputation methods for missing race and Hispanic origin responses. This paper first describes the characteristics of persons with missing demographic data, then assesses the coverage of administrative records data for respondents who do not answer race and Hispanic origin questions in Census data. The paper also discusses the distributional impact of using administrative records race and Hispanic origin data to complete missing responses in a decennial census or survey context.
    View Full Paper PDF
  • Working Paper

    Dynamics of Race: Joining, Leaving, and Staying in the American Indian/Alaska Native Race Category between 2000 and 2010

    August 2014

    Working Paper Number:

    carra-2014-10

    Each census for decades has seen the American Indian and Alaska Native population increase substantially more than expected. Changes in racial reporting seem to play an important role in the observed net increases, though research has been hampered by data limitations. We address previously unanswerable questions about race response change among American Indian and Alaska Natives (hereafter 'American Indians') using uniquely-suited (but not nationally representative) linked data from the 2000 and 2010 decennial censuses (N = 3.1 million) and the 2006-2010 American Community Survey (N = 188,131). To what extent do people change responses to include or exclude American Indian? How are people who change responses similar to or different from those who do not? How are people who join a group similar to or different from those who leave it? We find considerable race response change by people in our data, especially by multiple-race and/or Hispanic American Indians. This turnover is hidden in cross-sectional comparisons because people joining the group are similar in number and characteristics to those who leave the group. People in our data who changed their race response to add or drop American Indian differ from those who kept the same race response in 2000 and 2010 and from those who moved between a single-race and multiple-race American Indian response. Those who consistently reported American Indian (including those who added or dropped another race response) were relatively likely to report a tribe, live in an American Indian area, report American Indian ancestry, and live in the West. There are significant differences between those who joined and those who left a specific American Indian response group, but poor model fit indicates general similarity between joiners and leavers. Response changes should be considered when conceptualizing and operationalizing 'the American Indian and Alaska Native population.'
    View Full Paper PDF
  • Working Paper

    America's Churning Races: Race and Ethnic Response Changes between Census 2000 and the 2010 Census

    August 2014

    Working Paper Number:

    carra-2014-09

    Race and ethnicity responses can change over time and across contexts - a component of population change not usually taken into account. To what extent do race and/or Hispanic origin responses change? Is change more common to/from some race/ethnic groups than others? Does the propensity to change responses vary by characteristics of the individual? To what extent do these changes affect researchers? We use internal Census Bureau data from the 2000 and 2010 censuses in which individuals' responses have been linked across years. Approximately 9.8 million people (about 6 percent) in our large, non-representative linked data have a different race and/or Hispanic origin response in 2010 than they did in 2000. Several groups experienced considerable fluidity in racial identification: American Indians and Alaska Natives, Native Hawaiians and Other Pacific Islanders, and multiple-race response groups, as well as Hispanics when reporting a race. In contrast, race and ethnic responses for single-race non-Hispanic whites, blacks, and Asians were relatively consistent over the decade, as were ethnicity responses by Hispanics. People who change their race and/or Hispanic origin response(s) are doing so in a wide variety of ways, as anticipated by previous research. For example, people's responses change from multiple races to a single race, from a single race to multiple races, from one single race to another, and some people add or drop a Hispanic response. The inflow of people to each race/Hispanic group is in many cases similar in size to the outflow from the same group, such that cross-sectional data would show a small net change. We find response changes across ages, sexes, regions, and response modes, with variation across groups. Researchers should consider the implications of changing race and Hispanic origin responses when conducting analyses and interpreting results.
    View Full Paper PDF
  • Working Paper

    The Person Identification Validation System (PVS): Applying the Center for Administrative Records Research and Applications' (CARRA) Record Linkage Software

    July 2014

    Working Paper Number:

    carra-2014-01

    The Census Bureau's Person Identification Validation System (PVS) assigns unique person identifiers to federal, commercial, census, and survey data to facilitate linkages across and within files. PVS uses probabilistic matching to assign a unique Census Bureau identifier for each person. The PVS matches incoming files to reference files created with data from the Social Security Administration (SSA) Numerical Identification file, and SSA data with addresses obtained from federal files. This paper describes the PVS methodology from editing input data to creating the final file.
    View Full Paper PDF