CREAT: Census Research Exploration and Analysis Tool

Papers Containing Tag(s): 'United States Census Bureau'

Viewing papers 1 through 10 of 24


  • Working Paper

    Consequences of Eviction for Parenting and Non-parenting College Students

    June 2025

    Working Paper Number: CES-25-35

    Amidst rising and increasingly unaffordable rents, 7.6 million people are threatened with eviction each year across the United States, and eviction rates are twice as high for renters with children. One important and neglected population who may experience unique levels of housing insecurity is college students, especially given that one in five college students are parents. In this study, we link 11.9 million student records to eviction filings from housing courts, demographic characteristics reported in decennial census and survey data, incomes reported on tax returns by students and their parents, and dates of birth and death from the Social Security Administration. Parenting students are more likely than non-parenting students to identify as female (62.81% vs. 55.94%) and Black (19.66% vs. 14.30%), be over 30 years old (42.73% vs. 20.25%), and have parents with lower household incomes ($100,000 vs. $140,000). Parenting students threatened with eviction (i.e., had an eviction filed against them) are much more likely than non-threatened parenting students to identify as female (81.18% vs. 62.81%) and Black (56.84% vs. 19.66%). In models adjusted for individual and institutional characteristics, we find that being threatened with an eviction was significantly associated with reduced likelihood of degree completion, reduced post-enrollment income, reduced likelihood of being married post-enrollment, and increased post-enrollment mortality. Among parenting students, 38.38% (95% confidence interval (CI): 32.50-44.26%) of non-threatened students completed a bachelor's degree compared to just 15.36% (CI: 11.61-19.11%) of students threatened with eviction. Our findings highlight the long-term economic and health impacts of housing insecurity during college, especially for parenting students. Housing stability for parenting students may have substantial multigenerational benefits for economic mobility and population health.
  • Working Paper

    Food Security Status Across the Rural-Urban Continuum Before and During the COVID-19 Pandemic

    January 2025

    Working Paper Number: CES-25-01

    Background: Food security, defined as consistent access to sufficient food to support an active life, is a crucial social determinant of health. A key dimension affecting food security is position along the rural-urban continuum, as there are important socio-economic and environmental differences between communities related to urbanicity or rurality that impact food access. The COVID-19 pandemic created social and economic shocks that altered financial and food security, which may have had differential effects by rurality and urbanicity. However, there has been limited research on how food security differs across the shades of the rural-urban community spectrum, as most often researchers have characterized communities as either urban or rural. Methods: In this study, which linked restricted-use Current Population Survey Food Security Supplement data to census-tract level United States Department of Agriculture Rural-Urban Commuting Area codes, we estimated the prevalence of household food security across temporal (2015-2019 versus 2020-2021) and socio-spatial (urban, large rural city/town, small rural town, or isolated rural town/area) dimensions in order to characterize variations before and during the COVID-19 pandemic by urbanicity/rurality. We report prevalences as point estimates with 95% confidence intervals. Results: The prevalence of food security was 87.7% (87.5-88.0%) in 2015-2019 and 88.8% (88.4-89.3%) in 2020-2021 for urban areas, 85.5% (84.7-86.2%) in 2015-2019 and 87.1% (85.7-88.3%) in 2020-2021 for large rural towns/cities, 82.8% (81.5-84.1%) in 2015-2019 and 87.3% (85.7-89.2%) in 2020-2021 for small rural towns, and 87.6% (86.3-88.8%) in 2015-2019 and 90.9% (88.7-92.7%) in 2020-2021 for isolated rural towns/areas. Conclusion: These findings show that rural communities' experiences of food security vary and that aggregating households in these environments may mask areas of concern and concentrated need.
  • Working Paper

    Exploring New Ways to Classify Industries for Energy Analysis and Modeling

    November 2022

    Working Paper Number: CES-22-49

    Combustion, other emitting processes and fossil energy use outside the power sector have become urgent concerns given the United States' commitment to achieving net-zero greenhouse gas emissions by 2050. Industry is an important end user of energy and relies on fossil fuels used directly for process heating and as feedstocks for a diverse range of applications. Fuel and energy use by industry is heterogeneous, meaning even a single product group can vary broadly in its production routes and associated energy use. In the United States, the North American Industry Classification System (NAICS) serves as the standard for statistical data collection and reporting. In turn, data based on NAICS are the foundation of most United States energy modeling. Thus, the effectiveness of NAICS at representing energy use is a limiting condition for current expansive planning to improve energy efficiency and alternatives to fossil fuels in industry. Facility-level data could be used to build more detail into heterogeneous sectors and thus supplement data from Bureau of the Census and U.S. Energy Information Administration reporting at NAICS code levels, but such data are scarce. This work explores alternative classification schemes for industry based on energy use characteristics and validates an approach to estimate facility-level energy use from publicly available greenhouse gas emissions data from the U.S. Environmental Protection Agency (EPA). The approaches in this study can facilitate understanding of current, as well as possible future, energy demand. First, current approaches to the construction of industrial taxonomies are summarized along with their usefulness for industrial energy modeling. Unsupervised machine learning techniques are then used to detect clusters in data reported from the U.S. Department of Energy's Industrial Assessment Center program.
Clusters of Industrial Assessment Center data show similar levels of correlation between energy use and explanatory variables as three-digit NAICS codes. Interestingly, the clusters each include a large cross section of NAICS codes, which lends additional support to the idea that NAICS may not be particularly suited for correlation between energy use and the variables studied. Fewer clusters are needed for the same level of correlation as shown in NAICS codes. Initial assessment shows a reasonable level of separation using support vector machines with higher than 80% accuracy, so machine learning approaches may be promising for further analysis. The IAC data is focused on smaller and medium-sized facilities and is biased toward higher energy users for a given facility type. Cladistics, an approach for classification developed in biology, is adapted to energy and process characteristics of industries. Cladistics applied to industrial systems seeks to understand the progression of organizations and technology as a type of evolution, wherein traits are inherited from previous systems but evolve due to the emergence of inventions and variations and a selection process driven by adaptation to pressures and favorable outcomes. A cladogram is presented for evolutionary directions in the iron and steel sector. Cladograms are a promising tool for constructing scenarios and summarizing directions of sectoral innovation. The cladogram of iron and steel is based on the drivers of energy use in the sector. Phylogenetic inference is similar to machine learning approaches as it is based on a machine-led search of the solution space, therefore avoiding some of the subjectivity of other classification systems. Our prototype approach for constructing an industry cladogram is based on process characteristics according to the innovation framework derived from Schumpeter to capture evolution in a given sector. 
The resulting cladogram represents a snapshot in time based on detailed study of process characteristics. This work could be an important tool for the design of scenarios for more detailed modeling. Cladograms reveal groupings of emerging or dominant processes and their implications in a way that may be helpful for policymakers and entrepreneurs, allowing them to see the larger picture, other good ideas, or competitors. Constructing a cladogram could be a good first step to analysis of many industries (e.g., nitrogenous fertilizer production, ethyl alcohol manufacturing), to understand their heterogeneity, emerging trends, and coherent groupings of related innovations. Finally, validation is performed for facility-level energy estimates from the EPA Greenhouse Gas Reporting Program. Facility-level data availability continues to be a major challenge for industrial modeling. The method outlined by McMillan et al. (2016) and McMillan and Ruth (2019) allows estimation of facility-level energy use based on mandatory greenhouse gas reporting. The validation provided here is an important step for further use of this data for industrial energy modeling.
  • Working Paper

    Introducing the Medical Expenditure Panel Survey-Insurance Component with Administrative Records (MEPS-ICAR): Description, Data Construction Methodology, and Quality Assessment

    August 2022

    Working Paper Number: CES-22-29

    This report introduces a new dataset, the Medical Expenditure Panel Survey-Insurance Component with Administrative Records (MEPS-ICAR), consisting of MEPS-IC survey data on establishments and their health insurance benefits packages linked to Decennial Census data and administrative tax records on MEPS-IC establishments' workforces. These data include new measures of the characteristics of MEPS-IC establishments' parent firms, employee turnover, the full distribution of MEPS-IC workers' personal and family incomes, the geographic locations where those workers live, and improved workforce demographic detail. Next, this report details the methods used for producing the MEPS-ICAR. Broadly, the linking process begins by matching establishments' parent firms to their workforces using identifiers appearing in tax records. The linking process concludes by matching establishments to their own workforces by identifying the subset of their parent firm's workforce that best matches the expected size, total payroll, and residential geographic distribution of the establishment's workforce. Finally, this report presents statistics characterizing the match rate and the MEPS-ICAR data itself. Key results include that match rates are consistently high (exceeding 90%) across nearly all data subgroups and that the matched data exhibit a reasonable distribution of employment, payroll, and worker commute distances relative to expectations and external benchmarks. Notably, employment measures derived from tax records, but not used in the match itself, correspond with high fidelity to the employment levels that establishments report in the MEPS-IC. Cumulatively, the construction of the MEPS-ICAR significantly expands the capabilities of the MEPS-IC and presents many opportunities for analysts.
  • Working Paper

    Climate Change, The Food Problem, and the Challenge of Adaptation through Sectoral Reallocation

    September 2021

    Authors: Ishan Nath

    Working Paper Number: CES-21-29

    This paper combines local temperature treatment effects with a quantitative macroeconomic model to assess the potential for global reallocation between agricultural and non-agricultural production to reduce the costs of climate change. First, I use firm-level panel data from a wide range of countries to show that extreme heat reduces productivity less in manufacturing and services than in agriculture, implying that hot countries could achieve large potential gains through adapting to global warming by shifting labor toward manufacturing and increasing imports of food. To investigate the likelihood that such gains will be realized, I embed the estimated productivity effects in a model of sectoral specialization and trade covering 158 countries. Simulations suggest that climate change does little to alter the geography of agricultural production, however, as high trade barriers in developing countries temper the influence of shifting comparative advantage. Instead, climate change accentuates the existing pattern, known as 'the food problem,' in which poor countries specialize heavily in relatively low productivity agricultural sectors to meet subsistence consumer needs. The productivity effects of climate change reduce welfare by 6-10% for the poorest quartile of the world with trade barriers held at current levels, but by nearly 70% less in an alternative policy counterfactual that moves low-income countries to OECD levels of trade openness.
  • Working Paper

    Whose Job Is It Anyway? Co-Ethnic Hiring in New U.S. Ventures

    March 2021

    Working Paper Number: CES-21-05

    We explore co-ethnic hiring among new ventures using U.S. administrative data. Co-ethnic hiring is ubiquitous among immigrant groups, averaging about 22.5% and ranging from 2% to 40%. Co-ethnic hiring grows with the size of the local ethnic workforce, greater linguistic distance to English, lower cultural/genetic similarity to U.S. natives, and in harsher policy environments for immigrants. Co-ethnic hiring is remarkably persistent for ventures and for individuals. Co-ethnic hiring is associated with greater venture survival and growth when thick local ethnic employment surrounds the business. Our results are consistent with a blend of hiring due to information advantages within ethnic groups with some taste-based hiring.
  • Working Paper

    The Modern Wholesaler: Global Sourcing, Domestic Distribution, and Scale Economies

    December 2018

    Authors: Sharat Ganapati

    Working Paper Number: CES-18-49

    Nearly half of all transactions in the $6 trillion market for manufactured goods in the United States were intermediated by wholesalers in 2012, up from 32 percent in 1992. Seventy percent of this increase is due to the growth of 'superstar' firms - the largest one percent of wholesalers. Structural estimates based on detailed administrative data show that the rise of the largest wholesalers was driven by an intuitive linkage between their sourcing of goods from abroad and an expansion of their domestic distribution network to reach more buyers. Both elements require scale economies and lead to increased wholesaler market shares and markups. Counterfactual analysis shows that despite increases in wholesaler market power, intermediated international trade has two benefits for buyers: directly through buyers' valuation of globally sourced products, and indirectly through the passed-through benefits of wholesaler economies of scale and increased quality.
  • Working Paper

    Making a Motivated Manager: A Census Data Investigation into Efficiency Differences Between Franchisee and Franchisor-Owned Restaurants

    January 2016

    Working Paper Number: CES-16-54

    While there has been significant research on the reasons for franchising, little work has examined the effects of franchising on establishment performance. This paper attempts to fill that gap. We use restricted-access US Census Bureau microdata from the 2007 Census of Retail Trade to examine establishment-level productivity of franchisee- and franchisor-owned restaurants. We do this by employing a two-stage data envelopment analysis (DEA) model where the first stage uses DEA to measure each establishment's efficiency. The DEA efficiency score is then used as the second-stage dependent variable. The results show a strong and robust effect attributed to franchisee ownership for full-service restaurants, but a smaller and insignificant difference for limited-service restaurants. We believe the differences in task programmability between limited- and full-service restaurants result in a very different role for managers/franchisees and are the driving factor behind the different results.
  • Working Paper

    Estimation and Inference in Regression Discontinuity Designs with Clustered Sampling

    August 2015

    Working Paper Number: carra-2015-06

    Regression Discontinuity (RD) designs have become popular in empirical studies due to their attractive properties for estimating causal effects under transparent assumptions. Nonetheless, most popular procedures assume i.i.d. data, which is not reasonable in many common applications. To relax this assumption, we derive the properties of traditional non-parametric estimators in a setting that incorporates potential clustering at the level of the running variable, and propose an accompanying optimal-MSE bandwidth selection rule. Simulation results demonstrate that falsely assuming data are i.i.d. when selecting the bandwidth may lead to the choice of bandwidths that are too small relative to the optimal-MSE bandwidth. Last, we apply our procedure using person-level microdata that exhibits clustering at the census tract level to analyze the impact of the Low-Income Housing Tax Credit program on neighborhood characteristics and low-income housing supply.
  • Working Paper

    An outside view: What do observers say about others' races and Hispanic origins?

    August 2015

    Working Paper Number: carra-2015-05

    Outsiders' views of a person's race or Hispanic origin can impact how she sees herself, how she reports her race and Hispanic origins, and her social and economic experiences. The way outsiders describe non-strangers in terms of their race and Hispanic origin may reveal popular assumptions about which race/Hispanic categories are salient for Americans, which kinds of people are seen as multiracial, and the types of cues people use when identifying another person's race. We study patterns of observer identification using a unique, large, linked data source with two measures of a person's race and Hispanic origin. One measure (from Census 2000 or the 2010 Census) was provided by a household respondent and the other (from the other census year) was provided by a census proxy reporter (e.g., a neighbor) who responded on behalf of a non-responsive household. We ask: Does an outsider's report of a person's race and Hispanic origin match a household report? We find that in about 90% of our 3.7 million (nonrepresentative) cases, proxy reports of a person's race and Hispanic origin match responses given by the household in a different census year. Match rates are high for the largest groups: non-Hispanic whites, blacks, and Asians and for Hispanics, though proxies are not very able to replicate the race responses of Hispanics. Matches are much less common for people in smaller groups (American Indian/Alaska Native, Pacific Islander, Some Other Race, and multiracial). We also ask: What predicts a matched response and what predicts a particular unmatched response? We find evidence of the persistence of hypodescent for blacks and hyperdescent for American Indians. Biracial Asian-whites and Pacific Islander-whites are more often seen by others as non-Hispanic white than as people of color. Proxy reporters tend to identify children as multiple race and elders as single race, whether they are or not. 
The race/Hispanic composition of the tract is more powerfully predictive of a particular unmatched response than are tract-level measures of socioeconomic status; unmatched responses are often consistent with the race/Hispanic characteristics of the neighborhood.