Scholars deploy census-based measures of neighborhood context throughout the social sciences and epidemiology. Decades of research confirm that variation in how individuals are aggregated into geographic units to create variables that control for social, economic or political contexts can dramatically alter analyses. While most researchers are aware of the problem, they have lacked the tools to determine its magnitude in the literature and in their own projects. By using confidential access to the complete 2010 U.S. Decennial Census, we are able to construct, for all persons in the US, individual-specific contexts, which we group according to the Census-assigned block, block group, and tract. We compare these individual-specific measures to the published statistics at each scale, and we then determine the magnitude of variation in context for an individual with respect to the published measures using a simple statistic, the standard deviation of individual context (SDIC). For three key measures (percent Black, percent Hispanic, and Entropy, a measure of ethno-racial diversity), we find that block-level Census statistics frequently do not capture the actual context of individuals within them. More problematic, we uncover systematic spatial patterns in the contextual variables at all three scales. Finally, we show that within-unit variation is greater in some parts of the country than in others. We publish county-level estimates of the SDIC statistics that enable scholars to assess whether mis-specification in context variables is likely to alter analytic findings when measured at any of the three common Census units.
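As a rough illustration of the two quantities named in this abstract, the sketch below computes an entropy diversity score from group shares and a spread statistic for individual contexts around a published unit value. The abstract does not give the exact formulas, so both definitions here are assumptions: entropy is taken to be Shannon entropy with the natural log, and the spread is taken to be the root-mean-square deviation of individual-specific values from the published statistic. The function names are hypothetical.

```python
import math

def entropy(shares):
    """Shannon entropy (natural log) of group shares that sum to 1.

    Assumed definition; groups with a zero share contribute nothing.
    """
    return -sum(p * math.log(p) for p in shares if p > 0)

def sdic(individual_contexts, published_value):
    """Spread of individual contexts around a published unit statistic.

    Assumed definition: root-mean-square deviation of each person's
    individual-specific context from the value published for their unit.
    """
    n = len(individual_contexts)
    squared = sum((c - published_value) ** 2 for c in individual_contexts)
    return math.sqrt(squared / n)

# Two residents of the same block whose own surroundings are 20% and 40%
# Black, while the published block figure is 30%.
spread = sdic([0.2, 0.4], 0.3)
diversity = entropy([0.5, 0.5])
```

Under these assumptions, a spread near zero would mean the published figure describes every resident well, while a larger value would signal that the unit masks real differences in individual context.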
-
Improving Estimates of Neighborhood Change with Constant Tract Boundaries
May 2022
Working Paper Number: CES-22-16
Social scientists routinely rely on methods of interpolation to adjust available data to their research needs. This study calls attention to the potential for substantial error in efforts to harmonize data to constant boundaries using standard approaches to areal and population interpolation. We compare estimates from a standard source (the Longitudinal Tract Data Base) to true values calculated by re-aggregating original 2000 census microdata to 2010 tract areas. We then demonstrate an alternative approach that allows the re-aggregated values to be publicly disclosed, using 'differential privacy' (DP) methods to inject random noise to protect confidentiality of the raw data. The DP estimates are considerably more accurate than the interpolated estimates. We also examine conditions under which interpolation is more susceptible to error. This study reveals cause for greater caution in the use of interpolated estimates from any source. Until and unless DP estimates can be publicly disclosed for a wide range of variables and years, research on neighborhood change should routinely examine data for signs of estimation error that may be substantial in a large share of tracts that experienced complex boundary changes.
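The contrast this abstract draws can be sketched in a few lines: area-weighted interpolation spreads a source tract's count across target tracts in proportion to overlapping area, whereas a differentially private release adds random noise to the true re-aggregated count. The abstract does not say which DP mechanism was used, so the Laplace mechanism below is an assumption, as are the function names, the sensitivity of 1, and the toy tract labels.

```python
import random

def areal_interpolate(source_counts, overlap_share):
    """Area-weighted interpolation of counts to target tracts.

    overlap_share[s][t] is the fraction of source tract s's area that
    falls inside target tract t (fractions for each s sum to 1).
    """
    target = {}
    for s, count in source_counts.items():
        for t, share in overlap_share[s].items():
            target[t] = target.get(t, 0.0) + count * share
    return target

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(true_count, epsilon, rng, sensitivity=1.0):
    """Noisy count under the (assumed) Laplace mechanism."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)

# Hypothetical 2000 tracts A and B re-drawn into 2010 tracts X and Y.
estimates = areal_interpolate(
    {"A": 100, "B": 50},
    {"A": {"X": 0.6, "Y": 0.4}, "B": {"Y": 1.0}},
)
noisy = dp_count(90, epsilon=1.0, rng=random.Random(0))
```

Interpolation error depends on how unevenly people are spread within the source tract, while the noise added by a mechanism like this one has a known scale set by epsilon, which is one way to read the abstract's finding that the DP estimates are more accurate.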
-
Metropolitan Segregation: No Breakthrough in Sight
May 2022
Working Paper Number: CES-22-14
The 2020 Census offers new information on changes in residential segregation in metropolitan regions across the country as they continue to become more diverse. We take a long view, assessing trends since 1980 and extrapolating to the future. These new data mostly reinforce patterns that were observed a decade ago: high but slowly declining black-white segregation, and less intense but hardly changing segregation of Hispanics and Asians from whites. Enough time has passed since the civil rights era of the 1960s and 1970s to draw this conclusion: segregation will continue to divide Americans well into the 21st Century.
-
Associations Between Public Housing and Individual Earnings in New Orleans
October 2015
Working Paper Number: CES-15-32
This study uses a sample of the civilian labor force aged 16-64 constructed from the Decennial Census and American Community Survey, along with data from the HUD dataset Picture of Subsidized Households, to compare the likelihood of job earnings in relation to public housing developments in the New Orleans MSA before and after Hurricane Katrina. Results from a series of hierarchical linear models (HLM) indicate that significant relationships change between time periods, including those for public and mixed-income developments, suggesting a fluid relationship between neighborhoods and economic outcomes during physical, demographic and economic restructuring.
-
Location, Location, Location: The 3L Approach to House Price Determination
May 2004
Working Paper Number: CES-04-06
The immobility of houses means that their location affects their values. This explains the common belief that three things determine the price of a house: location, location, and location. We use this notion to develop the 3L Approach to house price determination. That is, prices are determined by the Metropolitan Statistical Area (MSA), town, and street where the house is located. This study creates a unique data set based on data from the American Housing Survey (AHS) consisting of small 'clusters' of housing units with information on their housing characteristics and resident characteristics that is merged with census tract-level attributes. We use this data to verify the 3L Approach: we find that all three levels of location are significant when estimating the house price hedonic equation. This indicates that individuals care about their local neighborhood, i.e. the general upkeep of their street and possibly their neighbors' characteristics (cluster variables), a broader area such as the school district and/or the town (tract variables) that account for school quality and crime rates, and the particular amenities found in their MSA.
-
Synthetic Data for Small Area Estimation in the American Community Survey
April 2013
Working Paper Number: CES-13-19
Small area estimates provide a critical source of information used to study local populations. Statistical agencies regularly collect data from small areas but are prevented from releasing detailed geographical identifiers in public-use data sets due to disclosure concerns. Alternative data dissemination methods used in practice include releasing summary/aggregate tables, suppressing detailed geographic information in public-use data sets, and accessing restricted data via Research Data Centers. This research examines an alternative method for disseminating microdata that contains more geographical details than are currently being released in public-use data files. Specifically, the method replaces the observed survey values with imputed, or synthetic, values simulated from a hierarchical Bayesian model. Confidentiality protection is enhanced because no actual values are released. The method is demonstrated using restricted data from the 2005-2009 American Community Survey. The analytic validity of the synthetic data is assessed by comparing small area estimates obtained from the synthetic data with those obtained from the observed data.
-
Peer Income Exposure Across the Income Distribution
February 2025
Working Paper Number: CES-25-16
Children from families across the income distribution attend public schools, making schools and classrooms potential sites for interaction between more- and less-affluent children. However, limited information exists regarding the extent of economic integration in these contexts. We merge educational administrative data from Oregon with measures of family income derived from IRS records to document student exposure to economically diverse school and classroom peers. Our findings indicate that affluent children in public schools are relatively isolated from their less affluent peers, while low- and middle-income students experience relatively even peer income distributions. Students from families in the top percentile of the income distribution attend schools where 20 percent of their peers, on average, come from the top five income percentiles. A large majority of the differences in peer exposure that we observe arise from the sorting of students across schools; sorting across classrooms within schools plays a substantially smaller role.
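The kind of exposure figure reported in this abstract can be sketched as follows: for each student in a focal income group, take the share of their schoolmates who fall in a target income group, then average over focal students. This is only an illustrative reading of the abstract, not the paper's estimator; the function name, the exclusion of the student from their own peer group, and the percentile cutoffs in the example are all assumptions.

```python
from collections import defaultdict

def mean_peer_exposure(students, is_focal, is_target):
    """Average share of schoolmates in the target group, over focal students.

    students: list of (school_id, income_percentile) pairs.
    is_focal / is_target: predicates on an income percentile.
    Each student is excluded from their own peer group (an assumption).
    """
    by_school = defaultdict(list)
    for school, pct in students:
        by_school[school].append(pct)

    shares = []
    for pcts in by_school.values():
        if len(pcts) < 2:
            continue  # a lone student has no peers to be exposed to
        n_target = sum(1 for p in pcts if is_target(p))
        for p in pcts:
            if is_focal(p):
                peers_in_target = n_target - (1 if is_target(p) else 0)
                shares.append(peers_in_target / (len(pcts) - 1))
    return sum(shares) / len(shares) if shares else float("nan")

# Toy data: exposure of top-percentile students to top-five-percentile peers.
toy = [("s1", 100), ("s1", 97), ("s1", 50),
       ("s2", 100), ("s2", 40), ("s2", 30)]
exposure = mean_peer_exposure(toy, lambda p: p >= 100, lambda p: p >= 96)
```

Grouping by classroom instead of school would give the classroom-level counterpart, and comparing the two is one simple way to see how much of the difference arises from sorting across schools rather than within them.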
-
Resident Perceptions of Crime: How Similar are They to Official Crime Rates?
March 2007
Working Paper Number: CES-07-10
This study compares the relationship between official crime rates and residents' perceptions of crime in census tracts. Employing a unique dataset that links household level data from the American Housing Survey metro samples over a period of 25 years (1976-2000) with official crime rate data for census tracts in selected cities during selected years, this large sample provides considerable ability to generalize the findings. I find that residents' perception of crime is most strongly related to official rates of tract violent crime. Models simultaneously taking into account both violent and property crime consistently found that property crime actually has a negative effect on perceived crime. Among types of violent crime, the robbery rate is consistently related to higher levels of perceived crime in the tract, whereas it appears a structural shift occurred in the mid-1980s in which aggravated assault and murder rates now impact perceptions of crime, even when taking into account the robbery rate.
-
Validating Abstract Representations of Spatial Population Data while considering Disclosure Avoidance
February 2020
Working Paper Number: CES-20-05
This paper furthers a research agenda for modeling populations along spatial networks and expands upon an empirical analysis to a full U.S. county (Gaboardi, 2019, Ch. 1,2). Specific foci are the necessity of, and methods for, validating and benchmarking spatial data when conducting social science research with aggregated and ambiguous population representations. In order to promote the validation of publicly available data, access to highly restricted census microdata was requested, and granted, in order to determine the levels of accuracy and error associated with a network-based population modeling framework. Primary findings reinforce the utility of a novel network allocation method, populated polygons to networks (pp2n), in terms of accuracy, computational complexity, and real runtime (Gaboardi, 2019, Ch. 2). Also, a pseudo-benchmark dataset's performance against the true census microdata shows promise in modeling populations along networks.
-
Structural versus Ethnic Dimensions of Housing Segregation
March 2016
Working Paper Number: CES-16-22
Racial residential segregation is still very high in many American cities. Some portion of segregation is attributable to socioeconomic differences across racial lines; some portion is caused by purely racial factors, such as preferences about the racial composition of one's neighborhood or discrimination in the housing market. Social scientists have had great difficulty disaggregating segregation into a portion that can be explained by interracial differences in socioeconomic characteristics (what we call structural factors) versus a portion attributable to racial and ethnic factors. What would such a measure look like? In this paper, we draw on a new source of data to develop an innovative structural segregation measure that shows the amount of segregation that would remain if we could assign households to housing units based only on non-racial socioeconomic characteristics. This inquiry provides vital building blocks for the broader enterprise of understanding and remedying housing segregation.
-
Neighborhood Racial Status and White Out-Mobility
March 2026
Working Paper Number: CES-26-19
Drawing on American Community Survey data, this study examines how whites' relative socioeconomic standing vis-à-vis nonwhite neighbors affects the association between minority presence and white out-mobility. Moving beyond the racial preferences versus racial proxy debate, we integrate group competition and contact theories with status theory to conceptualize 'racial status' as whites' first-order income rank relative to the subgroup status of Black, Hispanic, and Asian residents at the census tract level. Multilevel linear probability models show that whites lacking advantaged status are generally more likely to move. However, the positive association between Black or Asian concentration and white departure is weaker among status-disadvantaged whites, while the negative association with Hispanic concentration is stronger. These patterns lend greater support to contact theory than to group competition theory. By foregrounding relative status, the study demonstrates that racial and socioeconomic mechanisms are intertwined in shaping white residential mobility.