In this study, we implement small-area estimation to assess the prevalence of child health outcomes at the county, state, and regional levels, using national survey data.
-
SYNTHETIC DATA FOR SMALL AREA ESTIMATION IN THE AMERICAN COMMUNITY SURVEY
April 2013
Working Paper Number:
CES-13-19
Small area estimates provide a critical source of information used to study local populations. Statistical agencies regularly collect data from small areas but are prevented from releasing detailed geographical identifiers in public-use data sets due to disclosure concerns. Alternative data dissemination methods used in practice include releasing summary/aggregate tables, suppressing detailed geographic information in public-use data sets, and accessing restricted data via Research Data Centers. This research examines an alternative method for disseminating microdata that contains more geographical details than are currently being released in public-use data files. Specifically, the method replaces the observed survey values with imputed, or synthetic, values simulated from a hierarchical Bayesian model. Confidentiality protection is enhanced because no actual values are released. The method is demonstrated using restricted data from the 2005-2009 American Community Survey. The analytic validity of the synthetic data is assessed by comparing small area estimates obtained from the synthetic data with those obtained from the observed data.
-
Connected and Uncooperative: The Effects of Homogenous and Exclusive Social Networks on Survey Response Rates and Nonresponse Bias
January 2024
Working Paper Number:
CES-24-01
Social capital, the strength of people's friendship networks and community ties, has been hypothesized as an important determinant of survey participation. Investigating this hypothesis has been difficult given data constraints. In this paper, we provide insights by investigating how response rates and nonresponse bias in the American Community Survey are correlated with county-level social network data from Facebook. We find that areas of the United States where people have more exclusive and homogenous social networks have higher nonresponse bias and lower response rates. These results provide further evidence that the effects of social capital may not be simply a matter of whether people are socially isolated or not, but also what types of social connections people have and the sociodemographic heterogeneity of their social networks.
-
Disconnected Geography: A Spatial Analysis of Disconnected Youth in the United States
January 2016
Working Paper Number:
CES-16-37
Since the Great Recession, US policy and advocacy groups have sought to better understand its effect on a group of especially vulnerable young adults who are not enrolled in school or training programs and not participating in the labor market, so-called 'disconnected youth.' This article distinguishes between disconnected youth and unemployed youth and examines the spatial clustering of these two groups across counties in the US. The focus is to ascertain whether there are differences in underlying contextual factors among groups of counties that are mutually exclusive and spatially disparate (non-adjacent), comprising two types of spatial clusters: high rates of disconnected youth and high rates of unemployed youth. Using restricted, household-level census data inside the Census Research Data Center (RDC) under special permission by the US Census Bureau, we were able to define these two groups using detailed household questionnaires that are not available to researchers outside the RDC. The geospatial patterns in the two types of clusters suggest that places with high concentrations of disconnected youth are distinctly different in terms of underlying characteristics from places with high concentrations of unemployed youth. These differences include, among other things, arrests for synthetic drug production, enclaves of poor in rural areas, persistent poverty in areas, educational attainment in the populace, children in poverty, persons without health insurance, the social capital index, and elders who receive disability benefits. This article provides some preliminary evidence regarding the social forces underlying the two types of observed geospatial clusters and discusses how they differ.
-
Gradient Boosting to Address Statistical Problems Arising from Non-Linkage of Census Bureau Datasets
June 2024
Working Paper Number:
CES-24-27
This article introduces the twangRDC package, which contains functions to address non-linkage in US Census Bureau datasets. The Census Bureau's Person Identification Validation System facilitates data linkage by assigning unique person identifiers to federal, third-party, decennial census, and survey data. Not all records in these datasets can be linked to the reference file, and as such not all records will be assigned an identifier. This article is a tutorial for using twangRDC to generate nonresponse weights to account for non-linkage of person records across US Census Bureau datasets.
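The weighting idea in this abstract can be illustrated with a minimal sketch. It is not the twangRDC API, and it deliberately swaps gradient boosting for a much simpler cell-based estimate of the linkage probability. The function name `linkage_weights`, the record fields `age_group` and `linked`, and the toy data are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def linkage_weights(records, cell_key, linked_key="linked"):
    """Inverse-probability-of-linkage weights, estimated within cells.

    Linked records get 1 / (share of records linked in their cell);
    unlinked records get weight 0. This is a simplified stand-in for a
    model-based (e.g. gradient-boosted) propensity estimate.
    """
    totals = defaultdict(int)
    linked = defaultdict(int)
    for rec in records:
        totals[rec[cell_key]] += 1
        if rec[linked_key]:
            linked[rec[cell_key]] += 1

    weights = []
    for rec in records:
        if rec[linked_key]:
            # The rate is never zero here: this record itself is linked.
            rate = linked[rec[cell_key]] / totals[rec[cell_key]]
            weights.append(1.0 / rate)
        else:
            weights.append(0.0)
    return weights


# Hypothetical toy data: one of two "young" records fails to link.
records = [
    {"age_group": "young", "linked": True},
    {"age_group": "young", "linked": False},
    {"age_group": "old", "linked": True},
    {"age_group": "old", "linked": True},
]
weights = linkage_weights(records, "age_group")
```

In this toy case the linked "young" record is weighted up to stand in for the unlinked one, so the weighted linked sample matches the full sample's cell totals. A model-based approach would replace the cell shares with predicted linkage probabilities from many covariates.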
-
Food Security Status Across the Rural-Urban Continuum Before and During the COVID-19 Pandemic
January 2025
Working Paper Number:
CES-25-01
Background: Food security, defined as consistent access to sufficient food to support an active life, is a crucial social determinant of health. A key dimension affecting food security is position along the rural-urban continuum, as there are important socio-economic and environmental differences between communities related to urbanicity or rurality that impact food access. The COVID-19 pandemic created social and economic shocks that altered financial and food security, which may have had differential effects by rurality and urbanicity. However, there has been limited research on how food security differs across the shades of the rural-urban community spectrum, as most often researchers have characterized communities as either urban or rural.
Methods: In this study, which linked restricted-use Current Population Survey Food Security Supplement data to census-tract-level United States Department of Agriculture Rural-Urban Commuting Area codes, we estimated the prevalence of household food security across temporal (2015-2019 versus 2020-2021) and socio-spatial (urban, large rural city/town, small rural town, or isolated rural town/area) dimensions in order to characterize variations before and during the COVID-19 pandemic by urbanicity/rurality. We report prevalences as point estimates with 95% confidence intervals.
Results: The prevalence of food security was 87.7% (87.5-88.0%) in 2015-2019 and 88.8% (88.4-89.3%) in 2020-2021 for urban areas, 85.5% (84.7-86.2%) in 2015-2019 and 87.1% (85.7-88.3%) in 2020-2021 for large rural towns/cities, 82.8% (81.5-84.1%) in 2015-2019 and 87.3% (85.7-89.2%) in 2020-2021 for small rural towns, and 87.6% (86.3-88.8%) in 2015-2019 and 90.9% (88.7-92.7%) in 2020-2021 for isolated rural towns/areas.
Conclusion: These findings show that rural communities' experiences of food security vary, and aggregating households in these environments may mask areas of concern and concentrated need.
-
Neighborhood Effects on High-School Drop-Out Rates and Teenage Childbearing: Tests for Non-Linearities, Race-Specific Effects, Interactions with Family Characteristics, and Endogenous Causation using Geocoded California Census Microdata
May 2008
Working Paper Number:
CES-08-12
This paper examines the relationship between neighborhood characteristics and the likelihood that a youth will drop out of high school or have a child during the teenage years. Using a dataset that is uniquely well-suited to the study of neighborhood effects, the impact of the neighborhood poverty rate and the percentage of professionals in the local labor force on youth outcomes in California is examined. The first section of the paper tests for non-linearities in the relationship between indicators of neighborhood distress and youth outcomes. Some evidence is found for a breakpoint at low levels of poverty. Suggestive but inconclusive evidence is also found for a second breakpoint, at very high levels of poverty, for African-American youth only. The second part of the paper examines interactions between family background characteristics and neighborhood effects, and finds that White youth are most sensitive to neighborhood effects, while the effect of parental education depends on the neighborhood measure in question. Among White youth, those from single-parent households are more vulnerable to neighborhood conditions. The third section of the paper finds that for White youth and Hispanic youth, the relevant neighborhood variables appear to be the own-race poverty rates and the percentage of professionals of youths' own race. The final section of the paper estimates a tract-fixed effects model, using the results from the third section to define multiple relevant poverty rates within each tract. The fixed-effects specification suggests that for White and Hispanic youth in California, neighborhood effects remain significant, even with the inclusion of controls for any unobserved family and neighborhood characteristics that are constant within tracts.
-
Public-Use vs. Restricted-Use: An Analysis Using the American Community Survey
January 2017
Working Paper Number:
CES-17-12
Statistical agencies frequently publish microdata that have been altered to protect confidentiality. Such data retain utility for many types of broad analyses but can yield biased or insufficiently precise results in others. Research access to de-identified versions of the restricted-use data with little or no alteration is often possible, albeit costly and time-consuming. We investigate the advantages and disadvantages of public-use and restricted-use data from the American Community Survey (ACS) in constructing a wage index. The public-use data used were Public Use Microdata Samples, while the restricted-use data were accessed via a Federal Statistical Research Data Center. We discuss the advantages and disadvantages of each data source and compare estimated CWIs and standard errors at the state and labor market levels.
-
Some Open Questions on Multiple-Source Extensions of Adaptive-Survey Design Concepts and Methods
February 2023
Working Paper Number:
CES-23-03
Adaptive survey design is a framework for making data-driven decisions about survey data collection operations. This paper discusses open questions related to the extension of adaptive principles and capabilities when capturing data from multiple data sources. Here, the concept of 'design' encompasses the focused allocation of resources required for the production of high-quality statistical information in a sustainable and cost-effective way. This conceptual framework leads to a discussion of six groups of issues including: (i) the goals for improvement through adaptation; (ii) the design features that are available for adaptation; (iii) the auxiliary data that may be available for informing adaptation; (iv) the decision rules that could guide adaptation; (v) the necessary systems to operationalize adaptation; and (vi) the quality, cost, and risk profiles of the proposed adaptations (and how to evaluate them). A multiple data source environment creates significant opportunities, but also introduces complexities that are a challenge in the production of high-quality statistical information.
-
Access Methods for United States Microdata
August 2007
Working Paper Number:
CES-07-25
Beyond the traditional methods of tabulations and public-use microdata samples, statistical agencies have developed four key alternatives for providing non-government researchers with access to confidential microdata to improve statistical modeling. The first, licensing, allows qualified researchers access to confidential microdata at their own facilities, provided certain security requirements are met. The second, statistical data enclaves, offers qualified researchers restricted access to confidential economic and demographic data at specific agency-controlled locations. Third, statistical agencies can offer remote access, through a computer interface, to the confidential data under automated or manual controls. Fourth, synthetic data developed from the original data but retaining the correlations in the original data have the potential for allowing a wide range of analyses.
-
Where Are Your Parents? Exploring Potential Bias in Administrative Records on Children
March 2024
Working Paper Number:
CES-24-18
This paper examines potential bias in the Census Household Composition Key's (CHCK) probabilistic parent-child linkages. By linking CHCK data to the American Community Survey (ACS), we reveal disparities in parent-child linkages among specific demographic groups and find that characteristics of children that can and cannot be linked to the CHCK vary considerably from the larger population. In particular, we find that children from low-income, less educated households and of Hispanic origin are less likely to be linked to a mother or a father in the CHCK. We also highlight some data considerations when using the CHCK.