This report summarizes the potential of teacher lists obtained from commercial vendors to enhance sampling frames for the National Teacher and Principal Survey (NTPS). We investigate three separate vendor lists and compare coverage rates across a range of school and teacher characteristics. Across all vendors, coverage rates are higher for regular, non-charter schools. Vendor A stands out as having higher coverage rates than the other two, and we recommend further evaluating Vendor A's teacher lists during the upcoming 2014-2015 NTPS Field Test.
-
The Effect of Class Size on Teacher Attrition: Evidence from Class Size Reduction Policies in New York State
February 2010
Working Paper Number:
CES-10-05
Starting in 1999, New York State implemented class size reduction policies targeted at early elementary grades, but due to funding limitations, most schools reduced class size in some grades and not others. I use class size variation within a school induced by the policies to construct instrumental variable estimates of the effect of class size on teacher attrition. Teachers with smaller classes were not significantly less likely to leave schools in the full sample of districts but were less likely to leave a school in districts that targeted the same grade across schools. District-wide class size reduction policies were more likely to persist in the same grade in the next year, suggesting that teacher expectations of continued smaller classes played a role in their decision whether or not to leave a school. A decrease in class size from 23 to 20 students (a decrease of one standard deviation) under a district-wide policy decreases the probability that a teacher leaves a school by 4.2 percentage points.
-
Some Open Questions on Multiple-Source Extensions of Adaptive-Survey Design Concepts and Methods
February 2023
Working Paper Number:
CES-23-03
Adaptive survey design is a framework for making data-driven decisions about survey data collection operations. This paper discusses open questions related to the extension of adaptive principles and capabilities when capturing data from multiple data sources. Here, the concept of 'design' encompasses the focused allocation of resources required for the production of high-quality statistical information in a sustainable and cost-effective way. This conceptual framework leads to a discussion of six groups of issues including: (i) the goals for improvement through adaptation; (ii) the design features that are available for adaptation; (iii) the auxiliary data that may be available for informing adaptation; (iv) the decision rules that could guide adaptation; (v) the necessary systems to operationalize adaptation; and (vi) the quality, cost, and risk profiles of the proposed adaptations (and how to evaluate them). A multiple data source environment creates significant opportunities, but also introduces complexities that are a challenge in the production of high-quality statistical information.
-
Assessing Coverage and Quality of the 2007 Prototype Census Kidlink Database
September 2015
Working Paper Number:
carra-2015-07
The Census Bureau is conducting research to expand the use of administrative records data in censuses and surveys to decrease respondent burden and reduce costs while improving data quality. Much of this research (e.g., Rastogi and O'Hara (2012), Luque and Bhaskar (2014)) hinges on the ability to integrate multiple data sources by linking individuals across files. One of the Census Bureau's record linkage methodologies for data integration is the Person Identification Validation System or PVS. PVS assigns anonymous and unique IDs (Protected Identification Keys or PIKs) that serve as linkage keys across files. Prior research showed that integrating 'known associates' information into PVS's reference files could potentially enhance PVS's PIK assignment rates. The term 'known associates' refers to people who are likely to be associated with each other because of a known common link (such as family relationships or people sharing a common address), and thus, to be observed together in different files. One of the results from this prior research was the creation of the 2007 Census Kidlink file, a child-level file linking a child's Social Security Number (SSN) record to the SSN of those identified as the child's parents. In this paper, we examine to what extent the 2007 Census Kidlink methodology was able to link parents' SSNs to children's SSN records, and also evaluate the quality of those links. We find that in approximately 80 percent of cases, at least one parent was linked to the child's record. Younger children and noncitizens have a higher percentage of cases where neither parent could be linked to the child. Using 2007 tax data as a benchmark, our quality evaluation results indicate that in at least 90 percent of the cases, the parent-child link agreed with those found in the tax data.
Based on our findings, we propose improvements to the 2007 Kidlink methodology to increase child-parent links, and discuss how the creation of the file could be operationalized moving forward.
-
Comparison of Survey, Federal, and Commercial Address Data Quality
June 2014
Working Paper Number:
carra-2014-06
This report summarizes matching of survey, commercial, and administrative records housing units to the Census Bureau Master Address File (MAF). We document overall MAF match rates in each data set and evaluate differences in match rates across a variety of housing characteristics. Results show that over 90 percent of records in survey data from the American Housing Survey (AHS) match to the MAF. Commercial data from CoreLogic matches at much lower rates, in part due to missing address information and poor match rates for multi-unit buildings. MAF match rates for administrative records from the Department of Housing and Urban Development are also high, and open the possibility of using this information in surveys such as the AHS.
-
Where Are Your Parents? Exploring Potential Bias in Administrative Records on Children
March 2024
Working Paper Number:
CES-24-18
This paper examines potential bias in the Census Household Composition Key's (CHCK) probabilistic parent-child linkages. By linking CHCK data to the American Community Survey (ACS), we reveal disparities in parent-child linkages among specific demographic groups and find that characteristics of children that can and cannot be linked to the CHCK vary considerably from the larger population. In particular, we find that children from low-income, less educated households and of Hispanic origin are less likely to be linked to a mother or a father in the CHCK. We also highlight some data considerations when using the CHCK.
-
Person Matching in Historical Files using the Census Bureau's Person Validation System
September 2014
Working Paper Number:
carra-2014-11
The recent release of the 1940 Census manuscripts enables the creation of longitudinal data spanning the whole of the twentieth century. Linked historical and contemporary data would allow unprecedented analyses of the causes and consequences of health, demographic, and economic change. The Census Bureau is uniquely equipped to provide high quality linkages of person records across datasets. This paper summarizes the linkage techniques employed by the Census Bureau and discusses utilization of these techniques to append protected identification keys to the 1940 Census.
-
School Discipline and Racial Disparities in Early Adulthood
June 2021
Working Paper Number:
CES-21-14
Despite interest in the role of school discipline in the creation of racial inequality, previous research has been unable to identify how students who receive suspensions in school differ from unsuspended classmates on key young adult outcomes. We utilize novel data to document the links between high school discipline and important young adult outcomes related to criminal justice contact, social safety net program participation, post-secondary education, and the labor market. We show that the link between school discipline and young adult outcomes tends to be stronger for Black students than for White students, and that inequality in exposure to school discipline accounts for approximately 30 percent of the Black-White disparities in young adult criminal justice outcomes and SNAP receipt.
-
Comparing the 2019 American Housing Survey to Contemporary Sources of Property Tax Records: Implications for Survey Efficiency and Quality
June 2022
Working Paper Number:
CES-22-22
Given rising nonresponse rates and concerns about respondent burden, government statistical agencies have been exploring ways to supplement household survey data collection with administrative records and other sources of third-party data. This paper evaluates the potential of property tax assessment records to improve housing surveys by comparing these records to responses from the 2019 American Housing Survey. Leveraging the U.S. Census Bureau's linkage infrastructure, we compute the fraction of AHS housing units that could be matched to a unique property parcel (coverage rate), as well as the extent to which survey and property tax data contain the same information (agreement rate). We analyze heterogeneity in coverage and agreement across states, housing characteristics, and 11 AHS items of interest to housing researchers. Our results suggest that partial replacement of AHS data with property data, targeted toward certain survey items or single-family detached homes, could reduce respondent burden without altering data quality. Further research into partial-replacement designs is needed and should proceed on an item-by-item basis. Our work can guide this research as well as those who wish to conduct independent research with property tax records that is representative of the U.S. housing stock.
-
The Design of Sampling Strata for the National Household Food Acquisition and Purchase Survey
February 2025
Working Paper Number:
CES-25-13
The National Household Food Acquisition and Purchase Survey (FoodAPS), sponsored by the United States Department of Agriculture's (USDA) Economic Research Service (ERS) and Food and Nutrition Service (FNS), examines the food purchasing behavior of various subgroups of the U.S. population. These subgroups include participants in the Supplemental Nutrition Assistance Program (SNAP) and the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), as well as households who are eligible for but don't participate in these programs. Participants in these social protection programs constitute small proportions of the U.S. population; obtaining an adequate number of such participants in a survey would be challenging absent stratified sampling to target SNAP and WIC participating households. This document describes how the U.S. Census Bureau (which is planning to conduct future versions of the FoodAPS survey on behalf of USDA) created sampling strata to flag the FoodAPS targeted subpopulations using machine learning applications in linked survey and administrative data. We describe the data, modeling techniques, and how well the sampling flags target low-income households and households receiving WIC and SNAP benefits. We additionally situate these efforts in the nascent literature on the use of big data and machine learning for the improvement of survey efficiency.
-
School Accountability and Residential Location Patterns: Evaluating the Unintended Consequences of No Child Left Behind
January 2017
Working Paper Number:
CES-17-28
The No Child Left Behind (NCLB) Act, in effect from 2002 to 2015, is often considered the most significant federal intervention into education in the United States since the passage of the Elementary and Secondary Education Act in 1965. There is growing evidence that holding schools accountable is leading to some improved educational outcomes for students. There is, however, very little work examining whether these sweeping reforms have unintended consequences for the communities that these schools serve. As school attendance, particularly at the elementary school level, is closely tied to one's residence, placing sanctions on a school could have negative repercussions for neighborhoods if it provides new information on school failure. In contrast, if these sanctions also bring new resources, including financial resources or school choice, they could spark additional demand within a neighborhood. Using restricted-access census data, which include local housing values, rents, and individual residential choices, in combination with a boundary discontinuity identification strategy, this paper seeks to examine how failure to meet Adequate Yearly Progress (AYP), the key enforcement mechanism of NCLB, is shaping local housing markets and residential choices in five diverse urban school districts: New York, Los Angeles, Philadelphia, Detroit, and Tucson.