-
Nonemployer Statistics by Demographics (NES-D): Using Administrative and Census Records Data in Business Statistics
January 2019
Working Paper Number:
CES-19-01
The quinquennial Survey of Business Owners (SBO) provided the only comprehensive source of information in the United States on employer and nonemployer businesses by the sex, race, ethnicity and veteran status of the business owners. The annual Nonemployer Statistics series (NES) provides establishment counts and receipts for nonemployers but contains no demographic information on the business owners. With the transition of the employer component of the SBO to the Annual Business Survey, the Nonemployer Statistics by Demographics series (NES-D) represents the continuation of demographics estimates for nonemployer businesses. NES-D will leverage existing administrative and census records to assign demographic characteristics to the universe of approximately 24 million nonemployer businesses (as of 2015). Demographic characteristics include key demographics measured by the SBO (sex, race, Hispanic origin and veteran status) as well as other demographics (age, place of birth and citizenship status) that the SBO collected but did not impute when missing. A spectrum of administrative and census data sources will provide the nonemployer universe and demographics information. Specifically, the nonemployer universe originates in the Business Register; the Census Numident will provide sex, age, place of birth and citizenship status; race and Hispanic origin information will be obtained from multiple years of the decennial census and the American Community Survey; and the Department of Veterans Affairs will provide administrative records data on veteran status.
The use of blended data in this manner will make possible the production of NES-D, an annual series that will become the only source of detailed and comprehensive statistics on the scope, nature and activities of U.S. businesses with no paid employment by the demographic characteristics of the business owner. Using the 2015 vintage of nonemployers, initial results indicate that demographic information is available for the overwhelming majority of the universe of nonemployers. For instance, information on sex, age, place of birth and citizenship status is available for over 95 percent of the 24 million nonemployers while race and Hispanic origin are available for about 90 percent of them. These results exclude owners of C-corporations, which represent only 2 percent of nonemployer firms. Among other things, future work will entail imputation of missing demographics information (including that of C-corporations), testing the longitudinal consistency of the estimates, and expanding the set of characteristics beyond the demographics mentioned above. Without added respondent burden and at lower imputation rates and costs, NES-D will meet the needs of stakeholders as well as the economy as a whole by providing reliable estimates at a higher frequency (annual vs. every 5 years) and with a more timely dissemination schedule than the SBO.
-
Early-Stage Business Formation: An Analysis of Applications for Employer Identification Numbers
December 2018
Working Paper Number:
CES-18-52
This paper reports on the development and analysis of a newly constructed dataset on the early stages of business formation. The data are based on applications for Employer Identification Numbers (EINs) submitted in the United States, known as IRS Form SS-4 filings. The goal of the research is to develop high-frequency indicators of business formation at the national, state, and local levels. The analysis indicates that EIN applications provide forward-looking and very timely information on business formation. The signal of business formation provided by counts of applications is improved by using the characteristics of the applications to model the likelihood that applicants become employer businesses. The results also suggest that EIN applications are related to economic activity at the local level. For example, application activity is higher in counties that experienced higher employment growth since the end of the Great Recession, and application counts grew more rapidly in counties engaged in shale oil and gas extraction. Finally, the paper provides a description of a new public-use dataset, the 'Business Formation Statistics (BFS),' that contains new data series on business applications and formation. The initial release of the BFS shows that the number of business applications in the 3rd quarter of 2017 that have a relatively high likelihood of becoming job creators is still far below pre-Great Recession levels.
-
The Opportunity Atlas: Mapping the Childhood Roots of Social Mobility
September 2018
Working Paper Number:
CES-18-42R
We construct a publicly available atlas of children's outcomes in adulthood by Census tract using anonymized longitudinal data covering nearly the entire U.S. population. For each tract, we estimate children's earnings distributions, incarceration rates, and other outcomes in adulthood by parental income, race, and gender. These estimates allow us to trace the roots of outcomes such as poverty and incarceration back to the neighborhoods in which children grew up. We find that children's outcomes vary sharply across nearby tracts: for children of parents at the 25th percentile of the income distribution, the standard deviation of mean household income at age 35 is $4,200 across tracts within counties. We illustrate how these tract-level data can provide insight into how neighborhoods shape the development of human capital and support local economic policy using two applications. First, we show that the estimates permit precise targeting of policies to improve economic opportunity by uncovering specific neighborhoods where certain subgroups of children grow up to have poor outcomes. Neighborhoods matter at a very granular level: conditional on characteristics such as poverty rates in a child's own Census tract, characteristics of tracts that are one mile away have little predictive power for a child's outcomes. Our historical estimates are informative predictors of outcomes even for children growing up today because neighborhood conditions are relatively stable over time. Second, we show that the observational estimates are highly predictive of neighborhoods' causal effects, based on a comparison to data from the Moving to Opportunity experiment and a quasi-experimental research design analyzing movers' outcomes. We then identify high-opportunity neighborhoods that are affordable to low-income families, providing an input into the design of affordable housing policies. 
Our measures of children's long-term outcomes are only weakly correlated with traditional proxies for local economic success such as rates of job growth, showing that the conditions that create greater upward mobility are not necessarily the same as those that lead to productive labor markets.
-
Race and Economic Opportunity in the United States: An Intergenerational Perspective
September 2018
Working Paper Number:
CES-18-40R
We study the sources of racial and ethnic disparities in income using de-identified longitudinal data covering nearly the entire U.S. population from 1989-2015. We document three sets of results. First, the intergenerational persistence of disparities varies substantially across racial groups. For example, Hispanic Americans are moving up significantly in the income distribution across generations because they have relatively high rates of intergenerational income mobility. In contrast, black Americans have substantially lower rates of upward mobility and higher rates of downward mobility than whites, leading to large income disparities that persist across generations. Conditional on parent income, the black-white income gap is driven entirely by large differences in wages and employment rates between black and white men; there are no such differences between black and white women. Second, differences in family characteristics such as parental marital status, education, and wealth explain very little of the black-white income gap conditional on parent income. Differences in ability also do not explain the patterns of intergenerational mobility we document. Third, the black-white gap persists even among boys who grow up in the same neighborhood. Controlling for parental income, black boys have lower incomes in adulthood than white boys in 99% of Census tracts. Both black and white boys have better outcomes in low-poverty areas, but black-white gaps are larger on average for boys who grow up in such neighborhoods. The few areas in which black-white gaps are relatively small tend to be low-poverty neighborhoods with low levels of racial bias among whites and high rates of father presence among blacks. Black males who move to such neighborhoods earlier in childhood earn more and are less likely to be incarcerated. However, fewer than 5% of black children grow up in such environments. 
These findings suggest that reducing the black-white income gap will require efforts whose impacts cross neighborhood and class lines and increase upward mobility specifically for black men.
-
LEHD Infrastructure S2014 files in the FSRDC
September 2018
Working Paper Number:
CES-18-27R
The Longitudinal Employer-Household Dynamics (LEHD) Program at the U.S. Census Bureau, with the support of several national research agencies, maintains a set of infrastructure files using administrative data provided by state agencies, enhanced with information from other administrative data sources, demographic and economic (business) surveys and censuses. The LEHD Infrastructure Files provide a detailed and comprehensive picture of workers, employers, and their interaction in the U.S. economy. This document describes the structure and content of the 2014 Snapshot of the LEHD Infrastructure files as they are made available in the Census Bureau's secure and restricted-access Research Data Center network. The document attempts to provide a comprehensive description of all researcher-accessible files, of their creation, and of any modifications made to the files to facilitate researcher access.
-
Understanding the Quality of Alternative Citizenship Data Sources for the 2020 Census
August 2018
Working Paper Number:
CES-18-38R
This paper examines the quality of citizenship data in self-reported survey responses compared to administrative records and evaluates options for constructing an accurate count of resident U.S. citizens. Person-level discrepancies between survey-collected citizenship data and administrative records are more pervasive than previously reported in studies comparing survey and administrative data aggregates. Our results imply that survey-sourced citizenship data produce significantly lower estimates of the noncitizen share of the population than would be produced from currently available administrative records; both the survey-sourced and administrative data have shortcomings that could contribute to this difference. Our evidence is consistent with noncitizen respondents misreporting their own citizenship status and failing to report that of other household members. At the same time, currently available administrative records may miss some naturalizations and capture others with a delay. The evidence in this paper also suggests that adding a citizenship question to the 2020 Census would lead to lower self-response rates in households potentially containing noncitizens, resulting in higher fieldwork costs and a lower-quality population count.
-
An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
August 2018
Working Paper Number:
CES-18-35
Statistical agencies face a dual mandate to publish accurate statistics while protecting respondent privacy. Increasing privacy protection requires decreased accuracy. Recognizing this as a resource allocation problem, we propose an economic solution: operate where the marginal cost of increasing privacy equals the marginal benefit. Our model of production, from computer science, assumes data are published using an efficient differentially private algorithm. Optimal choice weighs the demand for accurate statistics against the demand for privacy. Examples from U.S. statistical programs show how our framework can guide decision-making. Further progress requires a better understanding of willingness-to-pay for privacy and statistical accuracy.
-
Using Linked Data to Investigate True Intergenerational Change: Three Generations Over Seven Decades
August 2018
Working Paper Number:
carra-2018-09
It is widely thought that immigrants and their families undergo profound cultural and socioeconomic changes as a consequence of coming into contact with U.S. society, but the way this occurs remains unclear and controversial due in large part to data limitations. In this paper, we provide proof of concept for analyses using linked data that allow us to compare outcomes across more 'exact' family generations. Specifically, we are able to follow immigrant parents and their children and grandchildren across seven decades using census and survey data from 1940 to 2014. We describe the data and linkage methodology, evaluate the representativeness of the linked sample, test a method for adjusting for biases that arise from non-representative linkages, and describe the size, diversity, and socioeconomic characteristics of the linked sample. We demonstrate that large sample sizes of linked data will likely permit us to compare several national origin groups across multiple generations.
-
Foreign-Born and Native-Born Migration in the U.S.: Evidence from IRS Administrative and Census Survey Records
July 2018
Working Paper Number:
carra-2018-07
This paper details efforts to link administrative records from the Internal Revenue Service (IRS) to American Community Survey (ACS) and 2010 Census microdata for the study of migration among foreign-born and native-born populations in the United States. Specifically, we (1) document our linkage strategy and methodology for inferring migration in IRS records; (2) model selection into and survival across IRS records to determine suitability for research applications; and (3) gauge the efficacy of the IRS records by demonstrating how they can be used to validate and potentially improve migration responses for native-born and foreign-born respondents in ACS microdata. Our results show little evidence of selection or survival bias in the IRS records, suggesting broad generalizability to the nation as a whole. Moreover, we find that the combined IRS 1040, 1099, and W2 records may provide important information on populations, such as the foreign-born, that may be difficult to reach with traditional Census Bureau surveys. Finally, while preliminary, the results of our comparison of IRS and ACS migration responses show that IRS records may be useful in improving ACS migration measurement for respondents whose migration response is proxy, allocated, or imputed. Taking these results together, we discuss the potential application of our longitudinal IRS dataset to innovations in migration research on both the native-born and foreign-born populations of the United States.
-
The Opportunities and Challenges of Linked IRS Administrative and Census Survey Records in the Study of Migration
July 2018
Working Paper Number:
carra-2018-06
This paper details efforts to link administrative records from the Internal Revenue Service (IRS) to American Community Survey (ACS) and 2010 Census microdata for the study of migration in the United States. Specifically, we (1) document our linkage strategy and methodology for inferring migration in IRS records; (2) model selection into and survival across IRS records to determine suitability for research applications; and (3) gauge the efficacy of the IRS records by demonstrating how they can be used to validate and potentially improve migration responses in ACS microdata. Our results show little evidence of selection or survival bias in the IRS records, suggesting broad generalizability to the nation as a whole. Moreover, we find that the combined IRS 1040, 1099, and W2 records may provide important information on populations that are hard to reach with traditional Census surveys. Finally, while preliminary, the results of our comparison of IRS and ACS migration responses show that IRS records may be useful in improving ACS migration measurement for respondents whose migration response is proxy, allocated, or imputed. Taking these results together, we discuss the potential applications of our longitudinal IRS dataset to innovations in migration research in the United States.