-
LOOKING BACK ON THREE YEARS OF USING THE SYNTHETIC LBD BETA
February 2014
Working Paper Number:
CES-14-11
Distributions of business data are typically much more skewed than those for household or individual data, and public knowledge of the underlying units is greater. As a result, national statistical offices (NSOs) rarely release establishment or firm-level business microdata due to the risk to respondent confidentiality. One potential approach for overcoming these risks is to release synthetic data where the establishment data are simulated from statistical models designed to mimic the distributions of the real underlying microdata. The US Census Bureau's Center for Economic Studies, in collaboration with Duke University, the National Institute of Statistical Sciences, and Cornell University, made available a synthetic public use file for the Longitudinal Business Database (LBD) comprising more than 20 million records for all business establishments with paid employees dating back to 1976. The resulting product, dubbed the SynLBD, was released in 2010 and is the first-ever comprehensive business microdata set publicly released in the United States, including data on establishments' employment and payroll, birth and death years, and industrial classification. This paper documents the scope of projects that have requested and used the SynLBD.
-
EXPANDING THE ROLE OF SYNTHETIC DATA AT THE U.S. CENSUS BUREAU
February 2014
Working Paper Number:
CES-14-10
National statistical offices (NSOs) create official statistics from data collected from survey respondents, government administrative records, and other sources. The raw source data is usually considered to be confidential. In the case of the U.S. Census Bureau, confidentiality of survey and administrative records microdata is mandated by statute, and this mandate to protect confidentiality is often at odds with the needs of users to extract as much information from the data as possible. Traditional disclosure protection techniques result in official data products that do not fully utilize the information content of the underlying microdata. Typically, these products take the form of simple aggregate tabulations. In a few cases anonymized public-use micro samples are made available, but these face a growing risk of re-identification because of the increasing amounts of information about individuals and firms available in the public domain. One approach for overcoming these risks is to release products based on synthetic data where values are simulated from statistical models designed to mimic the (joint) distributions of the underlying microdata. We discuss recent Census Bureau work to develop and deploy such products. We discuss the benefits and challenges involved with extending the scope of synthetic data products in official statistics.
-
Income Packaging and Economic Disconnection: Do Sources of Support Differ from Other Low-Income Women?
December 2013
Working Paper Number:
CES-13-61
Income packaging, or piecing together cash and non-cash resources from a variety of sources, is a common financial survival strategy among low-income women. This strategy is particularly important for economically disconnected women, who lack both employment income and public cash assistance receipt. Using data from the confidential Census Bureau versions of the Survey of Income and Program Participation, this study compares the use of public and private supports between disconnected and connected low-income women, controlling for differences in state welfare rules and county unemployment rates. Findings from bivariate comparisons and multilevel logistic regressions indicate that disconnected women utilize public non-cash supports at similar rates to connected women, but rely more heavily on private sources. Conclusions focus on the policy implications for outreach and program development.
-
The Role of Agents and Brokers in the Market for Health Insurance
December 2013
Working Paper Number:
CES-13-58
Health insurance markets in the United States are characterized by imperfect information, complex products, and substantial search frictions. Insurance agents and brokers play a significant role in helping employers navigate these problems. However, little is known about the relation between the structure of the agent/broker market and access and affordability of insurance. This paper aims to fill this gap by investigating the influence of agents/brokers on health insurance decisions of small firms, which are particularly vulnerable to problems of financing health insurance. Using a unique membership database from the National Association of Health Underwriters together with a nationally representative survey of employers, we find that small firms in more competitive agent/broker markets are more likely to offer health insurance and at lower premiums. Moreover, premiums are less dispersed in more competitive agent/broker markets.
-
A COMPARISON OF PERSON-REPORTED INDUSTRY TO EMPLOYER-REPORTED INDUSTRY IN SURVEY AND ADMINISTRATIVE DATA
September 2013
Working Paper Number:
CES-13-47
The Census Bureau collects industry information through surveys and administrative data and creates associated public-use statistics. In this paper, we compare person-reported industry in the American Community Survey (ACS) to employer-reported industry from the Quarterly Census of Employment and Wages (QCEW) that is part of the Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) program. This research provides necessary information on the use of administrative data as a supplement to survey data industry information, and the findings will be useful for anyone using industry information from either source. Our project is part of a larger effort to compare information on jobs from household survey data to employer-reported information. This research is the first to compare ACS job data to firm-based administrative data. We find an overall industry sector match rate of 75 percent, and a 61 percent match rate at the 4-digit Census Industry Code (CIC) level. Industry match rates vary by sector and by whether industry sector is classified using ACS or LEHD industry information. The educational services and health care and social assistance sectors have among the highest match rates. The management of companies and enterprises sector has the lowest match rate, using either ACS-reported or LEHD-reported sector. For individuals with imputed industry data, the industry sector match rate is only 14 percent. Our findings suggest that the industry distribution and the sample in a particular industry sector will differ depending on whether ACS or LEHD data are used.
-
Modeling Single Establishment Firm Returns to the 2007 Economic Census
September 2011
Working Paper Number:
CES-11-28
The Economic Census is one of the most important activities that the U.S. Census Bureau performs. It is critical for updating firm ownership/structure and industry information for a large number of businesses in the Census Bureau's Business Register, impacting most other economic programs. Also, it feeds into Bureau of Economic Analysis products, such as benchmark input-output accounts and Gross Domestic Product. The overall check-in rate for the 2007 Economic Census was just over 86%. Establishments owned by multi-location companies returned over 90% of their forms, as compared to the roughly two million single-establishment firms sampled in the Census that returned just over 80%. We model the check-in rate for single-establishment firms by using a large number of variables that might be correlated with whether or not a firm returns a form in the Economic Census. These variables are broadly categorized as the characteristics of firms, measures of external factors, and features of the survey design. We use the model for two purposes. First, by including many of the factors that may be correlated with returns, we aim to focus limited advertising and outreach resources on low-return segments of the population. Second, we use the model to investigate the efficacy of an unplanned intervention expected to increase return rates: using certified mailing for one of the form follow-ups.
-
A Guide to the MEPS-IC Government List Sample Microdata
September 2011
Working Paper Number:
CES-11-27
The Medical Expenditure Panel Survey-Insurance Component (MEPS-IC) is conducted to provide nationally representative estimates on employer-sponsored health insurance. MEPS-IC data are collected from private sector employers, as well as state and local governments. While similar information is gathered from these two sectors, differences in the survey process exist. The goal of this paper is to provide details on the public sector including types of state and local government employers, sample design, general information on the data collected in the MEPS-IC, and additional sources of information.
-
Management Challenges of the 2010 U.S. Census
August 2011
Working Paper Number:
CES-11-22
This paper gives an insider's perspective on the approaches used to manage the 2010 Census during its operational phase. The approaches used, the challenges faced (in particular, difficulties faced in automating data collection), and the solutions applied to meet those challenges are described. Finally, six management lessons learned are presented.
-
Impacts of Central Business District Location: A Hedonic Analysis of Legal Service Establishments
July 2011
Working Paper Number:
CES-11-21
This analysis examines the business impacts on law firms of locating in Central Business Districts (CBDs) in major U.S. cities. Specifically, we measure the price premium that law firms pay to locate in CBDs. Using micro-level data from the 1992 and 2007 Census of Services, we find that after controlling for firm size, firm specialization characteristics, and MSA and county attributes, law firms within CBDs pay about 15 to 20 percent more in overhead compared to those firms outside CBDs, a result consistent across time between 1992 and 2007. When including an important additional measure of firm quality, however, we find that this impact is reduced to about 7 to 9 percent, but still statistically significant. Additional results show that there is a significant correlation between firm quality and CBD location. We also find that firm size and firm specialization measures are important factors in the choice to locate within CBDs. We argue that these results indicate that CBD location for law firms may serve as networking, quality sorting, and branding mechanisms.
-
Towards Unrestricted Public Use Business Microdata: The Synthetic Longitudinal Business Database
February 2011
Working Paper Number:
CES-11-04
In most countries, national statistical agencies do not release establishment-level business microdata, because doing so represents too large a risk to establishments' confidentiality. One approach with the potential for overcoming these risks is to release synthetic data; that is, the released establishment data are simulated from statistical models designed to mimic the distributions of the underlying real microdata. In this article, we describe an application of this strategy to create a public use file for the Longitudinal Business Database, an annual economic census of establishments in the United States comprising more than 20 million records dating back to 1976. The U.S. Bureau of the Census and the Internal Revenue Service recently approved the release of these synthetic microdata for public use, making the synthetic Longitudinal Business Database the first-ever business microdata set publicly released in the United States. We describe how we created the synthetic data, evaluated analytical validity, and assessed disclosure risk.