This report introduces a new dataset, the Medical Expenditure Panel Survey-Insurance Component with Administrative Records (MEPS-ICAR), consisting of MEPS-IC survey data on establishments and their health insurance benefits packages linked to Decennial Census data and administrative tax records on MEPS-IC establishments' workforces. These data include new measures of the characteristics of MEPS-IC establishments' parent firms, employee turnover, the full distribution of MEPS-IC workers' personal and family incomes, the geographic locations where those workers live, and improved workforce demographic detail. Next, this report details the methods used for producing the MEPS-ICAR. Broadly, the linking process begins by matching establishments' parent firms to their workforces using identifiers appearing in tax records. The linking process concludes by matching establishments to their own workforces by identifying the subset of their parent firm's workforce that best matches the expected size, total payroll, and residential geographic distribution of the establishment's workforce. Finally, this report presents statistics characterizing the match rate and the MEPS-ICAR data itself. Key results include that match rates are consistently high (exceeding 90%) across nearly all data subgroups and that the matched data exhibit a reasonable distribution of employment, payroll, and worker commute distances relative to expectations and external benchmarks. Notably, employment measures derived from tax records, but not used in the match itself, correspond with high fidelity to the employment levels that establishments report in the MEPS-IC. Cumulatively, the construction of the MEPS-ICAR significantly expands the capabilities of the MEPS-IC and presents many opportunities for analysts.
-
National Experimental Wellbeing Statistics - Version 1
February 2023
Working Paper Number:
CES-23-04
This is the U.S. Census Bureau's first release of the National Experimental Wellbeing Statistics (NEWS) project. The NEWS project aims to produce the best possible estimates of income and poverty given all available survey and administrative data. We link survey, decennial census, administrative, and third-party data to address measurement error in income and poverty statistics. We estimate improved (pre-tax money) income and poverty statistics for 2018 by addressing several possible sources of bias documented in prior research. We address biases from 1) unit nonresponse through improved weights, 2) missing income information in both survey and administrative data through improved imputation, and 3) misreporting by combining or replacing survey responses with administrative information. Reducing survey error substantially affects key measures of well-being: We estimate median household income is 6.3 percent higher than in survey estimates, and poverty is 1.1 percentage points lower. These changes are driven by subpopulations for which survey error is particularly relevant. For householders aged 65 and over, median household income is 27.3 percent higher and poverty is 3.3 percentage points lower than in survey estimates. We do not find a significant impact on median household income for householders under 65 or on child poverty. Finally, we discuss plans for future releases: addressing other potential sources of bias, releasing additional years of statistics, extending the income concepts measured, and including smaller geographies such as state and county.
-
A Comparison of Employee Benefits Data from the MEPS-IC and Form 5500
September 2008
Working Paper Number:
CES-08-32
This paper compares data on employers' health and pension offerings from two sources: publicly available administrative data from Form 5500 filings and survey data from the Insurance Component of the Medical Expenditure Panel Survey (MEPS-IC). The basic findings are that the 5500 filings cover too few health plans to be very useful as a substitute for or supplement to the MEPS-IC measure of whether employers offer health insurance. The pension information in the 5500 filings is potentially more useful as a supplement to the MEPS-IC for research purposes where additional pension information would be useful in studying employers' decisions to offer health insurance.
-
Employer-Sim Microsimulation Model: Model Development and Application to Estimation of Tax Subsidies to Health Insurance
December 2014
Working Paper Number:
CES-14-46
Employment-related health coverage is the predominant form of health insurance in the nonelderly US population. Developing sound policies regarding the tax treatment of employer-sponsored insurance requires detailed information on the insurance benefits offered by employers as well as detailed information on the characteristics of employees and their families. Unfortunately, no nationally representative data set contains all of the necessary elements. This paper describes the development of the Employer-Sim model, which models tax-based health policies by using data on workers from the Medical Expenditure Panel Survey Household Component (MEPS HC) to form synthetic workforces for each establishment in the Medical Expenditure Panel Survey Insurance Component (MEPS IC). This paper describes the application of Employer-Sim to estimating tax subsidies to employer-sponsored health insurance and presents estimates of the cost and incidence of the subsidy for 2008. The paper concludes by discussing other potential applications of the Employer-Sim model.
-
Matching Compustat Data to the Longitudinal Business Database, 1976-2020
September 2025
Working Paper Number:
CES-25-65
This paper details the methodology for creating an updated Compustat-Longitudinal Business Database (LBD) bridge, facilitating linkage between company identifiers in Compustat and firm identifiers in the LBD. In addition to data from Compustat, we incorporate historical data on public companies from various public and private sources, including information on executive names. Our methodology involves a series of stages using fuzzy name and address matching, including EIN, telephone number, and industry code matching. Qualified researchers with approved proposals can access this bridge through the Federal Statistical Research Data Centers. The Compustat-SSL bridge serves as a crucial resource for longitudinal studies on U.S. businesses, corporate governance, and executive compensation.
-
Estimating the Costs of Covering Dependents through Employer-Sponsored Plans
January 2017
Working Paper Number:
CES-17-48
Several health reform microsimulation models use synthetic firms to estimate how changes in federal and state policies will affect employers' offers of health insurance, as well as the price of health insurance for workers and firms. These models typically rely on distinct measures of the average costs of single and dependent coverage, for employees and employers, which do not capture the joint distribution of these costs. Since some firms pay a large share of the premium for single policies but a lower share for dependent coverage, or the reverse, simulation models that do not account for the joint distribution of premium costs may not be sufficient to answer certain policy questions. To address this issue, we developed a method to extract estimates of the joint distribution of employer and employee costs of health insurance coverage from the Medical Expenditure Panel Survey-Insurance Component (MEPS-IC). This paper describes how these distributions were constructed and how they were incorporated into the Urban Institute's Health Insurance Policy Simulation Model (HIPSM). The estimates presented in this paper and those available in supplementary datasets may be useful for other simulation models that need to utilize information on the joint distribution of single and dependent employee premium contributions.
-
New Uses of Health and Pension Information
January 2002
Working Paper Number:
tp-2002-03
-
The Composition of Firm Workforces from 2006-2022: Findings from the Business Dynamics Statistics of Human Capital Experimental Product
April 2025
Working Paper Number:
CES-25-20
We introduce the Business Dynamics Statistics of Human Capital (BDS-HC) tables, a new Census Bureau experimental product that provides public-use statistics on the workforce composition of firms and its relationship to business dynamics. We use administrative W-2 filings to combine population-level worker demographic data with longitudinal business data to estimate the demographic and educational composition of nearly all non-farm employer businesses in the United States between 2006 and 2022. We use this newly constructed data to document the evolution of employment, entry, and exit of employers based on their workforce compositions. We also provide new statistics on the interaction between firm and worker characteristics, including the composition of workers at startup firms. We find substantial changes between 2006 and 2022 in the distribution of employers along several dimensions, primarily driven by changing workforce compositions within continuing firms rather than the reallocation of employment between firms. We also highlight systematic differences in the business dynamics of firms by their workforce compositions, suggesting that different groups of workers face different economic environments due to their employers.
-
Declines in Employer Sponsored Coverage Between 2000 and 2008: Offers, Take-Up, Premium Contributions, and Dependent Options
September 2010
Working Paper Number:
CES-10-23
Even before the current economic downturn, rates of employer-sponsored insurance (ESI) declined substantially, falling six percentage points between 2000 and 2008 for nonelderly Americans. During a previously documented decline in ESI, from 1987 to 1996, the fall was found to be the result of a reduction in enrollment or 'take-up' of offered coverage and not a decline in employer offer/eligibility rates. In this paper, we investigate the components of the more recent decline in ESI coverage by firm size, using data from the MEPS-IC, a large nationally representative survey of employers. We examine changes in offer rates, eligibility rates and take-up rates for coverage, and include a new dimension, the availability of and enrollment in dependent coverage. We investigate how these components changed for employers of different sizes and find that declining coverage rates for small firms were due to declines in both offer and take-up rates while declining rates for large firms were due to declining enrollment in offered coverage. We also find a decrease in the availability of dependent coverage at small employers and a shift towards single coverage across employers of all sizes. Understanding the components of the decline in coverage for small and large firms is important for establishing the baseline for observing the effects of the current economic downturn and the implementation of health insurance reform.
-
Methodology on Creating the U.S. Linked Retail Health Clinic (LiRHC) Database
March 2023
Working Paper Number:
CES-23-10
Retail health clinics (RHCs) are a relatively new type of health care setting, and understanding the role they play as a source of ambulatory care in the United States is important. To better understand these settings, a joint project by the Census Bureau and National Center for Health Statistics used data science techniques to link together data on RHCs from the Convenient Care Association, County Business Patterns Business Register, and National Plan and Provider Enumeration System to create the Linked RHC (LiRHC, pronounced 'lyric') database of locations throughout the United States during the years 2018 to 2020. The matching methodology used to perform this linkage is described, as well as the benchmarking, match statistics, and manual review and quality checks used to assess the resulting matched data. The large majority (81%) of matches received quality scores at or above 75/100, and most matches were linked in the first two (of eight) matching passes, indicating high confidence in the final linked dataset. The LiRHC database contained 2,000 RHCs; 97% of these clinics were in metropolitan statistical areas and 950 were in the South region of the United States. Through this collaborative effort, the Census Bureau and National Center for Health Statistics strive to understand how RHCs can potentially impact population health as well as the access and provision of health care services across the nation.
-
Two Perspectives on Commuting: A Comparison of Home to Work Flows Across Job-Linked Survey and Administrative Files
January 2017
Working Paper Number:
CES-17-34
Commuting flows and workplace employment data have a wide constituency of users including urban and regional planners, social science and transportation researchers, and businesses. The U.S. Census Bureau releases two national data products that give the magnitude and characteristics of home-to-work flows. The American Community Survey (ACS) tabulates households' responses on employment, workplace, and commuting behavior. The Longitudinal Employer-Household Dynamics (LEHD) program tabulates administrative records on jobs in the LEHD Origin-Destination Employment Statistics (LODES). Design differences across the datasets lead to divergence in a comparable statistic: county-to-county aggregate commute flows. To understand differences in the public use data, this study compares ACS and LEHD source files, using identifying information and probabilistic matching to join person and job records. In our assessment, we compare commuting statistics for job frames linked on person, employment status, employer, and workplace, and we identify person and job characteristics as well as design features of the data frames that explain aggregate differences. We find a lower rate of within-county commuting and longer commutes in LODES. We attribute these greater distances to differences in workplace reporting and to uncertainty of establishment assignments in LEHD for workers at multi-unit employers. Minor contributing factors include differences in residence location and ACS workplace edits. The results of this analysis and the data infrastructure developed will support further work to understand and enhance commuting statistics in both datasets.