Introducing the Medical Expenditure Panel Survey-Insurance Component with Administrative Records (MEPS-ICAR): Description, Data Construction Methodology, and Quality Assessment
August 2022
Working Paper Number:
CES-22-29
Abstract
Document Tags and Keywords
Keywords
Keywords are automatically generated using KeyBERT, a keyword extraction tool that uses BERT embeddings
to select contextually relevant keywords.
By analyzing the content of working papers, KeyBERT identifies terms and phrases that capture the essence of the
text, highlighting the most significant topics and trends. This approach not only enhances searchability but
also provides connections that go beyond potentially domain-specific author-defined keywords.
payroll,
survey,
respondent,
employee,
employ,
employed,
employment estimates,
employment data,
insurance,
workforce,
tax,
associate,
employment measures,
employment statistics,
irs,
medicaid,
filing,
assessed,
income data
Tags
Tags are automatically generated using a pretrained language model from spaCy, which supports
several tasks, including entity tagging.
The model is able to label words and phrases by entity type,
including "organizations." By filtering for frequent words and phrases labeled as "organizations", papers are
identified as containing references to specific institutions, datasets, and other organizations.
Internal Revenue Service,
Center for Economic Studies,
Administrative Records,
Current Population Survey,
Decennial Census,
Medical Expenditure Panel Survey,
Employer Identification Numbers,
University of Minnesota,
American Community Survey,
Longitudinal Employer Household Dynamics,
Agency for Healthcare Research and Quality,
Business Register,
United States Census Bureau,
Federal Insurance Contribution Act,
Protected Identification Key,
Department of Health and Human Services,
Quarterly Workforce Indicators,
Census Bureau Disclosure Review Board,
Disclosure Review Board,
Center for Administrative Records Research,
Personally Identifiable Information,
Data Management System
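The organization filter described for the tags above can be sketched roughly as follows. This is a minimal illustration, not the site's actual pipeline: it assumes the entity tagger has already produced (text, label) pairs, that organizations carry the label "ORG" (the label spaCy's English models use), and that "frequent" means a simple minimum count. The function name `frequent_organizations` and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def frequent_organizations(entities, min_count=2, label="ORG"):
    """Return entity texts with the given label that occur at least min_count times.

    `entities` is an iterable of (text, label) pairs, such as an entity
    tagger might emit. The threshold is an assumed stand-in for "frequent".
    """
    # Count only the mentions tagged with the requested label.
    counts = Counter(
        text.strip() for text, ent_label in entities if ent_label == label
    )
    # most_common() orders by descending count; ties keep first-seen order.
    return [name for name, n in counts.most_common() if n >= min_count]


# Toy input standing in for tagger output on one paper.
sample_entities = [
    ("Internal Revenue Service", "ORG"),
    ("Census Bureau", "ORG"),
    ("Internal Revenue Service", "ORG"),
    ("2018", "DATE"),
    ("Census Bureau", "ORG"),
    ("Internal Revenue Service", "ORG"),
    ("Urban Institute", "ORG"),
]
tags = frequent_organizations(sample_entities)
```

A count threshold is only one way to define "frequent"; a share of all organization mentions would work equally well in this sketch.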
Similar Working Papers
Similarity between working papers is determined by an unsupervised neural
network model
known as Doc2Vec.
Doc2Vec is a model that represents entire documents as fixed-length vectors, allowing for the
capture of semantic meaning in a way that relates to the context of words within the document. The model learns to
associate a unique vector with each document while simultaneously learning word vectors, enabling tasks such as
document classification, clustering, and similarity detection by preserving the order and structure of words. The
document vectors are compared using cosine similarity/distance to determine the most similar working papers.
Papers identified with 🔥 are in the top 20% of similarity.
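The comparison step described above can be sketched as follows. This is a minimal sketch that assumes the document vectors have already been learned (for example by a Doc2Vec model) and are available as plain lists of floats. The function names, the toy vectors, and the reading of "top 20%" as the top fifth of all candidate papers by score are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_u * norm_v)


def rank_similar(target_id, doc_vectors, top_n=10, hot_share=0.2):
    """Return up to top_n (doc_id, score, is_hot) tuples, most similar first.

    is_hot marks papers in the top `hot_share` of all candidates by score
    (an assumed interpretation of the "top 20%" flag).
    """
    target = doc_vectors[target_id]
    scored = [
        (doc_id, cosine_similarity(target, vec))
        for doc_id, vec in doc_vectors.items()
        if doc_id != target_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    hot_count = max(1, int(len(scored) * hot_share)) if scored else 0
    return [
        (doc_id, score, rank < hot_count)
        for rank, (doc_id, score) in enumerate(scored[:top_n])
    ]


# Toy two-dimensional vectors; real document vectors have many more dimensions.
vectors = {
    "A": [1.0, 0.0],
    "B": [1.0, 0.1],
    "C": [0.0, 1.0],
    "D": [-1.0, 0.0],
    "E": [1.0, 1.0],
    "F": [0.5, -1.0],
}
ranking = rank_similar("A", vectors)
```

Cosine similarity depends only on the angle between vectors, so documents of different lengths can be compared without further normalization.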
The 10 most similar working papers to the working paper 'Introducing the Medical Expenditure Panel Survey-Insurance Component with Administrative Records (MEPS-ICAR): Description, Data Construction Methodology, and Quality Assessment' are listed below in order of similarity.
-
Working Paper: National Experimental Wellbeing Statistics - Version 1
February 2023
Working Paper Number:
CES-23-04
This is the U.S. Census Bureau's first release of the National Experimental Wellbeing Statistics (NEWS) project. The NEWS project aims to produce the best possible estimates of income and poverty given all available survey and administrative data. We link survey, decennial census, administrative, and third-party data to address measurement error in income and poverty statistics. We estimate improved (pre-tax money) income and poverty statistics for 2018 by addressing several possible sources of bias documented in prior research. We address biases from 1) unit nonresponse through improved weights, 2) missing income information in both survey and administrative data through improved imputation, and 3) misreporting by combining or replacing survey responses with administrative information. Reducing survey error substantially affects key measures of well-being: We estimate median household income is 6.3 percent higher than in survey estimates, and poverty is 1.1 percentage points lower. These changes are driven by subpopulations for which survey error is particularly relevant. For householders aged 65 and over, median household income is 27.3 percent higher and poverty is 3.3 percentage points lower than in survey estimates. We do not find a significant impact on median household income for householders under 65 or on child poverty. Finally, we discuss plans for future releases: addressing other potential sources of bias, releasing additional years of statistics, extending the income concepts measured, and including smaller geographies such as state and county.
-
Working Paper: A Comparison of Employee Benefits Data from the MEPS-IC and Form 5500
September 2008
Working Paper Number:
CES-08-32
This paper compares data on employers' health and pension offerings from two sources: publicly available administrative data from Form 5500 filings and survey data from the Insurance Component of the Medical Expenditure Panel Survey (MEPS-IC). The basic findings are that the 5500 filings cover too few health plans to be very useful as a substitute or supplement to the MEPS-IC measure of whether or not employers offer health insurance. The pension information in the 5500 filings is potentially more useful as a supplement to the MEPS-IC for research purposes where additional pension information would be useful in studying employers' decisions to offer health insurance.
-
Working Paper: Employer-Sim Microsimulation Model: Model Development and Application to Estimation of Tax Subsidies to Health Insurance
December 2014
Working Paper Number:
CES-14-46
Employment-related health coverage is the predominant form of health insurance in the nonelderly US population. Developing sound policies regarding the tax treatment of employer-sponsored insurance requires detailed information on the insurance benefits offered by employers as well as detailed information on the characteristics of employees and their families. Unfortunately, no nationally representative data set contains all of the necessary elements. This paper describes the development of the Employer-Sim model, which models tax-based health policies by using data on workers from the Medical Expenditure Panel Survey Household Component (MEPS HC) to form synthetic workforces for each establishment in the Medical Expenditure Panel Survey Insurance Component (MEPS IC). This paper describes the application of Employer-Sim to estimating tax subsidies to employer-sponsored health insurance and presents estimates of the cost and incidence of the subsidy for 2008. The paper concludes by discussing other potential applications of the Employer-Sim model.
-
Working Paper: New Uses of Health and Pension Information
January 2002
Working Paper Number:
tp-2002-03
-
Working Paper: Investigating the Use of Administrative Records in the Consumer Expenditure Survey
March 2018
Working Paper Number:
carra-2018-01
In this paper, we investigate the potential of applying administrative records income data to the Consumer Expenditure (CE) survey to inform measurement error properties of CE estimates, supplement respondent-collected data, and estimate the representativeness of the CE survey by income level. We match individual responses to Consumer Expenditure Quarterly Interview Survey data collected from July 2013 through December 2014 to IRS administrative data in order to analyze CE questions on wages, social security payroll deductions, self-employment income receipt and retirement income. We find that while wage amounts are largely in alignment between the CE and administrative records in the middle of the wage distribution, there is evidence that wages are over-reported to the CE at the bottom of the wage distribution and under-reported at the top of the wage distribution. We find mixed evidence for alignment between the CE and administrative records on questions covering payroll deductions and self-employment income receipt, but find substantial divergence between CE responses and administrative records when examining retirement income. In addition to the analysis using person-based linkages, we also match responding and non-responding CE sample units to the universe of IRS 1040 tax returns by address to examine non-response bias. We find that non-responding households are substantially richer than responding households, and that very high income households are less likely to respond to the CE.
-
Working Paper: Describing the Form 5500-Business Register Match
January 2003
Working Paper Number:
tp-2003-05
-
Working Paper: Matching Compustat Data to the Longitudinal Business Database, 1976-2020
September 2025
Working Paper Number:
CES-25-65
This paper details the methodology for creating an updated Compustat-Longitudinal Business Database (LBD) bridge, facilitating linkage between company identifiers in Compustat and firm identifiers in the LBD. In addition to data from Compustat, we incorporate historical data on public companies from various public and private sources, including information on executive names. Our methodology involves a series of stages using fuzzy name and address matching, including EIN, telephone number, and industry code matching. Qualified researchers with approved proposals can access this bridge through the Federal Statistical Research Data Centers. The Compustat-SSL bridge serves as a crucial resource for longitudinal studies on U.S. businesses, corporate governance, and executive compensation.
-
Working Paper: Earnings Through the Stages: Using Tax Data to Test for Sources of Error in CPS ASEC Earnings and Inequality Measures
September 2024
Working Paper Number:
CES-24-52
In this paper, I explore the impact of generalized coverage error, item non-response bias, and measurement error on measures of earnings and earnings inequality in the CPS ASEC. I match addresses selected for the CPS ASEC to administrative data from 1040 tax returns. I then compare earnings statistics in the tax data for wage and salary earnings in samples corresponding to seven stages of the CPS ASEC survey production process. I also compare the statistics using the actual survey responses. The statistics I examine include mean earnings, the Gini coefficient, percentile earnings shares, and shares of the survey weight for a range of percentiles. I examine how the accuracy of the statistics calculated using the survey data is affected by including imputed responses for both those who did not respond to the full CPS ASEC and those who did not respond to the earnings question. I find that generalized coverage error and item nonresponse bias are dominated by measurement error, and that an important aspect of measurement error is households reporting no wage and salary earnings in the CPS ASEC when there are such earnings in the tax data. I find that the CPS ASEC sample misses earnings at the high end of the distribution from the initial selection stage and that the final survey weights exacerbate this.
-
Working Paper: Estimating the Costs of Covering Dependents through Employer-Sponsored Plans
January 2017
Working Paper Number:
CES-17-48
Several health reform microsimulation models use synthetic firms to estimate how changes in federal and state policies will affect employers' offers of health insurance, as well as the price of health insurance for workers and firms. These models typically rely on distinct measures of the average costs of single and dependent coverage, for employees and employers, which do not capture the joint distribution of these costs. Since some firms pay a large share of the premium for single policies but a lower share for dependent coverage, or the reverse, simulation models that do not account for the joint distribution of premium costs may not be sufficient to answer certain policy questions. To address this issue, we developed a method to extract estimates of the joint distribution of employer and employee costs of health insurance coverage from the Medical Expenditure Panel Survey - Insurance Component (MEPS-IC). This paper describes how these distributions were constructed and how they were incorporated into the Urban Institute's Health Insurance Policy Simulation Model (HIPSM). The estimates presented in this paper and those available in supplementary datasets may be useful for other simulation models that need to utilize information on the joint distribution of single and dependent employee premium contributions.
-
Working Paper: The Composition of Firm Workforces from 2006-2022: Findings from the Business Dynamics Statistics of Human Capital Experimental Product
April 2025
Working Paper Number:
CES-25-20
We introduce the Business Dynamics Statistics of Human Capital (BDS-HC) tables, a new Census Bureau experimental product that provides public-use statistics on the workforce composition of firms and its relationship to business dynamics. We use administrative W-2 filings to combine population-level worker demographic data with longitudinal business data to estimate the demographic and educational composition of nearly all non-farm employer businesses in the United States between 2006 and 2022. We use this newly constructed data to document the evolution of employment, entry, and exit of employers based on their workforce compositions. We also provide new statistics on the interaction between firm and worker characteristics, including the composition of workers at startup firms. We find substantial changes between 2006 and 2022 in the distribution of employers along several dimensions, primarily driven by changing workforce compositions within continuing firms rather than the reallocation of employment between firms. We also highlight systematic differences in the business dynamics of firms by their workforce compositions, suggesting that different groups of workers face different economic environments due to their employers.