CREAT - Census Bureau

Estimating Measurement Error in SIPP Annual Job Earnings: A Comparison of Census Survey and SSA Administrative Data

September 2002

Written by: Martha Stinson

Working Paper Number:

tp-2002-24

Abstract

The third chapter investigates measurement error in SIPP annual job earnings data linked to SSA administrative earnings data. The multiple earnings measures provided by the survey and administrative data enable the identification of components of true variation and variation due to measurement error. We find that 18% of the variation in SIPP annual job earnings can be attributed to measurement error. We also find that in both the SIPP and the SSA Detailed Earnings Records (DER), measurement error is persistent over time. A lower level of auto-correlation in the SIPP measurement error than in the economic error component leads to a lower reliability ratio of 0.62 for first-differenced earnings.

Document Tags and Keywords

Keywords:

estimation, economist, econometric, estimating, statistical, estimator, earnings, employed, yearly, statistician, regression, autoregressive, wage regressions, regressing, survey income, assessing, employment earnings

Tags:

Internal Revenue Service, Social Security Administration, National Science Foundation, Current Population Survey, Employer Identification Numbers, Survey of Income and Program Participation, Cornell University, Social Security, Social Security Number, Alfred P Sloan Foundation, PSID, LEHD Program, Business Register, Detailed Earnings Records, Computer Assisted Personal Interview, Master Earnings File

Similar Working Papers

The 10 most similar working papers to the working paper 'Estimating Measurement Error in SIPP Annual Job Earnings: A Comparison of Census Survey and SSA Administrative Data' are listed below in order of similarity.

Working Paper

Estimating Measurement Error in SIPP Annual Job Earnings: A Comparison of Census Bureau Survey and SSA Administrative Data

July 2011

Authors: John M. Abowd, Martha Stinson

Working Paper Number:

CES-11-20

We quantify sources of variation in annual job earnings data collected by the Survey of Income and Program Participation (SIPP) to determine how much of the variation is the result of measurement error. Jobs reported in the SIPP are linked to jobs reported in an administrative database, the Detailed Earnings Records (DER) drawn from the Social Security Administration's Master Earnings File, a universe file of all earnings reported on W-2 tax forms. As a result of the match, each job potentially has two earnings observations per year: survey and administrative. Unlike previous validation studies, both of these earnings measures are viewed as noisy measures of some underlying true amount of annual earnings. While the existence of survey error resulting from respondent mistakes or misinterpretation is widely accepted, the idea that administrative data are also error-prone is new. Possible sources of employer reporting error, employee under-reporting of compensation such as tips, and general differences between how earnings may be reported on tax forms and in surveys necessitate discarding the assumption that administrative data are a true measure of the quantity that the survey was designed to collect. In addition, errors in matching SIPP and DER jobs, a necessary task in any use of administrative data, also contribute to measurement error in both earnings variables. We begin by comparing SIPP and DER earnings for different demographic and education groups of SIPP respondents. We also calculate different measures of changes in earnings for individuals switching jobs. We estimate a standard earnings equation model using SIPP and DER earnings and compare the resulting coefficients.
Finally, exploiting the presence of individuals with multiple jobs and shared employers over time, we estimate an econometric model that includes random person and firm effects, a common error component shared by SIPP and DER earnings, and two independent error components that represent the variation unique to each earnings measure. We compare the variance components from this model and consider how the DER and SIPP differ across unobservable components.
View Full Paper PDF
Working Paper

Estimating the Relationship between Employer-Provided Health Insurance, Worker Mobility, and Wages

September 2002

Authors: Martha Stinson

Working Paper Number:

tp-2002-23

In this paper, a joint model of wages, hazard of a job ending, and probability of holding employer-provided health insurance is estimated, taking account of unobservable person and job characteristics. A unique data source, the 1990 and 1996 SIPP Panels linked to SSA administrative job histories, enables the identification of random person and job effects and the correlation of these effects across the three equations. The explicit modeling of this correlation produces consistent estimates of the effect of tenure on wages and the effect of health insurance on mobility. Substantial levels of job-lock and significant annual returns to seniority are found. Increasing the job-specific probability of obtaining employer-provided health insurance from 60% to 63%, or increasing the job-specific hourly wage rate by $0.80, are both associated with an equivalent decrease in the hazard of the job ending. However, the dollar value of the wage benefit is substantially higher.
View Full Paper PDF
Working Paper

The Sensitivity of Economic Statistics to Coding Errors in Personal Identifiers

October 2002

Authors: Lars Vilhuber, John M. Abowd

Working Paper Number:

tp-2002-17

In this paper, we describe the sensitivity of small-cell flow statistics to coding errors in the identity of the underlying entities. Specifically, we present results based on a comparison of the U.S. Census Bureau's Quarterly Workforce Indicators (QWI) before and after correcting for such errors in SSN-based identifiers in the underlying individual wage records. The correction used involves a novel application of existing statistical matching techniques. It is found that even a very conservative correction procedure has a sizable impact on the statistics. The average bias ranges from 0.25 percent up to 15 percent for flow statistics, and up to 5 percent for payroll aggregates.
View Full Paper PDF
Working Paper

Is it Who You Are, Where You Work, or With Whom You Work? Reassessing the Relationship Between Skill Segregation and Wage Inequality

June 2002

Authors: Paul A. Lengermann

Working Paper Number:

tp-2002-10

In a recent paper, Kremer & Maskin (QJE, forthcoming) develop an assignment model in which increases in the dispersion and mean of the skill distribution can lead simultaneously to increases in wage inequality and skill segregation. They then present evidence that, concurrent with rising wage inequality, wage segregation increased for production workers in the United States between 1975 and 1986. My paper argues that relying on wages as a proxy for skill may be problematic. Using a newly developed longitudinal dataset linking virtually the entire universe of workers in the state of Illinois to their employers, I decompose wages into components due, not only to person and firm heterogeneity, but also to the characteristics of their co-workers. Such "co-worker effects" capture the impact of a weighted sum of the characteristics of all workers in a firm on each individual employee's wage. While rising wage segregation can result from greater skill segregation, it may also be due to changes in the variance of co-worker effects in the economy, or to changes in the covariance between the person, firm, and co-worker components of wages. Due to the limited availability of demographic information on workers, I rely on the person specific component of wages to proxy for co-worker "skills." Because these person effects are unknown ex ante, I implement an iterative estimation approach where they are first obtained from a preliminary regression that excludes any role for co-workers. Because virtually all person and firm effects are identified, the approach yields consistent estimates of the co-worker parameters. My estimates imply that a one standard deviation increase in both a firm's average person effect and experience level is associated, on average, with wage increases of 3% to 5%. Firms that increase the wage premia they pay workers appear to do so in conjunction with upgrading worker quality. 
Interestingly, the average effect masks considerable variation in the relative importance of co-workers across industries. After allowing the co-worker parameters to vary across 2-digit industries, I find that industry average co-worker effects explain 26% of observed inter-industry wage differentials. Finally, I decompose the overall distribution of wages into components due to persons, firms, and co-workers. While co-worker effects do indeed serve to exacerbate wage inequality, the tendency for high and low skilled workers to sort non-randomly into firms plays a considerably more prominent role.
View Full Paper PDF
Working Paper

Estimating the "True" Cost of Job Loss: Evidence Using Matched Data from California 1991-2000

June 2009

Authors: Andrew KG Hildreth, Till von Wachter, Elizabeth Handwerker

Working Paper Number:

CES-09-14

Estimates of the cost of job displacement from survey and administrative data differ markedly. This paper uses a unique match of data between the Displaced Worker Survey (DWS) and administrative wage records from California to examine the sources of this discrepancy. When we use similar estimation methods and account for measurement error in survey wages correlated with worker demographics, estimates of earnings losses at displacement are similar from both datasets and significantly larger than those based on the DWS alone. Also correcting for measurement errors in reported displacements suggests both sources of such estimates may yield lower bounds for the true cost of displacement.
View Full Paper PDF
Working Paper

Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning

November 2021

Authors: Kristin McCue, John M. Abowd, Matthew D. Shapiro, Trivellore Raghunathan, Margaret C. Levenstein, Joelle Abramowitz, Dhiren Patki, Ann M. Rodgers, Nada Wasi, Dawn Zinsser

Working Paper Number:

CES-21-35

This paper considers the problem of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across establishments is highly skewed. To address these difficulties, this paper develops a probabilistic record linkage methodology that combines machine learning (ML) with multiple imputation (MI). This ML-MI methodology is applied to link survey respondents in the Health and Retirement Study to their workplaces in the Census Business Register. The linked data reveal new evidence that non-sampling errors in household survey data are correlated with respondents' workplace characteristics.
View Full Paper PDF
Working Paper

Optimal Probabilistic Record Linkage: Best Practice for Linking Employers in Survey and Administrative Data

March 2019

Authors: Kristin McCue, John M. Abowd, Matthew D. Shapiro, Trivellore Raghunathan, Margaret C. Levenstein, Joelle Abramowitz, Dhiren Patki, Ann M. Rodgers, Nada Wasi

Working Paper Number:

CES-19-08

This paper illustrates an application of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across firms is highly asymmetric. To address these difficulties, this paper uses a supervised machine learning model to probabilistically link survey respondents in the Health and Retirement Study (HRS) with employers and establishments in the Census Business Register (BR) to create a new data source which we call the CenHRS. Multiple imputation is used to propagate uncertainty from the linkage step into subsequent analyses of the linked data. The linked data reveal new evidence that survey respondents' misreporting and selective nonresponse about employer characteristics are systematically correlated with wages.
View Full Paper PDF
Working Paper

Comparing Measures of Earnings Instability Based on Survey and Administrative Reports

August 2010

Authors: Kristin McCue, Chinhui Juhn

Working Paper Number:

CES-10-15

In Celik, Juhn, McCue, and Thompson (2009), we found that estimated levels of earnings instability based on data from the Current Population Survey (CPS) and the Survey of Income and Program Participation (SIPP) were reasonably close to each other and to others' estimates from the Panel Study of Income Dynamics (PSID), but estimates from unemployment insurance (UI) earnings were much larger. Given that the UI data are from administrative records which are often posited to be more accurate than survey reports, this raises concerns that measures based on survey data understate true earnings instability. To address this, we use links between survey samples from the SIPP and UI earnings records in the LEHD database to identify sources of differences in work history and earnings information. Substantial work has been done comparing earnings levels from administrative records to those collected in the SIPP and CPS, but our understanding of earnings instability would benefit from further examination of differences across sources in the properties of changes in earnings. We first compare characteristics of the overall and matched samples to address issues of selection in the matching process. We then compare earnings levels and jobs in the SIPP and LEHD data to identify differences between them. Finally we begin to examine how such differences affect estimates of earnings instability. Our preliminary findings suggest that differences in earnings changes for those in the lower tail of the earnings distribution account for much of the difference in instability estimates.
View Full Paper PDF
Working Paper

Modeling Labor Markets with Heterogeneous Agents and Matches

May 2002

Authors: Simon Woodcock

Working Paper Number:

tp-2002-19

I present a matching model with heterogeneous workers, firms, and worker-firm matches. The model generalizes the seminal Jovanovic (1979) model to the case of heterogeneous agents. The equilibrium wage is linear in a person-specific component, a firm-specific component, and a match-specific component that varies with tenure. Under certain conditions, the equilibrium wage takes a simpler structure where the match-specific component does not vary with tenure. I discuss fixed- and mixed-effect methods for estimating wage models with this structure on longitudinal linked employer-employee data. The fixed effect specification relies on restrictive identification conditions, but is feasible for very large databases. The mixed model requires less restrictive identification conditions, but is feasible only on relatively small databases. Both the fixed and mixed models generate empirical person, firm, and match effects with characteristics that are consistent with predictions from the matching model; the mixed model more so than the fixed model. Shortcomings of the fixed model appear to be artifacts of the identification conditions.
View Full Paper PDF
Working Paper

How long do early career decisions follow women? The impact of industry and firm size history on the gender and motherhood wage gaps

January 2018

Authors: Martha Stinson, Holly Monti, Lori Reeder

Working Paper Number:

CES-18-05

We add to the gender wage gap literature by considering how characteristics of past employers are correlated with current wages and whether differences between the work histories of men and women are related to the persistent gender wage gap. Our hypothesis is that women have spent less time over the course of their careers in higher paying industries and have less job- and industry-specific human capital and that these characteristics are correlated with male-female earnings differences. Additionally, we expect that difference in the work histories between women with children and childless women might help explain the observed motherhood wage gap. We use unique administrative employer history data to conduct a standard decomposition exercise to determine the impact of differences in observable job history characteristics on the gender and motherhood wage gaps. We find that industry work history has two opposing effects on both these wage gaps. The distribution of work experience across industries contributes to increasing the wage gaps, but the share of experience spent in the industry sector of the current job works to decrease earnings differences.
View Full Paper PDF