CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'employee data'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

Longitudinal Employer Household Dynamics - 21

Bureau of Labor Statistics - 13

National Science Foundation - 13

Cornell University - 13

Internal Revenue Service - 11

Employer Identification Numbers - 11

Alfred P Sloan Foundation - 11

Quarterly Workforce Indicators - 11

Current Population Survey - 11

Social Security Administration - 10

Center for Economic Studies - 10

Research Data Center - 10

Quarterly Census of Employment and Wages - 10

Unemployment Insurance - 9

Social Security Number - 9

Survey of Income and Program Participation - 9

Standard Industrial Classification - 8

Service Annual Survey - 8

Protected Identification Key - 8

Longitudinal Business Database - 7

North American Industry Classification System - 7

National Institute on Aging - 7

Employer Characteristics File - 7

Local Employment Dynamics - 7

Business Register - 7

American Community Survey - 6

Employment History File - 6

Individual Characteristics File - 6

Core Based Statistical Area - 6

Office of Personnel Management - 6

Decennial Census - 6

Social Security - 6

Metropolitan Statistical Area - 5

Successor Predecessor File - 5

Business Employment Dynamics - 5

Business Dynamics Statistics - 5

LEHD Program - 5

University of Chicago - 4

Master Address File - 4

Federal Tax Information - 4

Standard Statistical Establishment List - 4

Disclosure Review Board - 4

National Bureau of Economic Research - 4

University of Michigan - 4

PSID - 4

Cornell Institute for Social and Economic Research - 4

Composite Person Record - 3

Person Validation System - 3

Federal Statistical Research Data Center - 3

American Economic Association - 3

Review of Economics and Statistics - 3

American Economic Review - 3

Journal of Labor Economics - 3

Business Master File - 3

Sloan Foundation - 3

American Housing Survey - 3

Business Register Bridge - 3

Probability Density Function - 3

Department of Labor - 3

National Longitudinal Survey of Youth - 3

Department of Economics - 3

Survey of Consumer Finances - 3

CDF - 3

Cumulative Density Function - 3

Viewing papers 11 through 20 of 23


  • Working Paper

    Estimation of Job-to-Job Flow Rates under Partially Missing Geography

    September 2012

    Working Paper Number:

    CES-12-29

    Integration of data from different regions presents challenges for the calculation of entity-level longitudinal statistics with a strong geographic component: for example, movements between employers, migration, business dynamics, and health statistics. In this paper, we consider the estimation of worker-level employment statistics when the geographies (in our application, US states) over which such measures are defined are partially missing. We focus on the recent pilot set of job-to-job flow statistics produced by the US Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) program, which measure the frequency of worker movements between jobs and into and out of nonemployment. LEHD's coverage of the labor force gradually increases during the 1990s and 2000s because some states have a longer time series than others, so employment transitions involving missing states are only partially or not at all observed. We propose and implement a method for estimating national-level job-to-job flow statistics that involves dropping observed states to recover the relationship between missing states and directly tabulated job-to-job flow rates. Using the estimated relationship between the observable characteristics of the missing states and changes in the employment measures, we provide estimates of the rates of job-to-job, job-to-nonemployment, and job-to-nonemployment-to-job flows that would prevail were all states uniformly available.
    View Full Paper PDF
  • Working Paper

    Estimating Measurement Error in SIPP Annual Job Earnings: A Comparison of Census Bureau Survey and SSA Administrative Data

    July 2011

    Working Paper Number:

    CES-11-20

    We quantify sources of variation in annual job earnings data collected by the Survey of Income and Program Participation (SIPP) to determine how much of the variation is the result of measurement error. Jobs reported in the SIPP are linked to jobs reported in an administrative database, the Detailed Earnings Records (DER) drawn from the Social Security Administration's Master Earnings File, a universe file of all earnings reported on W-2 tax forms. As a result of the match, each job potentially has two earnings observations per year: survey and administrative. Unlike previous validation studies, both of these earnings measures are viewed as noisy measures of some underlying true amount of annual earnings. While the existence of survey error resulting from respondent mistakes or misinterpretation is widely accepted, the idea that administrative data are also error-prone is new. Possible sources of employer reporting error, employee under-reporting of compensation such as tips, and general differences between how earnings may be reported on tax forms and in surveys necessitate discarding the assumption that administrative data are a true measure of the quantity that the survey was designed to collect. In addition, errors in matching SIPP and DER jobs, a necessary task in any use of administrative data, also contribute to measurement error in both earnings variables. We begin by comparing SIPP and DER earnings for different demographic and education groups of SIPP respondents. We also calculate different measures of changes in earnings for individuals switching jobs. We estimate a standard earnings equation model using SIPP and DER earnings and compare the resulting coefficients. Finally, exploiting the presence of individuals with multiple jobs and shared employers over time, we estimate an econometric model that includes random person and firm effects, a common error component shared by SIPP and DER earnings, and two independent error components that represent the variation unique to each earnings measure. We compare the variance components from this model and consider how the DER and SIPP differ across unobservable components.
    View Full Paper PDF
  • Working Paper

    LEHD Infrastructure Files in the Census RDC: Overview of S2004 Snapshot

    April 2011

    Working Paper Number:

    CES-11-13

    The Longitudinal Employer-Household Dynamics (LEHD) Program at the U.S. Census Bureau, with the support of several national research agencies, has built a set of infrastructure files using administrative data provided by state agencies, enhanced with information from other administrative data sources, demographic and economic (business) surveys and censuses. The LEHD Infrastructure Files provide a detailed and comprehensive picture of workers, employers, and their interaction in the U.S. economy. This document describes the structure and content of the 2004 Snapshot of the LEHD Infrastructure files as they are made available in the Census Bureau's Research Data Center network.
    View Full Paper PDF
  • Working Paper

    Exploring Differences in Employment between Household and Establishment Data

    April 2009

    Working Paper Number:

    CES-09-09

    Using a large data set that links individual Current Population Survey (CPS) records to employer-reported administrative data, we document substantial discrepancies in basic measures of employment status that persist even after controlling for known definitional differences between the two data sources. We hypothesize that reporting discrepancies should be most prevalent for marginal workers and marginal jobs, and find systematic associations between the incidence of reporting discrepancies and observable person and job characteristics that are consistent with this hypothesis. The paper discusses the implications of the reported findings for both micro and macro labor market analysis.
    View Full Paper PDF
  • Working Paper

    Access Methods for United States Microdata

    August 2007

    Working Paper Number:

    CES-07-25

    Beyond the traditional methods of tabulations and public-use microdata samples, statistical agencies have developed four key alternatives for providing non-government researchers with access to confidential microdata to improve statistical modeling. The first, licensing, allows qualified researchers access to confidential microdata at their own facilities, provided certain security requirements are met. The second, statistical data enclaves, offer qualified researchers restricted access to confidential economic and demographic data at specific agency-controlled locations. Third, statistical agencies can offer remote access, through a computer interface, to the confidential data under automated or manual controls. Fourth, synthetic data developed from the original data but retaining the correlations in the original data have the potential for allowing a wide range of analyses.
    View Full Paper PDF
  • Working Paper

    Distribution Preserving Statistical Disclosure Limitation

    September 2006

    Working Paper Number:

    tp-2006-04

    One approach to limiting disclosure risk in public-use microdata is to release multiply-imputed, partially synthetic data sets. These are data on actual respondents, but with confidential data replaced by multiply-imputed synthetic values. A mis-specified imputation model can invalidate inferences because the distribution of synthetic data is completely determined by the model used to generate them. We present two practical methods of generating synthetic values when the imputer has only limited information about the true data generating process. One is applicable when the true likelihood is known up to a monotone transformation. The second requires only limited knowledge of the true likelihood, but nevertheless preserves the conditional distribution of the confidential data, up to sampling error, on arbitrary subdomains. Our method maximizes data utility and minimizes incremental disclosure risk up to posterior uncertainty in the imputation model and sampling error in the estimated transformation. We validate the approach with a simulation and application to a large linked employer-employee database.
    View Full Paper PDF
  • Working Paper

    Integrated Longitudinal Employee-Employer Data for the United States

    May 2004

    Working Paper Number:

    tp-2004-02

    View Full Paper PDF
  • Working Paper

    The 1990 Decennial Employer-Employee Dataset

    October 2002

    Working Paper Number:

    CES-02-23

    We describe the construction and assessment of a new matched employer-employee data set, the 1990 Decennial Employer-Employee Dataset (1990 DEED). By using place of work name and address, we link workers from the 1990 Long Form Sample to their place of work in the 1990 Standard Statistical Establishment List. The resulting data set is much larger and more representative across regional and industry dimensions than previous matched data sets for the United States. The known strengths and limitations of the data set are discussed in detail.
    View Full Paper PDF
  • Working Paper

    Agent Heterogeneity and Learning: An Application to Labor Markets

    October 2002

    Authors: Simon Woodcock

    Working Paper Number:

    tp-2002-20

    I develop a matching model with heterogeneous workers, firms, and worker-firm matches, and apply it to longitudinal linked data on employers and employees. Workers vary in their marginal product when employed and their value of leisure when unemployed. Firms vary in their marginal product and cost of maintaining a vacancy. The marginal product of a worker-firm match also depends on a match-specific interaction between worker and firm that I call match quality. Agents have complete information about worker and firm heterogeneity, and symmetric but incomplete information about match quality. They learn its value slowly by observing production outcomes. There are two key results. First, under a Nash bargain, the equilibrium wage is linear in a person-specific component, a firm-specific component, and the posterior mean of beliefs about match quality. Second, in each period the separation decision depends only on the posterior mean of beliefs and person and firm characteristics. These results have several implications for an empirical model of earnings with person and firm effects. The first implies that residuals within a worker-firm match are a martingale; the second implies the distribution of earnings is truncated. I test predictions from the matching model using data from the Longitudinal Employer-Household Dynamics (LEHD) Program at the US Census Bureau. I present both fixed and mixed model specifications of the equilibrium wage function, taking account of structural aspects implied by the learning process. In the most general specification, earnings residuals have a completely unstructured covariance within a worker-firm match. I estimate and test a variety of more parsimonious error structures, including the martingale structure implied by the learning process. I find considerable support for the matching model in these data.
    View Full Paper PDF
  • Working Paper

    The Sensitivity of Economic Statistics to Coding Errors in Personal Identifiers

    October 2002

    Working Paper Number:

    tp-2002-17

    In this paper, we describe the sensitivity of small-cell flow statistics to coding errors in the identity of the underlying entities. Specifically, we present results based on a comparison of the U.S. Census Bureau's Quarterly Workforce Indicators (QWI) before and after correcting for such errors in SSN-based identifiers in the underlying individual wage records. The correction used involves a novel application of existing statistical matching techniques. It is found that even a very conservative correction procedure has a sizable impact on the statistics. The average bias ranges from 0.25 percent up to 15 percent for flow statistics, and up to 5 percent for payroll aggregates.
    View Full Paper PDF