The Census Bureau collects industry information through surveys and administrative data and creates associated public-use statistics. In this paper, we compare person-reported industry in the American Community Survey (ACS) to employer-reported industry from the Quarterly Census of Employment and Wages (QCEW) that is part of the Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) program. This research provides necessary information on the use of administrative data as a supplement to survey-based industry information, and the findings will be useful for anyone using industry information from either source. Our project is part of a larger effort to compare information on jobs from household survey data to employer-reported information. This research is the first to compare ACS job data to firm-based administrative data. We find an overall industry sector match rate of 75 percent, and a 61 percent match rate at the 4-digit Census Industry Code (CIC) level. Industry match rates vary by sector and by whether industry sector is classified using ACS or LEHD industry information. The educational services and health care and social assistance sectors have among the highest match rates. The management of companies and enterprises sector has the lowest match rate, using either ACS-reported or LEHD-reported sector. For individuals with imputed industry data, the industry sector match rate is only 14 percent. Our findings suggest that the industry distribution and the sample in a particular industry sector will differ depending on whether ACS or LEHD data are used.
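As an illustration of what a sector match rate measures, the following minimal sketch computes an overall rate and a per-sector rate from linked records. The record layout (pairs of ACS-reported and LEHD-reported sector codes) and the function name are assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict


def match_rates(records):
    """Compute overall and per-sector match rates.

    records: iterable of (acs_sector, lehd_sector) pairs, one per linked job.
    This layout is a hypothetical stand-in for the linked ACS-LEHD file.
    Per-sector rates are keyed by the ACS-reported sector.
    """
    total = 0
    matched = 0
    by_acs = defaultdict(lambda: [0, 0])  # sector -> [matches, count]
    for acs, lehd in records:
        hit = int(acs == lehd)
        total += 1
        matched += hit
        by_acs[acs][0] += hit
        by_acs[acs][1] += 1
    overall = matched / total if total else float("nan")
    per_sector = {sector: m / n for sector, (m, n) in by_acs.items()}
    return overall, per_sector
```

Keying the per-sector rates by the LEHD-reported sector instead would generally give different values, which is the sense in which match rates depend on which source is used to classify the sector.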
-
Social, Economic, Spatial, and Commuting Patterns of Dual Jobholders
April 2007
Working Paper Number:
tp-2007-01
Individuals who hold multiple jobs have complex working lives and complex commuting
patterns. Economic and spatial information on these individuals is not readily available in
standard datasets, such as the 2000 Decennial Census Long Form, because the survey questions
were not designed to collect details on multiple jobs. This study takes advantage of firm-based
data from the Unemployment Insurance administrative wage records, linked with the Census
Bureau's household-based data, to examine multiple jobholders - and specifically a sentinel
group of dual jobholders. The study uses a sample from Los Angeles County, California and
examines the dual jobholders by their demographic characteristics as well as their economic,
commuting, and spatial location outcomes. In addition, this report evaluates whether multiple
jobholders should be included explicitly in future labor-workforce analyses and transportation
modeling.
-
Social, Economic, Spatial, and Commuting Patterns of Informal Jobholders
April 2007
Working Paper Number:
tp-2007-02
A significant number of employees within the United States can be considered "informal" or
"off-the-books" workers. These workers, who by definition do not appear in administrative wage
records, are distinct from the larger group of private jobholders who do appear in administrative
records. However, while socioeconomic and spatial information on these individuals is readily
available in standard datasets, such as the 2000 Decennial Census Long Form, it is not possible
to identify the informal workers by only using such data because of the lack of accurate, formal
wage records. This study takes advantage of firm-based data that originates in Unemployment
Insurance administrative wage records linked with the Census Bureau's household-based data in
order to examine informal jobholders by their demographic characteristics as well as their
economic, commuting, and spatial location outcomes. In addition, this report evaluates whether
informal jobholders should be included explicitly in future labor-workforce analyses and
transportation modeling. The analyses in this report use the sample of workers who lived in Los
Angeles County, California.
-
National Experimental Wellbeing Statistics - Version 1
February 2023
Working Paper Number:
CES-23-04
This is the U.S. Census Bureau's first release of the National Experimental Wellbeing Statistics (NEWS) project. The NEWS project aims to produce the best possible estimates of income and poverty given all available survey and administrative data. We link survey, decennial census, administrative, and third-party data to address measurement error in income and poverty statistics. We estimate improved (pre-tax money) income and poverty statistics for 2018 by addressing several possible sources of bias documented in prior research. We address biases from 1) unit nonresponse through improved weights, 2) missing income information in both survey and administrative data through improved imputation, and 3) misreporting by combining or replacing survey responses with administrative information. Reducing survey error substantially affects key measures of well-being: We estimate median household income is 6.3 percent higher than in survey estimates, and poverty is 1.1 percentage points lower. These changes are driven by subpopulations for which survey error is particularly relevant. For householders aged 65 and over, median household income is 27.3 percent higher and poverty is 3.3 percentage points lower than in survey estimates. We do not find a significant impact on median household income for householders under 65 or on child poverty. Finally, we discuss plans for future releases: addressing other potential sources of bias, releasing additional years of statistics, extending the income concepts measured, and including smaller geographies such as state and county.
-
Two Perspectives on Commuting: A Comparison of Home to Work Flows Across Job-Linked Survey and Administrative Files
January 2017
Working Paper Number:
CES-17-34
Commuting flows and workplace employment data have a wide constituency of users including urban and regional planners, social science and transportation researchers, and businesses. The U.S. Census Bureau releases two national data products that give the magnitude and characteristics of home-to-work flows. The American Community Survey (ACS) tabulates households' responses on employment, workplace, and commuting behavior. The Longitudinal Employer-Household Dynamics (LEHD) program tabulates administrative records on jobs in the LEHD Origin-Destination Employment Statistics (LODES). Design differences across the datasets lead to divergence in a comparable statistic: county-to-county aggregate commute flows. To understand differences in the public use data, this study compares ACS and LEHD source files, using identifying information and probabilistic matching to join person and job records. In our assessment, we compare commuting statistics for job frames linked on person, employment status, employer, and workplace and we identify person and job characteristics as well as design features of the data frames that explain aggregate differences. We find a lower rate of within-county commuting and farther commutes in LODES. We attribute these greater distances to differences in workplace reporting and to uncertainty of establishment assignments in LEHD for workers at multi-unit employers. Minor contributing factors include differences in residence location and ACS workplace edits. The results of this analysis and the data infrastructure developed will support further work to understand and enhance commuting statistics in both datasets.
-
Comparing Measures of Earnings Instability Based on Survey and Administrative Reports
August 2010
Working Paper Number:
CES-10-15
In Celik, Juhn, McCue, and Thompson (2009), we found that estimated levels of earnings instability based on data from the Current Population Survey (CPS) and the Survey of Income and Program Participation (SIPP) were reasonably close to each other and to others' estimates from the Panel Study of Income Dynamics (PSID), but estimates from unemployment insurance (UI) earnings were much larger. Given that the UI data are from administrative records which are often posited to be more accurate than survey reports, this raises concerns that measures based on survey data understate true earnings instability. To address this, we use links between survey samples from the SIPP and UI earnings records in the LEHD database to identify sources of differences in work history and earnings information. Substantial work has been done comparing earnings levels from administrative records to those collected in the SIPP and CPS, but our understanding of earnings instability would benefit from further examination of differences across sources in the properties of changes in earnings. We first compare characteristics of the overall and matched samples to address issues of selection in the matching process. We then compare earnings levels and jobs in the SIPP and LEHD data to identify differences between them. Finally, we begin to examine how such differences affect estimates of earnings instability. Our preliminary findings suggest that differences in earnings changes for those in the lower tail of the earnings distribution account for much of the difference in instability estimates.
-
Successor/Predecessor Firms
March 2002
Working Paper Number:
tp-2002-04
The goal of this research was to investigate the value added from using worker flows to identify the spurious births and deaths of businesses. We identify four types of "at risk" businesses from ES202 using the successor/predecessor flag and mimic the same categories using UI wage record data. We use two critical decision rules in the analysis: a successor firm has to have at least 80% of employment coming from the donor firm and (in two of the four categories) at least 5 employees have to come from the donor firm. We examine the sensitivity of the categories based on the percentage definition, and find that the results stay very similar, with the exception of the identification of the pure successor. We examine the sensitivity based on the count threshold, and find that there are enormous differences, particularly with identifying spinoff businesses.
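The two decision rules described above can be sketched roughly as follows. This is a minimal illustration, assuming a simple count of workers moving from a donor firm to a candidate successor; the function and parameter names are hypothetical and do not come from the paper, and the paper's four "at risk" categories are not reproduced here.

```python
def passes_successor_rules(workers_from_donor, successor_employment,
                           share_threshold=0.80, count_threshold=5,
                           apply_count_rule=True):
    """Check the percentage and count rules for a candidate successor firm.

    workers_from_donor: number of the successor's workers who came from the donor.
    successor_employment: total employment at the candidate successor.
    apply_count_rule: the abstract applies the count rule in only two of the
    four categories, so it is optional here (an assumption about usage).
    """
    if successor_employment <= 0:
        return False
    share_ok = workers_from_donor / successor_employment >= share_threshold
    count_ok = (not apply_count_rule) or workers_from_donor >= count_threshold
    return share_ok and count_ok
```

Varying `share_threshold` and `count_threshold` in a sketch like this mirrors the sensitivity checks the abstract describes, where results are reported to be far more sensitive to the count threshold than to the percentage definition.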
-
A New Measure of Multiple Jobholding in the U.S. Economy
September 2020
Working Paper Number:
CES-20-26
We create a measure of multiple jobholding from the U.S. Census Bureau's Longitudinal Employer-Household Dynamics data. This new series shows that 7.8 percent of persons in the U.S. are multiple jobholders, that this percentage is pro-cyclical, and that it has been trending upward during the past twenty years. The data also show that earnings from secondary jobs are, on average, 27.8 percent of a multiple jobholder's total quarterly earnings. Multiple jobholding occurs at all levels of earnings, with both higher- and lower-earnings multiple jobholders earning more than 25 percent of their total earnings from multiple jobs. These new statistics tell us that multiple jobholding is more important in the U.S. economy than we knew.
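To make the secondary-earnings share concrete, here is a minimal sketch of the arithmetic for one person in one quarter. It assumes the primary job is the one with the highest earnings in the quarter and that all other jobs are secondary; that definition and the function name are assumptions for illustration, not details confirmed by the abstract.

```python
def secondary_earnings_share(job_earnings):
    """Share of quarterly earnings that comes from secondary jobs.

    job_earnings: list of earnings, one entry per job held in the quarter.
    Assumes the highest-earning job is the primary job (an assumption).
    Returns None for persons with fewer than two jobs or no earnings.
    """
    total = sum(job_earnings)
    if len(job_earnings) < 2 or total <= 0:
        return None
    primary = max(job_earnings)
    return (total - primary) / total
```

For example, a person earning 7,000 from one job and 3,000 from another in the same quarter would have a secondary share of 0.3 under this definition.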
-
Nonresponse and Coverage Bias in the Household Pulse Survey: Evidence from Administrative Data
October 2024
Working Paper Number:
CES-24-60
The Household Pulse Survey (HPS) conducted by the U.S. Census Bureau is a unique survey that provided timely data on the effects of the COVID-19 Pandemic on American households and continues to provide data on other emergent social and economic issues. Because the survey has a response rate in the single digits and only has an online response mode, there are concerns about nonresponse and coverage bias. In this paper, we match administrative data from government agencies and third-party data to HPS respondents to examine how representative they are of the U.S. population. For comparison, we create a benchmark of American Community Survey (ACS) respondents and nonrespondents and include the ACS respondents as another point of reference. Overall, we find that the HPS is less representative of the U.S. population than the ACS. However, performance varies across administrative variables, and the existing weighting adjustments appear to greatly improve the representativeness of the HPS. Additionally, we look at household characteristics by their email domain to examine the effects on coverage from limiting email messages in 2023 to addresses from the contact frame with at least 90% deliverability rates, finding no clear change in the representativeness of the HPS afterwards.
-
Social, Economic, Spatial, and Commuting Patterns of Self-Employed Jobholders
April 2007
Working Paper Number:
tp-2007-03
A significant number of employees within the United States identify themselves as self-employed,
and they are distinct from the larger group identified as private jobholders. While
socioeconomic and spatial information on these individuals is readily available in standard
datasets, such as the 2000 Decennial Census Long Form, it is possible to gain further information
on their wage earnings by using data from administrative wage records. This study takes
advantage of firm-based data from Unemployment Insurance administrative wage records linked
with the Census Bureau's household-based data in order to examine self-employed jobholders -
both as a whole and as subgroups defined according to their earned wage status - by their
demographic characteristics as well as their economic, commuting, and spatial location
outcomes. Additionally, this report evaluates whether self-employed jobholders and the defined
subgroups should be included explicitly in future labor-workforce analyses and transportation
modeling. The analyses in this report use the sample of self-employed workers who lived in Los
Angeles County, California.
-
Incorporating Administrative Data in Survey Weights for the Basic Monthly Current Population Survey
January 2024
Working Paper Number:
CES-24-02
Response rates to the Current Population Survey (CPS) have declined over time, raising the potential for nonresponse bias in key population statistics. A potential solution is to leverage administrative data from government agencies and third-party data providers when constructing survey weights. In this paper, we take two approaches. First, we use administrative data to build a non-parametric nonresponse adjustment step while leaving the calibration to population estimates unchanged. Second, we use administratively linked data in the calibration process, matching income data from the Internal Revenue Service and state agencies, demographic data from the Social Security Administration and the decennial census, and industry data from the Census Bureau's Business Register to both responding and nonresponding households. We use the matched data in the household nonresponse adjustment of the CPS weighting algorithm, which changes the weights of respondents to account for differential nonresponse rates among subpopulations.
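The general idea of a nonresponse adjustment can be illustrated with a standard weighting-class adjustment, sketched below. This is a textbook technique shown for intuition only, not the paper's non-parametric method or the CPS production algorithm; the cell definitions and field names are hypothetical.

```python
from collections import defaultdict


def adjust_for_nonresponse(units):
    """Weighting-class nonresponse adjustment.

    units: list of dicts with keys 'cell', 'weight', 'responded'
    (hypothetical field names). A cell could be defined from
    administrative variables available for both respondents and
    nonrespondents. Within each cell, respondent weights are scaled
    so that they sum to the weight of all sampled units in the cell.
    Returns adjusted records for respondents only.
    """
    total_weight = defaultdict(float)
    respondent_weight = defaultdict(float)
    for u in units:
        total_weight[u["cell"]] += u["weight"]
        if u["responded"]:
            respondent_weight[u["cell"]] += u["weight"]

    adjusted = []
    for u in units:
        if not u["responded"]:
            continue
        factor = total_weight[u["cell"]] / respondent_weight[u["cell"]]
        adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted
```

Cells with lower response rates receive larger adjustment factors, which is how such a step accounts for differential nonresponse among subpopulations.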
After running the experimental weighting algorithm, we compare estimates of the unemployment rate and labor force participation rate between the experimental weights and the production weights. Before March 2020, estimates of the labor force participation rates using the experimental weights are 0.2 percentage points higher than the original estimates, with minimal effect on the unemployment rate. After March 2020, the new labor force participation rates are similar, but the unemployment rate is about 0.2 percentage points higher in some months during the height of COVID-related interviewing restrictions. These results suggest that if there is any nonresponse bias present in the CPS, the magnitude is comparable to the typical margin of error of the unemployment rate estimate. Additionally, the results are overall similar across demographic groups and states, as well as using alternative weighting methodology. Finally, we discuss how our estimates compare to those from earlier papers that calculate estimates of bias in key CPS labor force statistics.
This paper is for research purposes only. No changes to production are being implemented at this time.