CREAT: Census Research Exploration and Analysis Tool

Papers written by Author(s): 'Matthew D. Shapiro'



Viewing papers 1 through 6 of 6


  • Working Paper

    Quality Adjustment at Scale: Hedonic vs. Exact Demand-Based Price Indices

    June 2023

    Working Paper Number:

    CES-23-26

    This paper explores alternative methods for adjusting price indices for quality change at scale. These methods can be applied to large-scale item-level transactions data that includes information on prices, quantities, and item attributes. The hedonic methods can take into account the changing valuations of both observable and unobservable characteristics in the presence of product turnover. The paper also considers demand-based approaches that take into account changing product quality from product turnover and changing appeal of continuing products. The paper provides evidence of substantial quality-adjustment in prices for a wide range of goods, including both high-tech consumer products and food products.
  • Working Paper

    Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning

    November 2021

    Working Paper Number:

    CES-21-35

    This paper considers the problem of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across establishments is highly skewed. To address these difficulties, this paper develops a probabilistic record linkage methodology that combines machine learning (ML) with multiple imputation (MI). This ML-MI methodology is applied to link survey respondents in the Health and Retirement Study to their workplaces in the Census Business Register. The linked data reveal new evidence that non-sampling errors in household survey data are correlated with respondents' workplace characteristics.
  • Working Paper

    Re-engineering Key National Economic Indicators

    July 2019

    Working Paper Number:

    CES-19-22

    Traditional methods of collecting data from businesses and households face increasing challenges. These include declining response rates to surveys, increasing costs of traditional modes of data collection, and the difficulty of keeping pace with rapid changes in the economy. The digitization of virtually all market transactions offers the potential for re-engineering key national economic indicators. The challenge for the statistical system is how to operate in this data-rich environment. This paper focuses on the opportunities for collecting item-level data at the source and constructing key indicators using measurement methods consistent with such a data infrastructure. Ubiquitous digitization of transactions allows price and quantity to be collected or aggregated simultaneously at the source. This new architecture for economic statistics creates challenges arising from the rapid change in items sold. The paper explores some recently proposed techniques for estimating price and quantity indices in large-scale item-level data. Although those methods display tremendous promise, substantially more research is necessary before they will be ready to serve as the basis for the official economic statistics. Finally, the paper addresses implications for building national statistics from transactions for data collection and for the capabilities and organization of the statistical agencies in the 21st century.
  • Working Paper

    Optimal Probabilistic Record Linkage: Best Practice for Linking Employers in Survey and Administrative Data

    March 2019

    Working Paper Number:

    CES-19-08

    This paper illustrates an application of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across firms is highly asymmetric. To address these difficulties, this paper uses a supervised machine learning model to probabilistically link survey respondents in the Health and Retirement Study (HRS) with employers and establishments in the Census Business Register (BR) to create a new data source which we call the CenHRS. Multiple imputation is used to propagate uncertainty from the linkage step into subsequent analyses of the linked data. The linked data reveal new evidence that survey respondents' misreporting and selective nonresponse about employer characteristics are systematically correlated with wages.
  • Working Paper

    Effects of a Government-Academic Partnership: Has the NSF-Census Bureau Research Network Helped Improve the U.S. Statistical System?

    January 2017

    Working Paper Number:

    CES-17-59R

    The National Science Foundation-Census Bureau Research Network (NCRN) was established in 2011 to create interdisciplinary research nodes on methodological questions of interest and significance to the broader research community and to the Federal Statistical System (FSS), particularly the Census Bureau. The activities to date have covered both fundamental and applied statistical research and have focused at least in part on the training of current and future generations of researchers in skills of relevance to surveys and alternative measurement of economic units, households, and persons. This paper discusses some of the key research findings of the eight nodes, organized into six topics: (1) Improving census and survey data collection methods; (2) Using alternative sources of data; (3) Protecting privacy and confidentiality by improving disclosure avoidance; (4) Using spatial and spatio-temporal statistical modeling to improve estimates; (5) Assessing data cost and quality tradeoffs; and (6) Combining information from multiple sources. It also reports on collaborations across nodes and with federal agencies, new software developed, and educational activities and outcomes. The paper concludes with an evaluation of the ability of the FSS to apply the NCRN's research outcomes and suggests some next steps, as well as the implications of this research-network model for future federal government renewal initiatives.
  • Working Paper

    Using the Survey of Plant Capacity to Measure Capital Utilization

    July 2011

    Working Paper Number:

    CES-11-19

    Most capital in the United States is idle much of the time. By some measures, the average workweek of capital in U.S. manufacturing is as low as 55 hours per 168-hour week. The level and variability of capital utilization have important implications for understanding both the level of production and its cyclical fluctuations. This paper investigates a number of issues relating to aggregation of capital utilization measures from the Survey of Plant Capacity and makes recommendations on expanding and improving the published statistics deriving from the Survey of Plant Capacity. The paper documents a number of facts about properties of capital utilization. First, after growing for decades, capital utilization started to fall in the mid-1990s. Second, capital utilization is a useful predictor of changes in capacity utilization and other factors of production. Third, adjustment of productivity measures for variable capital utilization improves statistical and economic properties of these measures. Fourth, the paper constructs weights to aggregate firm-level capital utilization rates to the industry and economy level, which is the major enhancement to available data.