CREAT: Census Research Exploration and Analysis Tool

Papers Containing Keyword(s): 'analysis'

The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.

Frequently Occurring Concepts within this Search

National Science Foundation - 19

Center for Economic Studies - 18

Service Annual Survey - 13

Longitudinal Employer Household Dynamics - 12

Cornell University - 11

Research Data Center - 11

Internal Revenue Service - 10

American Community Survey - 9

Bureau of Labor Statistics - 9

North American Industry Classification System - 8

Longitudinal Research Database - 8

Survey of Income and Program Participation - 7

Bureau of Economic Analysis - 7

Alfred P Sloan Foundation - 7

Longitudinal Business Database - 6

Census Bureau Disclosure Review Board - 6

2020 Census - 6

Special Sworn Status - 6

Cornell Institute for Social and Economic Research - 5

Federal Statistical Research Data Center - 5

Current Population Survey - 5

Decennial Census - 5

Quarterly Workforce Indicators - 5

Social Security Administration - 5

Census of Manufactures - 5

Economic Census - 5

Chicago Census Research Data Center - 5

LEHD Program - 5

Standard Industrial Classification - 5

Annual Survey of Manufactures - 5

Disclosure Review Board - 4

Statistics Canada - 4

Quarterly Census of Employment and Wages - 4

Master Address File - 4

Public Use Micro Sample - 4

National Bureau of Economic Research - 4

Total Factor Productivity - 4

National Center for Science and Engineering Statistics - 3

Office of Management and Budget - 3

National Academy of Sciences - 3

Agency for Healthcare Research and Quality - 3

National Institute on Aging - 3

International Trade Research Report - 3

National Center for Health Statistics - 3

Department of Health and Human Services - 3

Census Bureau Business Register - 3

Standard Statistical Establishment List - 3

Employer Identification Number - 3

Social Security - 3

Viewing papers 1 through 10 of 37


  • Working Paper

    Grassroots Design Meets Grassroots Innovation: Rural Design Orientation and Firm Performance

    March 2024

    Working Paper Number: CES-24-17

    The study of grassroots design (applying structured, creative processes to the usability or aesthetics of a product without input from professional design consultancies) remains underinvestigated. If design comprises a mediation between people and technology whereby technologies are made more accessible or more likely to delight, then the process by which new grassroots inventions are transformed into innovations valued in markets cannot be fully understood. This paper uses U.S. data on the design orientation of respondents in the 2014 Rural Establishment Innovation Survey linked to longitudinal data on the same firms to examine the association between design, innovation, and employment and payroll growth. Findings from the research will inform questions to be investigated in the recently collected 2022 Annual Business Survey (ABS) that for the first time contains a Design module.
  • Working Paper

    The 2010 Census Confidentiality Protections Failed, Here's How and Why

    December 2023

    Working Paper Number: CES-23-63

    Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act.
  • Working Paper

    An In-Depth Examination of Requirements for Disclosure Risk Assessment

    October 2023

    Working Paper Number: CES-23-49

    The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should be used to compare methodologies to identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near-term, the counterfactual approach appears best-suited for privacy-utility analysis.
  • Working Paper

    Mixed-Effects Methods For Search and Matching Research

    September 2023

    Working Paper Number: CES-23-43

    We study mixed-effects methods for estimating equations containing person and firm effects. In economics such models are usually estimated using fixed-effects methods. Recent enhancements to those fixed-effects methods include corrections to the bias in estimating the covariance matrix of the person and firm effects, which we also consider.
  • Working Paper

    Building the Census Bureau Index of Economic Activity (IDEA)

    March 2023

    Working Paper Number: CES-23-15

    The Census Bureau Index of Economic Activity (IDEA) is constructed from 15 of the Census Bureau's primary monthly economic time series. The index is intended to provide a single time series reflecting, to the extent possible, the variation over time in the whole set of component series. The component series provide monthly measures of activity in retail and wholesale trade, manufacturing, construction, international trade, and business formations. Most of the input series are Principal Federal Economic Indicators. The index is constructed by applying the method of principal components analysis (PCA) to the time series of monthly growth rates of the seasonally adjusted component series, after standardizing the growth rates to series with mean zero and variance 1. Similar PCA approaches have been used for the construction of other economic indices, including the Chicago Fed National Activity Index issued by the Federal Reserve Bank of Chicago, and the Weekly Economic Index issued by the Federal Reserve Bank of New York. While the IDEA is constructed from time series of monthly data, it is calculated and published every business day, and so is updated whenever a new monthly value is released for any of its component series. Since release dates of data values for a given month vary across the component series, with slight variations in the monthly release date for any one component series, updates to the index are frequent. It is unavoidably the case that, at almost all updates, some of the component series lack observations for the current (most recent) data month. To address this situation, component series that are one month behind are predicted (nowcast) for the current index month, using a multivariate autoregressive time series model. This report discusses the input series to the index, the construction of the index by PCA, and the nowcasting procedure used. The report then examines some properties of the index and its relation to quarterly U.S. Gross Domestic Product and to some monthly non-Census Bureau economic indicators.
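
The PCA step described in this abstract (standardize monthly growth rates, then take the first principal component) can be illustrated with a minimal sketch. This is not the Census Bureau's implementation: the function name `first_pc_index`, the use of log differences as the growth rate, and the sign convention are all assumptions made for illustration, and the nowcasting step is not covered.

```python
import numpy as np

def first_pc_index(levels):
    """Hypothetical helper: first principal component of standardized growth rates.

    levels: array of shape (T, K) holding K positive, seasonally adjusted
    monthly series. Returns (index, loadings, explained_share).
    """
    levels = np.asarray(levels, dtype=float)
    # Assumption: growth rates are measured as monthly log differences.
    growth = np.diff(np.log(levels), axis=0)
    # Standardize each series to mean zero and variance one.
    z = (growth - growth.mean(axis=0)) / growth.std(axis=0)
    # SVD of the standardized data; rows of vt are the component loadings.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a principal component is arbitrary; orient it so that
    # the loadings sum to a positive number (an illustrative convention).
    if loadings.sum() < 0:
        loadings = -loadings
    index = z @ loadings
    explained_share = s[0] ** 2 / np.sum(s ** 2)
    return index, loadings, explained_share
```

The returned index has one fewer observation than the input, because the first month has no growth rate.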
  • Working Paper

    Registered Report: Exploratory Analysis of Ownership Diversity and Innovation in the Annual Business Survey

    March 2023

    Authors: Timothy R. Wojan

    Working Paper Number: CES-23-11

    A lack of transparency in specification testing is a major contributor to the replicability crisis that has eroded the credibility of findings for informing policy. How diversity is associated with outcomes of interest is particularly susceptible to the production of nonreplicable findings given the very large number of alternative measures applied to several policy relevant attributes such as race, ethnicity, gender, or foreign-born status. The very large number of alternative measures substantially increases the probability of false discovery where nominally significant parameter estimates (selected through numerous though unreported specification tests) may not be representative of true associations in the population. The purpose of this registered report is to: 1) select a single measure of ownership diversity that satisfies explicit, requisite axioms; 2) split the Annual Business Survey (ABS) into an exploratory sample (35%) used in this analysis and a confirmatory sample (65%) that will be accessed only after the publication of this report; 3) regress self-reported new-to-market innovation on the diversity measure along with industry and firm-size controls; 4) pass through those variables meeting precision and magnitude criteria for hypothesis testing using the confirmatory sample; and 5) document the full set of hypotheses to be tested in the final analysis along with a discussion of the false discovery and family-wise error rate corrections to be applied. The discussion concludes with the added value of implementing split sample designs within the Federal Statistical Research Data Center system where access to data is strictly controlled.
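
The exploratory/confirmatory split in step 2 of this abstract can be sketched as a seeded random partition of unit indices. The function name `split_sample`, the seed, and the use of simple random assignment are assumptions for illustration; the abstract does not say how the actual split was drawn (for example, whether it was stratified).

```python
import numpy as np

def split_sample(n_units, exploratory_share=0.35, seed=0):
    """Hypothetical helper: partition unit indices 0..n_units-1 into an
    exploratory set and a confirmatory set by simple random assignment."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_units)
    n_explore = int(round(n_units * exploratory_share))
    exploratory = np.sort(order[:n_explore])
    confirmatory = np.sort(order[n_explore:])
    return exploratory, confirmatory
```

Fixing the seed makes the partition reproducible, so the confirmatory units can be held back untouched until the exploratory analysis is published.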
  • Working Paper

    Some Open Questions on Multiple-Source Extensions of Adaptive-Survey Design Concepts and Methods

    February 2023

    Working Paper Number: CES-23-03

    Adaptive survey design is a framework for making data-driven decisions about survey data collection operations. This paper discusses open questions related to the extension of adaptive principles and capabilities when capturing data from multiple data sources. Here, the concept of 'design' encompasses the focused allocation of resources required for the production of high-quality statistical information in a sustainable and cost-effective way. This conceptual framework leads to a discussion of six groups of issues including: (i) the goals for improvement through adaptation; (ii) the design features that are available for adaptation; (iii) the auxiliary data that may be available for informing adaptation; (iv) the decision rules that could guide adaptation; (v) the necessary systems to operationalize adaptation; and (vi) the quality, cost, and risk profiles of the proposed adaptations (and how to evaluate them). A multiple data source environment creates significant opportunities, but also introduces complexities that are a challenge in the production of high-quality statistical information.
  • Working Paper

    Total Error and Variability Measures for the Quarterly Workforce Indicators and LEHD Origin Destination Employment Statistics in OnTheMap

    September 2020

    Working Paper Number: CES-20-30

    We report results from the first comprehensive total quality evaluation of five major indicators in the U.S. Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) Program Quarterly Workforce Indicators (QWI): total flow-employment, beginning-of-quarter employment, full-quarter employment, average monthly earnings of full-quarter employees, and total quarterly payroll. Beginning-of-quarter employment is also the main tabulation variable in the LEHD Origin-Destination Employment Statistics (LODES) workplace reports as displayed in OnTheMap (OTM), including OnTheMap for Emergency Management. We account for errors due to coverage; record-level nonresponse; edit and imputation of item missing data; and statistical disclosure limitation. The analysis reveals that the five publication variables under study are estimated very accurately for tabulations involving at least 10 jobs. Tabulations involving three to nine jobs are a transition zone, where cells may be fit for use with caution. Tabulations involving one or two jobs, which are generally suppressed on fitness-for-use criteria in the QWI and synthesized in LODES, have substantial total variability but can still be used to estimate statistics for untabulated aggregates as long as the job count in the aggregate is more than 10.
  • Working Paper

    Total Error and Variability Measures with Integrated Disclosure Limitation for Quarterly Workforce Indicators and LEHD Origin Destination Employment Statistics in On The Map

    January 2017

    Working Paper Number: CES-17-71

    We report results from the first comprehensive total quality evaluation of five major indicators in the U.S. Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) Program Quarterly Workforce Indicators (QWI): total employment, beginning-of-quarter employment, full-quarter employment, total payroll, and average monthly earnings of full-quarter employees. Beginning-of-quarter employment is also the main tabulation variable in the LEHD Origin-Destination Employment Statistics (LODES) workplace reports as displayed in OnTheMap (OTM). The evaluation is conducted by generating multiple threads of the edit and imputation models used in the LEHD Infrastructure File System. These threads conform to the Rubin (1987) multiple imputation model, with each thread or implicate being the output of formal probability models that address coverage, edit, and imputation errors. Design-based sampling variability and finite population corrections are also included in the evaluation. We derive special formulas for the Rubin total variability and its components that are consistent with the disclosure avoidance system used for QWI and LODES/OTM workplace reports. These formulas allow us to publish the complete set of detailed total quality measures for QWI and LODES. The analysis reveals that the five publication variables under study are estimated very accurately for tabulations involving at least 10 jobs. Tabulations involving three to nine jobs have quality in the range generally deemed acceptable. Tabulations involving zero, one or two jobs, which are generally suppressed in the QWI and synthesized in LODES, have substantial total variability but their publication in LODES allows the formation of larger custom aggregations, which will in general have the accuracy estimated for tabulations in the QWI based on a similar number of workers.
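
For orientation, the textbook Rubin (1987) combining rules that this abstract builds on can be sketched as below. This is the standard form only, not the paper's special formulas adapted to the disclosure avoidance system, which the abstract does not reproduce. The function name `rubin_combine` is an assumption; each implicate is assumed to supply one point estimate and one within-implicate variance.

```python
import numpy as np

def rubin_combine(estimates, within_variances):
    """Hypothetical helper: standard Rubin combining rules for M implicates.

    Returns (q_bar, w, b, t):
      q_bar: mean of the implicate estimates
      w:     average within-implicate variance
      b:     between-implicate variance (divisor M - 1)
      t:     total variance, w + (1 + 1/M) * b
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(within_variances, dtype=float)
    m = q.size
    if m < 2 or u.size != m:
        raise ValueError("need at least two implicates with matching variances")
    q_bar = q.mean()
    w = u.mean()
    b = q.var(ddof=1)
    t = w + (1.0 + 1.0 / m) * b
    return q_bar, w, b, t
```

With three implicates estimating 10, 12, and 14 and within-implicate variance 1 each, the between variance is 4 and the total variance is 1 + (4/3)(4) = 19/3.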
  • Working Paper

    Considering the Use of Stock and Flow Outcomes in Empirical Analyses: An Examination of Marriage Data

    January 2017

    Working Paper Number: CES-17-64

    This paper fills an important void assessing how the use of stock outcomes as compared to flow outcomes may yield disparate results in empirical analyses, despite often being used interchangeably. We compare analyses using a stock outcome, marital status, to those using a flow outcome, entry into marriage, from the same dataset, the American Community Survey. This paper considers two different questions and econometric approaches using these alternative measures: the effect of the Affordable Care Act young adult provision on marriage using a difference-in-differences approach and the relationship between aggregate unemployment rates and marriage rates using a simpler ordinary least squares regression approach. Results from both analyses show stock and flow data yield divergent results in terms of sign and significance. Additional analyses suggest prior-period temporary shocks and migration may contribute to this discrepancy. These results suggest using caution when conducting analyses using stock data as they may produce false negative results or spurious false positive results, which could in turn give rise to misleading policy implications.