By exploiting establishment-level data for U.S. manufacturing, this paper sheds new light on the source of the changes in the structure of production, wages, and employment that have occurred over the last several decades. Based on recent theoretical work by Caselli (1999) and Kremer and Maskin (1996), we focus on empirically investigating the following two hypotheses. The first hypothesis is that skill-biased technical change works through the economy via changes in the dispersion of wages and productivity across establishments. The second is that the increased dispersion in wages and productivity across establishments is linked to differential rates of technological adoption across establishments. We find empirical support for these hypotheses. Our main findings are that (1) the between-plant component of wage dispersion is an important and growing part of total wage dispersion, (2) much of the between-plant increase in dispersion is within industries, (3) the between-plant measures of wage and productivity dispersion have increased substantially over the last few decades, (4) industries with large changes in between-plant wage dispersion also exhibit large changes in between-plant productivity dispersion, (5) a substantial fraction of the rising dispersion in wages and productivity is accounted for by increasing wage and productivity differentials across high and low computer investment per worker plants and high and low capital intensity plants, and (6) changes in dispersion accounted for by such observable characteristics yield predicted industry-level changes in wage and productivity dispersion that are highly correlated.
-
Applying Current Core Based Statistical Area Standards to Historical Census Data, 1940-2020
January 2025
Working Paper Number:
CES-25-10
In the middle of the twentieth century, the Bureau of the Budget, in conjunction with the Census Bureau and other federal statistical agencies, introduced a widely used unit of statistical geography, the county-based Standard Metropolitan Area. Metropolitan definitions since then have been generally regarded as comparable, but methodological changes have resulted in comparability issues, particularly among the largest and most complex metro areas. With the 2000 census came an effort to simplify the rules for defining metro areas. This study attempts to gather all available historical geographic and commuting data to apply the current rules for defining metro areas to create comparable statistical geography covering the period from 1940 to 2020. The changes that accompanied the 2000 census also brought a new category, "Micropolitan Statistical Areas," which established a metro hierarchy. This research expands on this approach, using a more elaborate hierarchy based on the size of urban cores. The areas as delineated in this paper provide a consistent set of statistical geography that can be used in a wide variety of applications.
-
Gradient Boosting to Address Statistical Problems Arising from Non-Linkage of Census Bureau Datasets
June 2024
Working Paper Number:
CES-24-27
This article introduces the twangRDC package, which contains functions to address non-linkage in US Census Bureau datasets. The Census Bureau's Person Identification Validation System facilitates data linkage by assigning unique person identifiers to federal, third-party, decennial census, and survey data. Not all records in these datasets can be linked to the reference file, and as such not all records will be assigned an identifier. This article is a tutorial for using the twangRDC package to generate nonresponse weights to account for non-linkage of person records across US Census Bureau datasets.
-
An In-Depth Examination of Requirements for Disclosure Risk Assessment
October 2023
Authors:
Ron Jarmin,
John M. Abowd,
Ian M. Schmutte,
Jerome P. Reiter,
Nathan Goldschlag,
Victoria A. Velkoff,
Michael B. Hawes,
Robert Ashmead,
Ryan Cumings-Menon,
Sallie Ann Keller,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Pavel Zhuravlev
Working Paper Number:
CES-23-49
The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should be used to compare methodologies to identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near-term, the counterfactual approach appears best-suited for privacy-utility analysis.
-
An Anatomy of U.S. Establishments' Trade Linkages in Global Value Chains
June 2025
Working Paper Number:
CES-25-44
Global value chains (GVCs) are a pervasive feature of modern production, but they are hard to measure. Using confidential microdata from the U.S. Census Bureau, we develop novel measures of the linkages between U.S. manufacturing establishments' imports and exports. We find that for every dollar of exports, imported inputs represent 13 cents in 2002 and 20 cents by 2017. Examining GVC trade flows in a gravity framework, we find that these flows are higher within 'round-trip' linkages (where the input and output markets are the same), regional trade agreements, and multinational firm boundaries. The strong complementarities between input and output markets are muted by the proportionality assumptions embedded in global input-output tables. Finally, with an off-the-shelf model, we show the round-trip results can be obtained when firm-specific sourcing and exporting fixed costs are linked.
-
Multi-Product Firms and Trade Liberalization
August 2009
Working Paper Number:
CES-09-21
This paper develops a general equilibrium model of international trade that features selection across firms, products and countries. Firms' export decisions depend on a combination of firm 'productivity' and firm-product-country 'consumer tastes', both of which are stochastic and unknown prior to the payment of a sunk cost of entry. Higher-productivity firms export a wider range of products to a larger set of countries than lower-productivity firms. Trade liberalization induces endogenous reallocations of resources that foster productivity growth both within and across firms. Empirically, we find key implications of the model to be consistent with U.S. trade data.
-
A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census
August 2025
Authors:
Lars Vilhuber,
John M. Abowd,
Ethan Lewis,
Nathan Goldschlag,
Michael B. Hawes,
Robert Ashmead,
Daniel Kifer,
Philip Leclerc,
Rolando A. Rodríguez,
Tamara Adams,
David Darais,
Sourya Dey,
Simson L. Garfinkel,
Scott Moore,
Ramy N. Tadros
Working Paper Number:
CES-25-57
For the last half-century, it has been a common and accepted practice for statistical agencies, including the United States Census Bureau, to adopt different strategies to protect the confidentiality of aggregate tabular data products from those used to protect the individual records contained in publicly released microdata products. This strategy was premised on the assumption that the aggregation used to generate tabular data products made the resulting statistics inherently less disclosive than the microdata from which they were tabulated. Consistent with this common assumption, the 2010 Census of Population and Housing in the U.S. used different disclosure limitation rules for its tabular and microdata publications. This paper demonstrates that, in the context of disclosure limitation for the 2010 Census, the assumption that tabular data are inherently less disclosive than their underlying microdata is fundamentally flawed. The 2010 Census published more than 150 billion aggregate statistics in 180 table sets. Most of these tables were published at the most detailed geographic level: individual census blocks, which can have populations as small as one person. Using only 34 of the published table sets, we reconstructed microdata records including five variables (census block, sex, age, race, and ethnicity) from the confidential 2010 Census person records. Using only published data, an attacker using our methods can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We further confirm, through reidentification studies, that an attacker can, within census blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with race and ethnicity different from the modal person on the census block) with 95% accuracy.
Having shown the vulnerabilities inherent to the disclosure limitation methods used for the 2010 Census, we proceed to demonstrate that the more robust disclosure limitation framework used for the 2020 Census publications defends against attacks that are based on reconstruction. Finally, we show that available alternatives to the 2020 Census Disclosure Avoidance System would either fail to protect confidentiality, or would overly degrade the statistics' utility for the primary statutory use case: redrawing the boundaries of all of the nation's legislative and voting districts in compliance with the 1965 Voting Rights Act.
-
Location, Location, Location: The 3L Approach to House Price Determination
May 2004
Working Paper Number:
CES-04-06
The immobility of houses means that their location affects their values. This explains the common belief that three things determine the price of a house: location, location, and location. We use this notion to develop the 3L Approach to house price determination. That is, prices are determined by the Metropolitan Statistical Area (MSA), town, and street where the house is located. This study creates a unique data set based on data from the American Housing Survey (AHS) consisting of small 'clusters' of housing units with information on their housing characteristics and resident characteristics that is merged with census tract-level attributes. We use this data to verify the 3L Approach: we find that all three levels of location are significant when estimating the house price hedonic equation. This indicates that individuals care about their local neighborhood, i.e. the general upkeep of their street and possibly their neighbors' characteristics (cluster variables), a broader area such as the school district and/or the town (tract variables) that account for school quality and crime rates, and the particular amenities found in their MSA.
-
Finding Suburbia in the Census
June 2025
Working Paper Number:
CES-25-40
This study introduces a methodology that goes beyond the urban/rural dichotomy to classify areas into detailed settlement types: urban cores, suburbs, exurbs, outlying towns, and rural areas. Utilizing a database that provides housing unit estimates for census tracts as defined in 2010 for all decennial census years from 1940 to 2020, this research enables a longitudinal analysis of urban spatial expansion. By maintaining consistent geography across time, the methodology described in this paper emphasizes the era of development, as well as proximity to large urban centers. This broadly applicable methodology provides a framework for comparing the evolution of urban landscapes over a significant historical period, revealing trends in the transformation of territory from rural to urban, as well as associated suburbanization and exurban growth.
-
Exports, Borders, Distance, and Plant Size
June 2010
Working Paper Number:
CES-10-13
The fact that large manufacturing plants export relatively more than small plants has been at the foundation of much work in the international trade literature. We examine this fact using Census micro data on plant shipments from the Commodity Flow Survey. We show the fact is not entirely an international trade phenomenon; part of it can be accounted for by the effect of distance, distinct from any border effect. Export destinations tend to be further than domestic destinations, and large plants tend to ship further distances even to domestic locations, as compared with small plants. We develop an extension of the Melitz (2003) model and use it to set up an analysis with model interpretations of ratios between large plant and small plant shipments that can be calculated with the data. We obtain a decomposition of the overall ratio into a term that varies with distance, holding fixed the border, and a term that varies with the border, holding fixed the distance. The distance term accounts for more than half of the overall difference.
-
Networking Off Madison Avenue
October 2005
Working Paper Number:
CES-05-15
This paper examines the effect on productivity of having more near advertising agency neighbors and hence better opportunities for meetings and exchange within Manhattan. We will show that there is extremely rapid spatial decay in the benefits of having more near neighbors even in the close quarters of southern Manhattan, a finding that is new to the empirical literature and indicates our understanding of scale externalities is still very limited. The finding indicates that having a high density of commercial establishments is important in enhancing local productivity, an issue in Lucas and Rossi-Hansberg (2002), where within-business-district spatial decay of spillovers plays a key role. We will also argue that in Manhattan advertising agencies trade off the higher rent costs of being in bigger clusters nearer 'centers of action' against the lower rent costs of operating on the 'fringes' away from high concentrations of other agencies. Introducing the idea of trade-offs immediately suggests heterogeneity is involved. We will show that higher quality agencies are the ones willing to pay more rent to locate in larger clusters, specifically because they benefit more from networking. While all this is an exploration of neighborhood and networking externalities, the findings relate to the economic anatomy of large metro areas like New York: the nature of their buzz.