Papers written by Author(s): 'John Cuffe'
The following papers contain search terms that you selected. From the papers listed below, you can navigate to the PDF, the profile page for that working paper, or see all the working papers written by an author. You can also explore tags, keywords, and authors that occur frequently within these papers.
See Working Papers by Tag(s), Keywords(s), Author(s), or Search Text
Click here to search again
Frequently Occurring Concepts within this Search
The model is able to label words and phrases by part-of-speech, including "organizations." By filtering for frequent words and phrases labeled as "organizations", papers are identified to contain references to specific institutions, datasets, and other organizations.
No tags occur more than twice in this search.
No authors occur more than twice in this search.
Viewing papers 1 through 1 of 1
-
Working PaperSqueezing More Out of Your Data: Business Record Linkage with Python
November 2018
Working Paper Number:
CES-18-46
Integrating data from different sources has become a fundamental component of modern data analytics. Record linkage methods represent an important class of tools for accomplishing such integration. In the absence of common disambiguated identifiers, researchers often must resort to ''fuzzy" matching, which allows imprecision in the characteristics used to identify common entities across dfferent datasets. While the record linkage literature has identified numerous individually useful fuzzy matching techniques, there is little consensus on a way to integrate those techniques within a single framework. To this end, we introduce the Multiple Algorithm Matching for Better Analytics (MAMBA), an easy-to-use, flexible, scalable, and transparent software platform for business record linkage applications using Census microdata. MAMBA leverages multiple string comparators to assess the similarity of records using a machine learning algorithm to disambiguate matches. This software represents a transparent tool for researchers seeking to link external business data to the Census Business Register files.View Full Paper PDF