
Tutorial 1: Data Linkage – Introduction, Recent Advances, and Privacy Issues

Presented by: Peter Christen
Tuesday 5th July 2016 - 09:00 to 10:30
INI Seminar Room 1
Tutorial outline:
The tutorial consists of four parts:
(1)  Data linkage introduction, short history of data linkage, example applications, and the data linkage process (overview of the main steps).
(2)  Detailed discussion of all steps of the data linkage process (data cleaning and standardisation, indexing/blocking, field and record comparisons, classification, and evaluation), and core techniques used in these steps.
(3)  Advanced data linkage techniques, including collective, group and graph linking techniques, as well as advanced indexing techniques that enable large-scale data linkage.
(4)  Major concepts, protocols and challenges in privacy-preserving data linkage, which aims to link databases across organisations without revealing any private or confidential information.  
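
As a rough illustration of the process steps listed in part (2), the following is a minimal sketch, not material from the tutorial itself. The toy records, the blocking key (first letter of the surname plus postcode), the use of Python's difflib for approximate string comparison, and the 0.8 threshold are all assumptions chosen for brevity. Real linkage systems use more careful standardisation, comparison functions, and classifiers.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records: (record id, given name, surname, postcode).
RECORDS = [
    (1, "peter", "christen", "2600"),
    (2, "pete", "christen", "2600"),
    (3, "petra", "christensen", "2601"),
    (4, "mary", "smith", "2600"),
]


def block_key(record):
    """Blocking key (an assumption): first letter of surname plus postcode."""
    _, _, surname, postcode = record
    return surname[:1] + postcode


def similarity(a, b):
    """Approximate string similarity in [0, 1], computed with difflib."""
    return SequenceMatcher(None, a, b).ratio()


def link(records, threshold=0.8):
    """Return pairs of record ids classified as matches."""
    # Indexing/blocking: only records sharing a key are compared.
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec)].append(rec)

    matches = []
    for block in blocks.values():
        for r1, r2 in combinations(block, 2):
            # Field comparison: average similarity of given name and surname.
            score = (similarity(r1[1], r2[1]) + similarity(r1[2], r2[2])) / 2
            # Classification: a simple threshold on the combined score.
            if score >= threshold:
                matches.append((r1[0], r2[0]))
    return matches


print(link(RECORDS))
```

With these toy records only records 1 and 2 share a block, so they form the only candidate pair. Record 3 is never compared with them because its postcode differs, which shows how the choice of blocking key trades fewer comparisons against the risk of missed matches.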

Assumed knowledge: The aim is to make this tutorial as accessible as possible to a wide-ranging audience from various backgrounds. The content will focus on concepts and techniques rather than on details of algorithms. A basic understanding of databases, algorithms, and probability will be beneficial but is not required. The tutorial will be loosely based on the book “Data Matching – Concepts and Techniques for Record Linkage, Entity Resolution and Duplicate Detection” (Springer, 2012), written by the presenter.