Space Embedding of Records for Privacy Preserving Linkage

Presented by: 
Vassilios Verykios
Wednesday 14th September 2016 - 10:00 to 10:30
INI Seminar Room 1
Massive amounts of data, collected by a wide variety of organizations, need to be integrated and matched in order to facilitate data analyses that may be highly beneficial to businesses, governments, and academia. Record linkage, also known as entity resolution, is the process of identifying records that refer to the same real-world entity across disparate data sets. Privacy Preserving Record Linkage (PPRL) techniques are employed to perform the linkage process in a secure manner when the data that need to be matched are sensitive. In PPRL, input records undergo an anonymization process that embeds the records into a space where the underlying data can be matched but not understood by the naked eye.
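
As a rough illustration of embedding a record into a space where it can be matched but not read, the sketch below encodes a string field as a keyed Bloom filter over its character bigrams and compares two encodings with the Dice coefficient. The abstract does not say which embedding the talk uses; Bloom filter encoding is assumed here only as one commonly cited option, and the names `bigrams`, `embed`, `dice`, and the parameters `secret`, `length`, and `num_hashes` are hypothetical.

```python
import hashlib
import hmac


def bigrams(value):
    """Return the set of padded character bigrams of a field value."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def embed(value, secret, length=256, num_hashes=2):
    """Map a field value to a bit vector (assumed Bloom filter encoding).

    Each bigram sets up to `num_hashes` positions chosen by a keyed hash,
    so parties sharing `secret` produce comparable vectors.
    """
    bits = [0] * length
    for gram in bigrams(value):
        for k in range(num_hashes):
            message = f"{k}:{gram}".encode()
            digest = hmac.new(secret, message, hashlib.sha256).digest()
            bits[int.from_bytes(digest[:8], "big") % length] = 1
    return bits


def dice(a, b):
    """Dice coefficient of two bit vectors of equal length."""
    common = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2 * common / total if total else 1.0


key = b"shared-secret"  # illustrative value only
smith = embed("smith", key)
smyth = embed("smyth", key)
similarity = dice(smith, smyth)
```

Values that share bigrams set the same positions, so a typo such as "smyth" for "smith" still yields a nonzero similarity, while the bit vector itself does not reveal the original string at a glance.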

The PPRL problem has attracted growing attention lately, owing to a ubiquitous need to cross-match records that usually lack common unique identifiers and whose field values contain variations, errors, misspellings, and typos. The PPRL process, as applied to massive amounts of data, comprises an anonymization phase, a searching phase, and a matching phase.

Several searching and anonymization approaches have been developed with the aim of scaling the PPRL process to big data without sacrificing the quality of the results. Recently, redundant randomized methods have been proposed, which insert each record into multiple independent blocks in order to amplify the probability of bringing similar records together for comparison. The key feature of these methods is the formal guarantees they provide on the accuracy of the generated results.
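
The redundant blocking idea can be sketched as follows, assuming that records have already been anonymized into equal-length bit strings and that each blocking table builds its key from randomly sampled bit positions (a Hamming-style locality-sensitive hashing scheme). This is an assumption chosen for illustration, not necessarily the method covered in the talk; `build_blocks`, `candidate_pairs`, `collision_probability`, and their parameters are hypothetical names.

```python
import random
from collections import defaultdict


def build_blocks(records, num_tables, bits_per_key, seed=0):
    """Insert every record into one block per table.

    records: dict mapping a record id to a bit string of fixed length.
    Each table samples `bits_per_key` positions (with replacement) and
    uses the bits at those positions as the blocking key.
    """
    rng = random.Random(seed)
    length = len(next(iter(records.values())))
    tables = [[rng.randrange(length) for _ in range(bits_per_key)]
              for _ in range(num_tables)]
    blocks = [defaultdict(set) for _ in range(num_tables)]
    for rec_id, bits in records.items():
        for positions, table in zip(tables, blocks):
            key = "".join(bits[p] for p in positions)
            table[key].add(rec_id)
    return blocks


def candidate_pairs(blocks):
    """Collect every pair of records that shares a block in any table."""
    pairs = set()
    for table in blocks:
        for ids in table.values():
            ordered = sorted(ids)
            for i, first in enumerate(ordered):
                for second in ordered[i + 1:]:
                    pairs.add((first, second))
    return pairs


def collision_probability(hamming_distance, length, bits_per_key, num_tables):
    """Probability that two records share a block in at least one table."""
    per_table = (1 - hamming_distance / length) ** bits_per_key
    return 1 - (1 - per_table) ** num_tables


records = {
    "a": "1010101010101010",
    "b": "1010101010101010",  # identical to "a"
    "c": "0101010101010101",  # differs from "a" in every bit
}
blocks = build_blocks(records, num_tables=3, bits_per_key=4)
pairs = candidate_pairs(blocks)
```

Under these assumptions, two records at Hamming distance d out of m bits share a block in at least one of L tables with probability 1 - (1 - (1 - d/m)^k)^L, where k is the number of sampled positions per key. Adding tables raises this probability for similar records, which is the sense in which redundancy amplifies the chance of comparing them.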

In this talk, we present state-of-the-art private searching methods and anonymization techniques, exposing their characteristics, including their strengths and weaknesses, and we also present a comparative evaluation.