Presented by:
Grigorios Loukides
Date:
Thursday 29th September 2016 - 15:30 to 16:30
Venue:
INI Seminar Room 2
Abstract:
Organizations disseminate sequential data to support
applications in domains ranging from marketing to healthcare. Such data are
typically modeled as a collection of sequences, or a series of time-stamped
events, and they are mined by data recipients aiming to discover actionable
knowledge. However, the mining of sequential data may expose sensitive patterns
that leak confidential knowledge, or lead to intrusive inferences about groups
of individuals.
In this talk, I will review this problem and present two
approaches that address it while retaining the usefulness of the data in mining
tasks.
The first approach is applicable to a collection of
sequences and sanitizes sensitive patterns by permuting their events. The
selected permutations avoid changes in the set of frequent non-sensitive
patterns and in the ordering information of the sequences. The second approach
is applicable to a series of time-stamped events and sanitizes sensitive events
by deleting them from carefully selected time points. The deletion of events is
guided by a model that captures changes to the probability distribution of
events across the sequence.
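To make the first approach more concrete, here is a minimal sketch of permutation-based sanitization on a toy scale. It is not the speaker's algorithm: the function names (`is_subsequence`, `support`, `sanitize_by_permutation`), the brute-force search, and the cost measure (number of positions that change) are all assumptions chosen for illustration. The approach described in the abstract additionally preserves the set of frequent non-sensitive patterns, which this sketch does not attempt.

```python
from itertools import permutations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `event in it` advances the iterator, so order is enforced.
    return all(event in it for event in pattern)


def support(pattern, sequences):
    """Count how many sequences contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def sanitize_by_permutation(sequence, sensitive):
    """Reorder `sequence` so it no longer contains `sensitive`.

    Hypothetical stand-in for the talk's selection criterion: among all
    permutations that hide the pattern, pick one that changes the fewest
    positions. Brute force, so only suitable for very short sequences.
    Returns the sequence unchanged if no reordering can hide the pattern.
    """
    if not is_subsequence(sensitive, sequence):
        return list(sequence)
    best = None
    for perm in permutations(sequence):
        if is_subsequence(sensitive, perm):
            continue  # the sensitive pattern is still exposed
        cost = sum(a != b for a, b in zip(perm, sequence))
        if best is None or cost < best[0]:
            best = (cost, list(perm))
    return best[1] if best else list(sequence)
```

For example, `sanitize_by_permutation(["a", "b", "c"], ["a", "c"])` reorders the events so that "a" no longer precedes "c", while keeping every event in the sequence. No events are deleted, which is the property that distinguishes this idea from the deletion-based second approach.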