
Timetable (DLAW04)

Privacy: recent developments at the interface between economics and computer science

Friday 28th October 2016

09:00 to 09:50 Registration
09:50 to 10:00 Welcome from Christie Marr (INI Deputy Director) INI 1
10:00 to 11:00 Laura Brandimarte (University of Arizona)
How does Government surveillance affect perceived online privacy/security and online information disclosure?
Disclosure behaviors in the digital world are affected by perceived privacy and security just as much as, or arguably more than, they are by the actual privacy/security features of the digital environment. Several Governments have recently been at the center of attention for secret surveillance programs that have affected the sense of privacy and security people experience online. In this talk, I will discuss evidence from two research projects showing how privacy concerns and disclosure behaviors are affected by perceived privacy/security intrusions associated with Government monitoring and surveillance. These two interdisciplinary projects bring together methodologies from different disciplines: information systems, machine learning, psychology, and economics.

The first project is in collaboration with the Census Bureau, and studies geo-location and its effects on willingness to disclose personal information. The U.S. Census Bureau has begun a transition from a paper-based questionnaire to an Internet-based one. Online data collection would not only allow for a more efficient gathering of information; it would also, through geo-location technologies, allow for the automated inference of the location from which the citizen is responding. Geo-location features in Census forms, however, may raise privacy concerns and even backfire, as they allow for the collection of a sensitive piece of information without explicit consent of the individual. Four online experiments investigate individuals’ reactions to geo-location by measuring willingness to disclose personal information as a function of geo-location awareness and the entity requesting information: research or Governmental institutions. The experiments also explicitly test how surveillance primes affect the relationship between geo-location awareness and disclosure. Consistent with theories of perceived risk, contextual integrity, and fairness in social exchanges, we find that awareness of geo-location increases privacy concerns and perceived sensitivity of requested information, thus decreasing willingness to disclose sensitive information, especially when participants did not have a prior expectation that the institution would collect that data. No significant interaction effects are found for a surveillance prime.

The second project is ongoing research about the “chilling effects” of Government surveillance on social media disclosures, or the tendency to self-censor in order to cope with mass monitoring systems raising privacy concerns. Until now, such effects have only been estimated using either Google/Bing search terms, Wikipedia articles, or survey data. In this research in progress, we propose a new method in order to test for chilling effects in online social media platforms. We use a unique, large dataset of Tweets and propose the use of new statistical machine learning techniques in order to detect anomalous trends in user behavior (use of predetermined, sensitive sets of keywords) after Snowden’s revelations made users aware of existing surveillance programs.
11:00 to 11:30 Morning Coffee
11:30 to 12:30 Ian Schmutte (University of Georgia)
Revisiting the Economics of Privacy: Population Statistics and Confidentiality Protection as Public Goods
Co-author: John M. Abowd (Cornell University and U.S. Census Bureau)

We consider the problem of the public release of statistical information about a population, explicitly accounting for the public-good properties of both data accuracy and privacy loss. We first consider the implications of adding the public-good component to recently published models of private data publication under differential privacy guarantees using a Vickrey-Clarke-Groves mechanism and a Lindahl mechanism. We show that data quality will be inefficiently under-supplied. Next, we develop a standard social planner’s problem using the technology set implied by (ε,δ)-differential privacy with (α,β)-accuracy for the Private Multiplicative Weights query release mechanism to study the properties of optimal provision of data accuracy and privacy loss when both are public goods. Using the production possibilities frontier implied by this technology, explicitly parameterized interdependent preferences, and the social welfare function, we display properties of the solution to the social planner’s problem. Our results directly quantify the optimal choice of data accuracy and privacy loss as functions of the technology and preference parameters. Some of these properties can be quantified using population statistics on marginal preferences and correlations between income, data accuracy preferences, and privacy loss preferences that are available from survey data. Our results show that government data custodians should publish more accurate statistics with weaker privacy guarantees than would occur with purely private data publishing. Our statistical results using the General Social Survey and the Cornell National Social Survey indicate that the welfare losses from under-providing data accuracy while over-providing privacy protection can be substantial.
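
For readers unfamiliar with the notation, the standard textbook definitions behind (ε,δ)-differential privacy and (α,β)-accuracy are sketched below. They are given as background only and are not taken from the talk: the symbols M (a randomized mechanism), D and D' (datasets differing in one record), S (a set of outputs), and Q (a class of queries) are assumptions introduced here, and the authors' exact formulation may differ.

```latex
% (epsilon, delta)-differential privacy: for all neighboring datasets D, D'
% and every measurable set of outputs S,
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\, \Pr[M(D') \in S] + \delta .

% (alpha, beta)-accuracy for a query class Q: with probability at least
% 1 - beta, every released answer is within alpha of the true answer,
\Pr\Big[\, \max_{q \in Q} \big| M(D)_q - q(D) \big| \le \alpha \,\Big] \;\ge\; 1 - \beta .
```

Smaller ε and δ mean a stronger privacy guarantee, and smaller α and β mean more accurate output. The trade-off between the two is the "technology set" the abstract refers to.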

12:30 to 13:30 Buffet Lunch @ INI
13:30 to 14:30 Alessandro Acquisti (Carnegie Mellon University)
The Economics of Privacy (remote presentation)
In the policy and scholarly debate over privacy, the protection of personal information is often set against the benefits society is expected to gain from large-scale analytics applied to individual data. An implicit assumption underlies the contrast between privacy and 'big data': economic research is assumed to univocally predict that the increasing collection and analysis of personal data will be an economic win-win for data holders and data subjects alike, some sort of unalloyed public good. Using a recently published review of the economic literature on privacy, I will work from within traditional economic frameworks to investigate this notion. In so doing, I will highlight how results from economic research on data sharing and data protection actually paint a nuanced picture of the economic benefits and costs of privacy.
14:30 to 15:30 Katrina Ligett (Hebrew University of Jerusalem; California Institute of Technology)
Buying Private Data without Verification
Joint work with Arpita Ghosh, Aaron Roth, and Grant Schoenebeck

We consider the problem of designing a survey to aggregate non-verifiable information from a privacy-sensitive population: an analyst wants to compute some aggregate statistic from the private bits held by each member of a population, but cannot verify the correctness of the bits reported by participants in his survey. Individuals in the population are strategic agents with a cost for privacy, i.e., they not only account for the payments they expect to receive from the mechanism, but also their privacy costs from any information revealed about them by the mechanism’s outcome (the computed statistic as well as the payments) to determine their utilities. How can the analyst design payments to obtain an accurate estimate of the population statistic when individuals strategically decide both whether to participate and whether to truthfully report their sensitive information?

In this talk, we will discuss an approach to this problem based on ideas from peer prediction and differential privacy.
15:30 to 16:00 Afternoon Tea
16:00 to 17:00 Mallesh Pai (Rice University; University of Pennsylvania)
The Strange Case of Privacy in Equilibrium Models
Joint work with Rachel Cummings, Katrina Ligett and Aaron Roth

The literature on differential privacy by and large takes the data set being analyzed as exogenously given. As a result, by varying a privacy parameter in his algorithm, the analyst straightforwardly chooses the potential privacy loss of any single entry in the data set. Motivated by privacy concerns on the internet, we consider a stylized setting where the dataset is endogenously generated, depending on the privacy parameter chosen by the analyst. In our model, an agent chooses whether to purchase a product. This purchase decision is recorded, and a differentially private version of his purchase decision may be used by an advertiser to target the consumer. A change in the privacy parameter therefore affects, in equilibrium, the agents' purchase decision, the price of the product, and the targeting rule used by the advertiser. We demonstrate that the comparative statics with respect to the privacy parameter may be exactly reversed relative to the exogenous data set benchmark: for example, a higher privacy parameter may nevertheless be more informative. More care is needed in understanding the effects of private analysis of a data set that is endogenously generated.
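
As a rough illustration of what a "differentially private version" of a single binary purchase decision can look like, here is a minimal Python sketch using randomized response. This is an assumption made for illustration: the abstract does not say which mechanism the authors use, and the names `randomized_response` and `estimate_purchase_rate` are hypothetical. The sketch covers only the exogenous benchmark, where the purchase decisions are fixed before the noise is applied.

```python
import math
import random


def truth_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under epsilon-DP randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bit: bool, epsilon: float, rng: random.Random) -> bool:
    """Return the true bit with probability e^eps / (1 + e^eps), else its negation."""
    return bit if rng.random() < truth_probability(epsilon) else not bit


def estimate_purchase_rate(reports: list, epsilon: float) -> float:
    """Debias the mean of the noisy reports (requires epsilon > 0).

    If q is the true purchase rate and p the truth probability, the expected
    noisy mean is (1 - p) + q * (2p - 1); this inverts that relation.
    """
    p = truth_probability(epsilon)
    noisy_mean = sum(reports) / len(reports)
    return (noisy_mean - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical exogenous data: 30% of 50,000 agents purchased.
    purchases = [i < 15_000 for i in range(50_000)]
    noisy = [randomized_response(b, epsilon=1.0, rng=rng) for b in purchases]
    print(round(estimate_purchase_rate(noisy, epsilon=1.0), 3))
```

In this sketch a smaller ε means noisier reports and stronger privacy for each recorded decision. With fixed data, lowering ε can only make the reports less informative; the talk's point is that this monotonicity can fail once purchase decisions themselves respond to ε.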
17:00 to 18:00 Wine Reception at INI