Privacy in a Mobile-Social World

CompSci 590.03, Fall 2013

Instructor: Ashwin Machanavajjhala
When: 1:25 PM - 2:40 PM Wednesdays and Fridays
Where
306, North Building
Office Hours: By appointment (send email to firstname@cs.duke.edu)


Announcements:

Overview: Terabytes of data about individuals are collected and analyzed daily through the use of mobile and social applications. Disseminating the data or results of analysis over this data to third parties (e.g., researchers or advertisers) has tremendous scientific and commercial value. However, such data dissemination can lead to potential identification of individuals, and disclosure of their sensitive information.

This project-oriented course focuses on the foundations of statistical privacy -- the problem of disclosing aggregate statistics about data collected from individuals while ensuring that individual level sensitive properties are not disclosed. The course will cover fundamentals of defining privacy and developing efficient algorithms for enforcing privacy, as well as the challenges in developing privacy preserving algorithms in real-world applications -- publishing social science data (like in the US Census), continuous user activity monitoring (like in search logs, location traces, energy monitoring), social networks, recommendation engines and targeted advertising.

Prerequisites: The course is open to interested graduate and undergraduate students with sufficient mathematical maturity. Basic knowledge in algorithms, proof techniques, and probability will be assumed. 

Administrivia:

  Syllabus:

#
Date
Topic
Readings
1
Wed, Aug 28
Introduction and Attacks on privacy (slides)
"Privacy", J. DeCew, The Stanford Encyclopedia of Philosophy
   (Fall 2012 Edition), Edward N. Zalta (ed.)
"A Face is exposed for AOL Searcher No. 4417749",
   New York Times, Aug 2006
"How Companies Learn your Secrets",
   New York Times, Feb 2012
2
Fri, Aug 30
De-anonymization (slides)
"Robust deanonymization of Sparse Datasets (Netflix Prize Data)",
   A. Narayanan, V. Shmatikov, IEEE S&P 2008
3
Wed, Sep 4
De-anonymization (contd.)
"Wherefore Art Thou R3579X?",
   L. Backstrom, C. Dwork, J. Kleinberg, WWW 2007


ANONYMOUS DATA PUBLISHING

4
Fri, Sep 6
k-Anonymity, L-Diversity, etc. (slides)
"Incognito: Efficient full domain k-anonymity",
   K. Lefevre, D. DeWitt, R. Ramakrishnan, SIGMOD  2005
"L-diversity: privacy beyond k-anonymity"
A. Machanavajjhala, J. Gehrke, D. Kifer, M. Venkitasubramaniam, ICDE 2006
Optional:
"T-closeness: privacy beyond k-anonymity and l-diversity"
N. Li, T. Li, S. Venkatasubramanian
5
Wed, Sep 11
Anonymity in Social Networks (slides)
"Resisting Structural Re-identification in Anonymized Social Networks"
M. Hay, G. Miklau, D. Jensen, D. Towsley, C. Li VLDBJ 2010
6
Fri, Sep 13
Simulatability (or "the adversary knows the privacy mechanism") (slides)
"Simulatable Auditing", K. Kenthapadi, N. Mishra, K. Nissim PODS 2005
"Minimality Attack in Privacy Preserving Data Publishing",
R. C. Wong, A. Fu, K. Wang, J. Pei, VLDB 2007


PROVABLE PRIVACY

7
Wed, Sep 18
Accuracy Limits on Private Query Answering (slides)
"Revealing information while preserving privacy"
I. Dinur, K. Nissin PODS 2003
Optional:
"Composition Attacks and Auxiliary Information in Data Privacy"
S. Ganta, S. Kasiviswanathan, A. Smith KDD 2008
8
Fri, Sep 20
Differential Privacy + Laplace Mechanisms (slides)
"Calibrating Noise to Sensitivity in Private Date Analysis"
C. Dwork, F. McSherry, K. Nissim, A.Smith TCC 2006
Optional: "Differential Privacy"
C. Dwork, ICALP 2006
9
Wed, Sep 25
Differential Privacy(contd.) (slides)


10
Fri, Sep 27
Differential Privacy: Exponential Mechanism (slides)
(Deadline for choosing project topics)


Wed, Oct 2
NO CLASS (work on project proposal)

11
Fri, Oct 4
Pufferfish Semantics and No Free Lunch Theorem (slides)

"Pufferfish: A Framework for Mathematical Privacy Definitions",
D. Kifer, A. Machanavajjhala TODS 2013 (sections 7,8,10 optional)
Optional:"No Free Lunch in Data Privacy",
D. Kifer, A. Machanavajjhala SIGMOD 2011
12
Wed, Oct 9
Smooth Sensitivity and Sample Aggregate Framework (slides)
"Smooth Sensitivity and Sampling in Private Data Analysis", K. Nissim, S. Raskhodnikova, A. Smith STOC 2007
Optional: "Privacy-preserving statistical estimation with optimal convergence rates", A. Smith STOC 2011
13
Fri Oct 11
Optimizing utility for differential privacy - constrained inference (slides)
(Deadline for submitting project proposal)
"Boosting the Accuracy of Differentially Private Histograms
Through Consistency
", M. Hay, V. Rastogi, G. Miklau, D. Suciu, PVLDB 2010
14
Wed, Oct 16 Optimizing utility for differential privacy - Query strategies (Hierarchical, Wavelet and Matrix Mechanisms) (slides)

"Differential Privacy via Wavelet Transforms",
X. Xiao, G. Wang, J. Gehrke ICDE 2009
Optional: "Optimizing Linear Counting Queries Under Differential Privacy"
C. Li, M. Hay, V. Rastogi, G. Miklau, A. McGregor PODS 2010
15
Fri, Oct 18 Sparse Vector Technique

"The Sparse Vector Technique" Lecture Notes, Aaron Roth
16
Wed, Oct 23
Data Dependent Query Optimization & Multiplicative Weights Algorithms (slides) (proofs)

"A simple and practical algorithm for differentially private data release", F. McSherry, K. Ligett, M. Hardt, NIPS 2012
Optional: "A Multiplicative Weights Mechanism for Privacy Preserving Data Analysis" M. Hardt, G. Rothblum, FOCS 2010
17
Fri, Oct 25 Implementing Differential Privacy Privately (slides) "Airavat: Security and Privacy for MapReduce", I. Roy, S. Setty, A. Kilzer, V. Shmatikov, E. Witchel
"Differential Privacy Under Fire". Haeberlen, Pierce, and Narayan, 2011
"Gupt: Privacy Preserving Data Analysis Made Easy", P. Mohan, A. Thakurtha, E. Shi, D. Song, D. Culler, SIGMOD 2012
18
Wed, Oct 30
Answering n2+o(1) Counting Queries with Differential Privacy is Hard
(Guest Lecture: Janardhan Kulkarni)
J. Ullman, "Answering n2+o(1) Counting Queries with Differential Privacy is Hard", STOC 13


APPLICATIONS
19
Fri, Nov 1
Publishing Synthetic Data Theory (notes from Aaron Roth) & Practice (slides)
"Privacy: Theory meets practice on the map", A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, L. Vilhuber, ICDE 2008
Blum, Ligett, Roth "A leraning theory approach to non-interactive database privacy" STOC '08
20
Wed, Nov 6
Privacy in Genomics & Machine Learning (slides) Functional Mechanism: Regression Analysis under Differential Privacy
J.  Zhang, Z. Zhang, X. Xiao, Y. Yang, M. Winslett, PVLDB 2012
Optional: "Differentially Private Empirical Risk Minimization", K. Chaudhuri, C. Monteleoni, A. Sarwate JMLR 2011
21
Fri, Nov 8
Publishing Search Logs   (slides)
(Deadline for midterm report)
"User 4xxxxx9: Anonymizing Query Logs", E. Adar, WWW 2007
"Publishing Search Logs", M. Gotz, A. Machanavajjhala, G. Wang, X. Xiao, J, Gehrke, TKDE 2012
22
Wed, Nov 13
Targeted Advertising (slides) "Adnostic: Privacy Preserving Targeted Advertising", V. Toubiana, A. Narayanan, D. Boneh, H. Nissenbaum, S. Barocas NDSS 2010
"Privacy Violations Using Microtargeted Ads", A. Korolova JPC 2011

Fri, Nov 15
NO CLASS (work on project)

23
Wed, Nov 20
Cloaking/Mixes for Location Privacy "Location Privacy in Pervasive Computing", A. Beresford, F. Stajano, 2003
"Preserving privacy in gps traces via uncertainty-aware path cloaking", B. Hoh, M. Gruteser, H. Xiong , A. Alrabby, ACM CCS 2007
24
Fri, Nov 22
Privacy and Economics
"Selling privacy at auction" A. Ghosh, A. Roth EC 2011
Optional: "Approximately Optimal Auctions for Selling Privacy when Costs are Correlated with Data", L. Fleischer, Y. Lyu EC 2012

Wed, Nov 27
NO CLASS - thanksgiving break


Fri, Nov 29
NO CLASS - thanksgiving break
25
Wed, Dec 4
Project Presentations (1-3PM)
(Deadline for final report)

26
Fri, Dec 6
Project Presentations (1-3PM)