Design of Stable Algorithms for Privacy and Learning

CompSci 590.03, Fall 2016

Instructor: Ashwin Machanavajjhala
When: 1:25 PM - 2:40 PM Tuesdays and Thursdays
Where: LSRC A155
Office Hours: By appointment (send email to firstname@cs.duke.edu)


Announcements:

Synopsis: This course focuses on the design of differentially private algorithms and their applications to privacy and learning.

Our daily lives are actively being monitored through web browsers, social networks, wearable devices, and even robots. These data are routinely analyzed by statistical and machine learning tools to infer aggregate patterns of our behavior (with applications to science, medicine, advertising, etc.). However, these data also contain private information about us that we would not like revealed to others (e.g., medical history, sexual orientation, locations visited, etc.). A grand challenge that pervades all fields of computer science (and scientific research more generally) is: how to learn from data collected from individuals while provably ensuring that private properties of individuals are not revealed by the results of the learning process.

A recent breakthrough is the formulation of differential privacy, a stability condition on algorithms: an algorithm is differentially private if its output is insensitive to (small) changes in the input. Differentially private algorithms have been used to learn, with provable privacy guarantees, from data in varied domains (e.g., social science, medicine, communications) and of varied modalities (e.g., tables, graphs, streams), and are used by government organizations and internet corporations like Google and Apple to collect and analyze data. Differential privacy has also been shown to help design better learning algorithms.
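To make this stability condition concrete before the first readings: the canonical building block is the Laplace mechanism, which answers a numeric query by adding noise scaled to the query's sensitivity. The sketch below is only an illustration and not part of the course materials; the function name and the counting-query setting are assumptions made for the example.

    import numpy as np

    def laplace_mechanism(true_answer, epsilon, sensitivity=1.0):
        # Adding Laplace(sensitivity / epsilon) noise to a query whose value
        # changes by at most `sensitivity` when one person's record is added
        # or removed satisfies epsilon-differential privacy.
        noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
        return true_answer + noise

    # Example: privately release a counting query (sensitivity 1) over a dataset.
    exact_count = 437
    print(laplace_mechanism(exact_count, epsilon=0.5))

Smaller epsilon means stronger privacy but noisier answers; the sensitivity and composition tools covered in Module 2 build on this mechanism.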

We will study both the theory and practice of designing differentially private algorithms, and their applications to data arising from real world systems.

Prerequisites: The course is open to interested graduate and undergraduate students with sufficient mathematical maturity. Basic knowledge in algorithms, proof techniques, and probability will be assumed. Familiarity with databases and machine learning would help but is not necessary.

Format:

  Tentative Syllabus:
  • Intro to algorithm stability, privacy and learning (4 lectures)
  • Differential privacy (DP) and Basic building blocks (4 lectures)
  • Answering statistical counting queries under DP (4 lectures)
  • Data mining under DP (4 lectures)
  • DP and privacy (4 lectures)
  • DP learning (4 lectures)

#  |  Date  |  Topic  |  Readings


MODULE 1: Introduction

1
Tue, Aug 30
Overview on Privacy & Learning (slides)
"A Face Is Exposed for AOL Searcher No. 4417749", New York Times, Aug 2006
"How Companies Learn Your Secrets", New York Times, Feb 2012
2
Thu, Sep 1
Intro to Differential Privacy (slides)
"Calibrating Noise to Sensitivity in Private Date Analysis"
C. Dwork, F. McSherry, K. Nissim, A.Smith TCC 2006
Optional: "Differential Privacy"
C. Dwork, ICALP 2006

Tue, Sep 6
NO CLASS


Thu, Sep 8
NO CLASS

3
Tue, Sep 13
What is (and is not) privacy? (slides)

4
Thu, Sep 15
Privacy and composition (slides)
"Revealing information while preserving privacy"
I. Dinur, K. Nissim, PODS 2003
Optional: "Composition Attacks and Auxiliary Information in Data Privacy"
S. Ganta, S. Kasiviswanathan, A. Smith KDD 2008
5
Tue, Sep 20
Differential Privacy and Privacy (slides)
"No Free Lunch in Data Privacy",
D. Kifer, A. Machanavajjhala SIGMOD 2011


MODULE 2: DP Algorithms

6
Thu, Sep 22
Building Blocks (slides)
"The algorithmic foundations of Differential Privacy", Roth, Dwork (sections 3.3, 3.4, 3.5)
7
Tue, Sep 27
Smooth Sensitivity and the Sample-and-Aggregate Framework (slides)
"Smooth Sensitivity and Sampling in Private Data Analysis", K. Nissim, S. Raskhodnikova, A. Smith, STOC 2007
8
Thu, Sep 29
Answering sets of queries (slides)
(Deadline for choosing project topics)
"Boosting the Accuracy of Differentially Private Histograms Through Consistency", M. Hay, V. Rastogi, G. Miklau, D. Suciu, PVLDB 2010
"Differential Privacy via Wavelet Transforms", X. Xiao, G. Wang, J. Gehrke, ICDE 2009
9
Tue, Oct 4
Matrix Mechanism and Data dependent algorithms (slides)


10
Thu, Oct 6
Multiplicative Weights and Differential Privacy (slides)
"A Simple and Practical Algorithm for Differentially Private Data Release", M. Hardt, K. Ligett, F. McSherry, NIPS 2012

Tue, Oct 11
NO CLASS: FALL BREAK

11
Thu, Oct 13
Sparse Vector Technique and online query answering (slides)
(Deadline for submitting project proposal)
"The algorithmic foundations of Differential Privacy", Roth, Dwork (sections 3.6, 4.2)
Optional: "A Multiplicative Weights Mechanism for Privacy Preserving Data Analysis" M. Hardt, G. Rothblum, FOCS 2010
12
Tue, Oct 18
Customizing Privacy to Applications
"Pufferfish: A Framework for Mathematical Privacy Definitions",
D. Kifer, A. Machanavajjhala, TODS 2013 (sections 7, 8, 10 optional)


MODULE 3: Real world use cases and applications

13
Thu Oct 20
Privately collecting data from individuals
"RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response", Erlingsson, Pihur, Korolova, ACM CCS 2014
14
Tue, Oct 25
Releasing Census Data
"Privacy: Theory meets practice on the map", A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, L. Vilhuber, ICDE 2008
"
Formal privacy protection for data products combining individual and employer frames" Haney et al
15
Thu, Oct 27
Location Privacy

16
Tue, Nov 1
Privacy in graphs

"Private Analysis of Graph Structure", Karwa et al VLDB 2011
"Publishing Graph Degree Distribution with Node Differential Privacy", Wei-Yen Day et al, SIGMOD 2016
17
Thu, Nov 3
Implementing Differential Privacy
"Differential Privacy Under Fire", A. Haeberlen, B. Pierce, A. Narayan, USENIX Security 2011
"On the significance of least significant bits for differential privacy", I. Mironov, CCS 2012


MODULE 4: DP & Learning


18
Tue, Nov 8
Model Inversion: Privacy attacks on Learning
(Deadline for submitting mid-term report)
"Privacy in Pharmacogenetics: An end-to-end use case study of personalized warfarin dosing", Fredrickson et al USENIX Security 2014
19
Thu, Nov 10
Classification and Regression under Differential Privacy
"Differentially Private Empirical Risk Minimization", K. Chaudhuri, C. Monteleoni, A. Sarwate, JMLR 2011
20
Tue, Nov 15
DP + Deep Learning
"Deep Learning with Differential Privacy", Abadi et al., CCS 2016
21
Thu, Nov 17
What can you learn privately?
"What Can We Learn Privately?", Kasiviswanathan et al., FOCS 2008
22
Tue, Nov 22
Adaptive Analysis
"Preserving Statistical Validity in Adaptive Data Analysis", Dwork et al., STOC 2015

Thu, Nov 24
NO CLASS: THANKSGIVING

23
Tue, Nov 29
Stability and generalization

Thu, Dec 1
FINAL PRESENTATIONS