Overlook: Differentially Private Exploratory Visualization for Big Data

Pratiksha Thaker
The talk will be held virtually on Zoom.

Data exploration systems that provide differential privacy must manage a privacy budget that measures the amount of privacy lost across multiple queries. One effective strategy to manage the privacy budget is to compute a one-time private synopsis of the data, to which users can make an unlimited number of queries. However, existing systems using synopses are built for offline use cases, where a set of queries is known ahead of time and the system carefully optimizes a synopsis for it. The synopses that these systems build are costly to compute and may also be costly to store.

We introduce Overlook, a system that enables private data exploration at interactive latencies for both data analysts and data curators. The key idea in Overlook is a virtual synopsis that can be evaluated incrementally, without extra storage or expensive precomputation. Overlook simply executes queries using an existing engine, such as a SQL DBMS, and adds noise to their results. Because Overlook's synopses do not require costly precomputation or storage, data curators can also use Overlook to explore the impact of privacy parameters interactively. Overlook offers a rich visual query interface based on the open-source Hillview system. Overlook achieves accuracy comparable to existing synopsis-based systems, while offering better performance and removing the need for extra storage.
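
As a rough illustration of the "run the query, then add noise" idea, the Python sketch below perturbs the counts of a single fixed histogram with Laplace noise. It is not Overlook's actual construction. The function names, the secret key, and the choice to derive each bucket's noise from a keyed hash (so that repeating a query returns the same answer without storing anything) are assumptions made for this example. It also assumes each record falls in exactly one bucket, so the sensitivity is 1, and it does not address how budget is shared across different bucketings. Python's `random` module is not a cryptographically secure source of noise, so this is for illustration only.

```python
import hashlib
import hmac
import random


def laplace_noise(key: bytes, bucket_id: str, scale: float) -> float:
    """Return a Laplace(0, scale) sample determined by (key, bucket_id).

    Hypothetical helper: the same key and bucket always give the same noise,
    so no noisy values need to be stored between queries.
    """
    digest = hmac.new(key, bucket_id.encode("utf-8"), hashlib.sha256).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # The difference of two i.i.d. exponentials with rate 1/scale
    # is Laplace-distributed with the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(true_counts, epsilon: float, key: bytes, query_id: str = "hist"):
    """Add Laplace noise with scale 1/epsilon to each bucket count.

    Assumes sensitivity 1: adding or removing one record changes
    exactly one bucket by 1.
    """
    scale = 1.0 / epsilon
    return [
        count + laplace_noise(key, f"{query_id}:{i}", scale)
        for i, count in enumerate(true_counts)
    ]


if __name__ == "__main__":
    # In a real deployment the true counts would come from the existing
    # query engine (for example, a GROUP BY over the bucketed column).
    counts = [120, 45, 300, 8]
    secret = b"curator-held secret key"  # assumption: kept private by the curator
    print(noisy_histogram(counts, epsilon=0.5, key=secret))
    # Asking again yields identical values, since the noise is keyed.
    print(noisy_histogram(counts, epsilon=0.5, key=secret))
```

Lowering `epsilon` widens the noise, which is the kind of trade-off a data curator might want to inspect interactively before fixing the privacy parameters.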

Pratiksha is a Ph.D. student in computer science at Stanford, advised by Matei Zaharia.

She is interested in applying statistical and algorithmic techniques to build usable, secure, and adaptive systems. She has most recently built a system for fast, differentially private data exploration, and a system that makes distributed shared memory efficient and user-friendly.

She graduated from MIT with a B.S. in computer science and mathematics (2014) and an M.Eng. in computer science (2015). At MIT, she worked on projects in computer networking, cognitive science, and natural language processing, and also thought a bit about diversity.

