The Coin Problem with Applications to Data Streams
Consider the problem of computing the majority of a stream of n i.i.d. uniformly random bits. This problem, known as the coin problem, is central to a number of counting problems in different data stream models. We show that any streaming algorithm for solving this problem with large constant advantage (over the uniform distribution) must use Ω(log n) bits of space. Previously, it was known that computing the majority on every input with a constant probability takes Ω(log n) space. We extend our lower bound to proving tight lower bounds for solving multiple, randomly interleaved copies of the coin problem, as well as for solving the OR of multiple copies of a variant of the coin problem. Our proofs involve new measures of information complexity that are well-suited for data streams.
We use these lower bounds to obtain a number of new results for data streams. In each case there is an underlying d-dimensional vector x with additive updates to its coordinates given in a stream of length m. The input streams arising from our coin lower bound have nice distributional properties, and consequently for many problems for which we only had lower bounds in general turnstile streams, we now obtain the same lower bounds in more natural models, such as the bounded deletion model, in which ||x||_2 never drops by a constant fraction of what it was earlier, or in the random order model, in which the updates are ordered randomly.
Based on joint work with Mark Braverman and David P. Woodruff.
Sumegha Garg is a Rabin Postdoctoral Fellow at Harvard Theory of Computation group. She completed her PhD in Computer Science at Princeton University, advised by Mark Braverman. Before that, she received her bachelor’s degree in Computer Science and Engineering from Indian Institute of Technology, Delhi. She is broadly interested in Complexity Theory and Algorithmic Fairness, with a focus on space complexity and theory of fair predictions.