Event Archive

Algorithms for Analyzing Spatio-temporal Data

Ph. D. Defense
Speaker Name
Abhinandan Nath
Location
LSRC D344
Date and Time
-

Huge data sets are becoming ubiquitous. Beyond their size, many of these data sets are noisy, contain outliers, and are incomplete, which makes analyzing them challenging. We apply geometric techniques to tackle some of these challenges, with an emphasis on designing provably efficient algorithms.

People Tracking and Re-Identification from Multiple Cameras

Ph. D. Defense
Speaker Name
Ergys Ristani
Location
LSRC D344
Date and Time
-

In many surveillance or monitoring applications, one or more cameras view several people who move in an environment. Multi-person tracking amounts to using the videos from these cameras to determine who is where at all times. The problem is very challenging both computationally and conceptually. On the one hand, the amount of video to process is enormous while near real-time performance is desired. On the other hand, people's varying appearance due to lighting, occlusions, and viewpoint changes, together with unpredictable motion in blind spots, makes person re-identification challenging.

CANCELLED: Computational Social Choice Meets Databases

Data Seminar Series
Speaker Name
Benny Kimelfeld
Location
LSRC D344
Date and Time
-

I will describe a novel framework that aims to create bridges between the computational social choice and the database management communities. This framework enriches the tasks currently supported in computational social choice with a relational database context, thus making it possible to formulate sophisticated queries about voting rules, candidates, voters, issues, and positions. At the conceptual level, we give rigorous semantics to queries in this framework by introducing the notions of necessary answers and possible answers to queries.

Novel Approaches to DNA Computing

Ph. D. Defense
Speaker Name
Tianqi Song
Location
LSRC D344
Date and Time
-

This dissertation presents several novel architectures for DNA computing from different perspectives including analog DNA circuits, polymerase-based DNA logic circuits, and localized DNA-based biomolecular reaction networks on cancer cell membranes.

Policy Driven Data Sharing with Provable Privacy Guarantees

Ph. D. Defense
Speaker Name
Xi He
Location
North 311
Date and Time
-

Companies such as Google or Facebook collect a substantial amount of data about their users to provide useful services. The release of these datasets for general use can enable numerous innovative applications and research. However, such data contains sensitive information about users, and simple anonymization techniques have been shown to be ineffective at ensuring users’ privacy. These privacy concerns have motivated many leading technology companies and researchers to develop algorithms that share data with provable privacy guarantees, such as differential privacy.

Eliciting and Aggregating Information for Better Decision Making

Ph. D. Defense
Speaker Name
Rupert Freeman
Location
LSRC D344
Date and Time
-

Algorithms play an increasing role in informing human decisions. We consider two classes of problems where this is the case. First, we discuss the design of algorithms for shared ownership. We extend the standard fair division framework to allow resources to be public, meaning that multiple agents can benefit from them simultaneously, and problems to be online, rather than one-shot. Second, we consider the problem of eliciting information for probabilistic forecasting.

End of Moore's Law Challenges and Opportunities: Computer Architecture Perspectives for the Post-ISA Era

Duke Computer Science/Electrical Computer Engineering Colloquium
Speaker Name
Margaret Martonosi
Location
Teer 106
Date and Time
-

For decades, Moore’s Law and its partner Dennard Scaling have driven technology trends that have enabled exponential performance improvements in computer systems at manageable power dissipation. With the slowing of Moore/Dennard improvements, designers have turned to a range of approaches for extending scaling of computer systems performance and power efficiency. Unfortunately, these scaling gains come at the expense of degraded hardware-software abstraction layers, increased complexity at the hardware-software interface, and increased challenges for software relia

Quantitative equational reasoning

Duke Computer Science Colloquium
Speaker Name
Prakash Panangaden
Location
North 311
Date and Time
-

Reasoning with equations is a central part of mathematics. Typically we think of solving equations but another role they play is to define algebraic structures like groups or vector spaces. Equational logic was formalized and developed by Birkhoff in the 1930s and led to a subject called universal algebra. Universal algebra was used in formalizing concepts of data types in computer science. In this talk I will present a quantitative analogue of equational logic: we write expressions like s =_ε t with the intended interpretation "s is within ε of t".
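To make the reading "s is within ε of t" concrete, here is a sketch of the inference rules one would expect such judgments to satisfy if ε behaves like a distance bound. These rules simply mirror the axioms of a metric and are an illustrative assumption, not necessarily the exact rule set presented in the talk.

```latex
\begin{align*}
                                   &\vdash t =_{0} t                               && \text{(reflexivity)}\\
s =_{\varepsilon} t                &\vdash t =_{\varepsilon} s                     && \text{(symmetry)}\\
s =_{\varepsilon} t,\ t =_{\varepsilon'} u &\vdash s =_{\varepsilon+\varepsilon'} u && \text{(triangle inequality)}\\
s =_{\varepsilon} t                &\vdash s =_{\varepsilon+\varepsilon'} t \quad (\varepsilon' > 0) && \text{(weakening)}
\end{align*}
```

Under this reading, ordinary equational reasoning corresponds to the special case ε = 0.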

Data for Good: Data Science at Columbia University

Triangle Computer Science Distinguished Lecturer Series
Speaker Name
Jeannette Wing
Location
D106 LSRC, Duke (telecast from UNC)
Date and Time
-

Every field has data. We use data to discover new knowledge, to interpret the world, to make decisions, and even to predict the future. The recent convergence of big data, cloud computing, and novel machine learning algorithms and statistical methods is causing an explosive interest in data science and its applicability to all fields. This convergence has already enabled the automation of some tasks that better human performance. The novel capabilities we derive from data science will drive our cars, treat disease, and keep us safe.

Data-Intensive Systems for the Social Sciences

Duke Computer Science Colloquium
Speaker Name
Michael Cafarella
Location
LSRC D106
Date and Time
-

The social sciences are crucial for deciding billions in spending, and yet are often starved for data and badly underserved by modern computational tools. Building data-intensive systems for social science workloads holds the promise of enabling exciting discoveries in both computational and domain-specific fields, while also making an outsized real-world impact.

Building neural network models that can reason

Triangle Computer Science Distinguished Lecturer Series
Speaker Name
Christopher Manning
Location
D106 LSRC, Duke (telecast from UNC)
Date and Time
-

Deep learning has had enormous success on perceptual tasks but still struggles in providing a model for inference. To address this gap, we have been developing Memory-Attention-Composition networks (MACnets). The MACnet design provides a strong prior for explicitly iterative reasoning, enabling it to support explainable, structured learning, as well as good generalization from a modest amount of data. The model builds on the great success of existing recurrent cells such as LSTMs: a MACnet is a sequence of a single recurrent Memory, Attention, and Composition (MAC) cell.

Data Center Scheduling

Duke Computer Science Colloquium
Speaker Name
Samir Khuller
Location
LSRC D106
Date and Time
-

Data centers have emerged as one of the dominant forms of cloud computing, and they raise several interesting new scheduling questions. In this survey talk, we discuss several problems related to scheduling in data centers. The talk covers job scheduling problems related to the utilization efficiency of VMs, along with questions dealing with basic communication issues that arise when multiple competing applications are running. We will also briefly discuss questions related to scheduling across multiple data centers.

Improving Understanding and Exploration of Data by Non-Database Experts

Data Seminar Series
Speaker Name
Rachel Pottinger
Location
North 311
Date and Time
-

Users are faced with an increasing onslaught of data, whether they are choosing movies to watch, assimilating data from multiple sources, or finding information relevant to their lives on open data registries. In this talk I discuss some recent and ongoing work on how to improve understanding and exploration of such data, particularly by users with little database background.

If it ain't broke, don't fix it: Sparse metric repair

Duke Computer Science/Mathematics Colloquium
Speaker Name
Anna Gilbert
Location
LSRC D106
Date and Time
-

Many modern data-intensive computational problems either require, or benefit from, distance or similarity data that adhere to a metric: the algorithms run faster or have better performance guarantees. Unfortunately, in real applications, the data are messy and values are noisy, so the distances between the data points are far from satisfying a metric. Indeed, there are a number of different algorithms for finding the closest set of distances to the given ones that also satisfy a metric (sometimes with the extra condition of being Euclidean).
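As a small illustration of what "failing to satisfy a metric" means in practice, the following Python sketch lists the triangle-inequality violations in a symmetric distance matrix and shows a single-entry change that removes them. The function name `triangle_violations` and the toy matrix are hypothetical, chosen only for this example; this is not the repair algorithm from the talk.

```python
from itertools import permutations


def triangle_violations(d, tol=1e-12):
    """Return triples (i, j, k) with i < j where d[i][j] > d[i][k] + d[k][j].

    `d` is assumed to be a symmetric matrix (list of lists) with a zero diagonal.
    """
    n = len(d)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if i < j and d[i][j] > d[i][k] + d[k][j] + tol
    ]


# Toy data: the direct distance 0 -> 2 is longer than the detour through 1.
d = [
    [0, 1, 5],
    [1, 0, 1],
    [5, 1, 0],
]
before = triangle_violations(d)

# A sparse repair changes as few entries as possible; here one distance suffices.
d[0][2] = d[2][0] = 2
after = triangle_violations(d)

print(before, after)
```

Here only the distance between points 0 and 2 is modified, which matches the intuition behind a sparse repair: leave the distances that are not broken alone.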

Image Imputation

Triangle Computer Science Distinguished Lecturer Series
Speaker Name
Polina Golland
Location
D106 LSRC, Duke (telecast from UNC)
Date and Time
-

We present an algorithm for creating high resolution anatomically plausible images that are consistent with acquired clinical brain MRI scans with large inter-slice spacing. Although large databases of clinical images contain a wealth of information, medical acquisition constraints result in sparse scans that miss much of the anatomy. These characteristics often render computational analysis impractical as standard processing algorithms tend to fail when applied to such images.

A Framework for Interactive Learning

Algorithms Seminar
Speaker Name
David Kempe
Location
North 311
Date and Time
-

In many settings, learning algorithms are deployed "in the wild", where their output is used before the classifier has been fully trained. In this online context, a natural model of interaction is Angluin's (1988) Equivalence Query model: the algorithm proposes an output classifier; the user responds with either the feedback that the classifier is correct, or otherwise provides the algorithm with a point that is mislabeled. The algorithm takes this feedback into account in producing a new, potentially very different, classifier, and the process repeats.
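The interaction loop described above can be sketched in a few lines of Python. This is a minimal toy example under stated assumptions: the hypothesis class (integer threshold classifiers), the simulated user, and the names `make_threshold`, `make_user`, and `learn` are all hypothetical and are not taken from the talk.

```python
from typing import Callable, Optional, Tuple

Classifier = Callable[[int], bool]


def make_threshold(t: int) -> Classifier:
    """Threshold classifier: label x positive exactly when x >= t."""
    return lambda x: x >= t


def make_user(target_t: int, domain: range):
    """Simulated user who knows the target and answers equivalence queries."""
    target = make_threshold(target_t)

    def answer(proposed: Classifier) -> Optional[Tuple[int, bool]]:
        # Return None if the proposal is correct on the whole domain,
        # otherwise a mislabeled point together with its true label.
        for x in domain:
            if proposed(x) != target(x):
                return x, target(x)
        return None

    return answer


def learn(domain: range, user) -> Tuple[int, int]:
    """Propose classifiers until the user accepts; return (threshold, queries)."""
    lo, hi = domain.start, domain.stop  # thresholds still consistent with feedback
    queries = 0
    while True:
        t = (lo + hi) // 2
        queries += 1
        feedback = user(make_threshold(t))
        if feedback is None:
            return t, queries
        x, label = feedback
        if label:
            hi = x       # x is truly positive, so the target threshold is <= x
        else:
            lo = x + 1   # x is truly negative, so the target threshold is > x


domain = range(0, 16)
learned, num_queries = learn(domain, make_user(11, domain))
print(learned, num_queries)
```

Each counterexample shrinks the set of thresholds consistent with the feedback so far, so the loop terminates once the proposal agrees with the target on the whole domain.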

Find, Rank, Diversify Facts for Duke Man Basketball

Master's Defense
Speaker Name
Sitong Che
Location
LSRC D344
Date and Time
-

Basketball is always a hot topic at Duke. Using Duke men's basketball data, we aim to find the highlights and interesting facts of a game and let fans share them easily on social networks. The ultimate goal of this project is to automatically tweet the most interesting facts about a game right after it finishes. To achieve that goal, we analyze the performances to find interesting claims efficiently, rank them by impressiveness, and ensure the diversity of our final tweets.

Neighboring Vehicle Behavior Prediction Using A Gated Recurrent Unit Neural Network

Master's Defense
Speaker Name
Yesenia Velasco
Location
LSRC D344
Date and Time
-

Research in vehicular networks has intensified due to their promise of reducing vehicle collisions, providing driver assistance, and increasing fuel economy. A main component of vehicular networks is Vehicle-to-Vehicle (V2V) communication, which allows vehicles to share their status information by frequently broadcasting their state, such as current speed or brake pressure. Sharing status information makes other drivers aware of a vehicle's location and immediate intent, but not necessarily its intended route or future behavior.

Effects of t-SNE and Diffusion Kernel Embedding on High-dimensional Data

Master's Defense
Speaker Name
Zuoming Dai
Location
LSRC D344
Date and Time
-

Dimensionality reduction is indispensable to clustering and categorization of high-dimensional data. The t-Distributed Stochastic Neighbor Embedding (t-SNE) focuses on preserving the stochastic neighbor relationship, while diffusion kernel embedding aims to preserve the local and global geometric information. In both cases, the data are then classified in the dimension-reduced space.