With the widespread use of machine learning, relying on black box models for high-stakes decisions has had serious societal consequences, including flawed bail and parole decisions in criminal justice. Explanations for black box models are not reliable, and can be misleading.
Temporal data is ubiquitous in everyday life, but it tends to be noisy and often exhibits transient patterns. To make better decisions with data, we must avoid jumping to conclusions based on particular query results or observations. Instead, a useful perspective is to consider "durability", or, intuitively speaking, to look for results that are robust and stand "the test of time". This thesis studies durability queries on temporal data that return durable results efficiently and effectively.
The taxation principle guarantees that any single-agent mechanism can be interpreted, without loss of generality, as a menu-based mechanism (one that offers each buyer a menu: a mapping from their payment to their (expected) allocation). What loss in welfare does one incur in designing multiagent menu-based mechanisms in prior-free environments? We study this question for general single-dimensional valuations and give tight characterizations of the loss in welfare as a function of natural parameters governing the evolution of the value profile across rounds.
Canceled: Putting Ethical AI to the Vote
I will present the "virtual democracy" framework for the design of ethical AI. In a nutshell, the framework consists of three steps:
This thesis addresses the problem of all-to-all near-neighbor (all-NN) search among a large dataset of discrete points in a high-dimensional feature space endowed with a distance metric. Near-neighbor search is fundamental to numerous computational tasks arising in modern data analysis, modeling, and knowledge discovery. The accuracy and efficiency of near-neighbor search depend on the underlying structure of the dataset.
Poor accuracy has always been a challenge in neurosurgical procedures. The conventional method for procedures like external ventricular drain (EVD) insertion relies heavily on the surgeon's personal experience and familiarity with neuroanatomical features. Without guidance, this method often yields poor results because of anatomical variations among individual patients. The purpose of this project is to explore the feasibility of improving the accuracy of neurosurgical procedures like EVD insertion with external tracking solutions and a wearable augmented-reality holographic device.
As robot autonomy increases, new challenges arise in human-robot interaction (HRI) studies. Investigations of HRI involving different levels of autonomy (LOA) benefit both the operator and system performance. To provide a high level of semi-autonomous control in related lab experiments, this project develops a human-robot system featuring waypoint control. The system consists of an RGB-D camera, a robotic arm, and a graphical user interface (GUI).
Distributed denial of service (DDoS) attacks have grown into a major network security threat due to the recent proliferation of DDoS-for-hire services and IoT botnets. Despite the various efficient and deployable commercial and academic network-layer DDoS protection architectures, the threat remains active because the focus of DDoS attacks has shifted from network bandwidth to application-layer resources, which current defense solutions can hardly protect. In this paper, we present the Adjusted Deficit Round Robin (ADRR) algorithm as a solution to application-layer DDoS attacks.
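For readers unfamiliar with the scheduling idea that ADRR builds on, the following is a minimal sketch of textbook deficit round robin, not the ADRR algorithm itself (the abstract does not say how ADRR adjusts it). The per-client queues, the notion of a request "cost", and the function name `deficit_round_robin` are illustrative assumptions.

```python
from collections import deque


def deficit_round_robin(queues, quantum, rounds):
    """Textbook deficit round robin over per-client request queues.

    queues  -- dict mapping a client name to a list of request costs
               (hypothetical units of application-layer work)
    quantum -- credit added to each backlogged client per round
    rounds  -- number of scheduling rounds to simulate
    Returns the list of (client, cost) pairs in the order served.
    """
    pending = {client: deque(costs) for client, costs in queues.items()}
    deficit = {client: 0 for client in pending}
    served = []
    for _ in range(rounds):
        for client, queue in pending.items():
            if not queue:
                # Idle clients do not accumulate credit.
                deficit[client] = 0
                continue
            deficit[client] += quantum
            # Serve requests while the head of the queue fits the credit.
            while queue and queue[0] <= deficit[client]:
                cost = queue.popleft()
                deficit[client] -= cost
                served.append((client, cost))
            if not queue:
                deficit[client] = 0
    return served
```

In this sketch a client that submits expensive requests has to wait several rounds to accumulate enough credit, while clients with cheap requests are served promptly. That is the fairness property that makes round-robin variants a plausible fit for protecting application-layer resources.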
We investigate how to generate the most meaningful "factlets" from Duke University Basketball game statistics.
This project considers the problem of automatically converting questions and claims in natural language to potentially complex SQL queries over a domain-specific database. By "complex," we mean that the queries are analytical in nature, often involving aggregation and subqueries, as opposed to focusing on retrieval, which traditional natural language querying interfaces are designed for.
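To illustrate the distinction drawn here between retrieval and analytical queries, the sketch below contrasts the two on a toy table. The `games` table, its columns, and the data are hypothetical and are not taken from the project; the analytical query is just one possible translation of a claim such as "player A scores more than the average player."

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical per-game scoring table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE games (player TEXT, season INTEGER, points INTEGER)")
    conn.executemany(
        "INSERT INTO games VALUES (?, ?, ?)",
        [
            ("A", 2019, 20), ("A", 2019, 30),
            ("B", 2019, 10), ("B", 2019, 12),
            ("C", 2019, 25), ("C", 2019, 27),
        ],
    )
    return conn


# Retrieval: look up stored rows that match a condition.
RETRIEVAL = "SELECT points FROM games WHERE player = 'A' ORDER BY points"

# Analytical: aggregate per player and compare against a subquery
# that computes the average over all games.
ANALYTICAL = """
SELECT player, AVG(points) AS avg_points
FROM games
GROUP BY player
HAVING AVG(points) > (SELECT AVG(points) FROM games)
ORDER BY player
"""

if __name__ == "__main__":
    conn = build_demo_db()
    print(conn.execute(RETRIEVAL).fetchall())
    print(conn.execute(ANALYTICAL).fetchall())
```

The retrieval query only filters stored rows, whereas the analytical query needs grouping, an aggregate, and a nested subquery to check the claim.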
Transcriptional regulatory networks, composed of regulatory proteins and their target genes, control many aspects of cell development and physiology. One important class of regulatory proteins is transcription factors (TFs), which bind to DNA in a sequence-specific manner to regulate gene expression. In the human genome, TFs often bind in clusters, i.e., at two or more DNA binding sites in close proximity to each other. However, the binding of TFs to clusters of sites is not well understood.
In this project we study the problem of extracting semantic patterns from sentences to assist with several text analysis tasks, such as identifying checkworthy claims in text and parsing such claims into representations amenable to automatic fact-checking. We leverage NLP tools to parse each sentence into a tree representation, and replace each specific token with an appropriate, more general label useful for identifying its semantic role. Substructures in this parse tree serve as our patterns of interest.
Link to talk video: https://compsci.capture.duke.edu/Panopto/Pages/Viewer.aspx?id=f8b8abac-e471-4b79-826d-ab9001221efe
Advances in monitoring, tracing, and profiling large, complex datacenters produce rich datasets and establish a rigorous foundation for understanding datacenter performance. But the sheer volume and complexity of the data challenge existing techniques, which rely heavily on expert knowledge, human intervention, and simple statistics to gain performance insights.
Inspired by various applications including ad auctions, matching markets, and voting, mechanism design deals with the problem of designing algorithms that take inputs from strategic agents and return an outcome optimizing a given objective.
With the arrival of the information age, demand for online services has increased dramatically. This demand is pushing network systems to become more complex and making system availability a crucial requirement for both service providers and clients. Service providers aim to offer an effective, efficient, and stable service: the service should be failure-resilient, scale to support a large group of clients, and still maintain acceptable performance. Clients, in turn, need a “powerful” service: high performance without threats to their privacy or security.