Event Archive

Truthful Aggregation of Budget Proposals

CS-ECON Seminar Series
Speaker Name
Rupert Freeman
Location
LSRC D344
Date and Time
-

We consider a participatory budgeting problem in which each voter submits a proposal for how to divide a single divisible resource (such as money or time) among several possible alternatives (such as public projects or activities) and these proposals must be aggregated into a single consensus division.

Understanding Events in Natural Language: Learning, Common Sense, Annotation, and What's Next

Duke Computer Science/Electrical Computer Engineering Colloquium
Speaker Name
Qiang Ning
Location
LSRC D106
Date and Time
-

The era of information explosion has opened up an unprecedented opportunity to study the social, political, financial and medical events described in natural language text. While the past decades have seen significant progress in deep learning and natural language processing (NLP), it is still extremely difficult to analyze textual data at the event level, e.g., to understand what is going on, what the cause and impact are, and how things will unfold over time.

Perturbation Analysis of Database Queries

Ph.D. Defense
Speaker Name
Brett Walenz
Location
LSRC D344
Date and Time
-

Data-driven decision making plays a dominant role across all domains, from health, business, and government to sports. These data-driven decisions are often ad hoc and resource-intensive: a bank has to compare and analyze all users, and a sporting event might use previous events to estimate an acceptable ticket sales rate. In this dissertation, I describe efficient methods for optimizing complex analytic queries.

Some Mathematical and Computational Challenges Arising in Structural Molecular Biology

Special Talk
Speaker Name
Bruce Donald
Location
Phy 119 or Phy 101
Date and Time
-

Computational protein design is a transformative field with exciting prospects for advancing both basic science and translational medical research. New algorithms blend discrete and continuous mathematics to address the challenges of creating designer proteins. I will discuss recent progress in this area and some interesting open problems.

Evolution or Revolution in Software Development

Duke Computer Science Colloquium
Speaker Name
Andy Palay
Location
LSRC D106
Date and Time
-

When do you allow for the evolution of existing software versus throwing it all away and starting over? This is a question that all software developers will face many times during their careers, in both small and large ways. Too often there is a strong urge to start over with a clean slate, getting rid of all the cruft that has built up over the years. Unfortunately, the decision to succumb to this urge is taken with limited guidance or clear reasoning.

Systems Research -- Construed Broadly

Triangle Computer Science Distinguished Lecturer Series
Speaker Name
Margo Seltzer
Location
LSRC D106
Date and Time
-

Once upon a time, Computer Systems was a broad field encompassing everything from hardware to software. The incredible growth and success that our field has experienced over the past half century has had the side effect of transforming systems into a constellation of siloed fields. I'm going to make the case that we should return to a broad interpretation of systems, undertake bolder, higher-risk projects, and be intentional about how we interact with other fields. I'll support the case with examples of several research projects that embody this approach.

Computer Science Undergraduate Project Showcase 2019

Special Event
Location
LSRC Hall of Science (B wing), Duke University
Date and Time
-

The Project Showcase is held annually by the Computer Science Department to highlight independent or team-based research and project work done during the academic year. The best projects in each category will be decided by faculty judges and announced towards the end of the event. Come explore the impressive efforts of our students and their advisors!

Multi-Dimensional Robust Synthetic Control: Exploring Counterfactuals and Predicting Cricket Scores

Triangle Computer Science Distinguished Lecturer Series
Speaker Name
Devavrat Shah
Location
LSRC D106
Date and Time
-

The “what ifs?”, or the ability to explore counterfactuals, is central to the study of causal inference. Randomized control and A/B testing provide an approach to address this when counterfactuals can be tested simultaneously. However, in a large number of scenarios such as policy evaluation, this is not feasible: we can’t have two Massachusetts, one with gun control and the other without, at the same time, so that we can evaluate the impact of gun control on the crime rate!

Integrating MNase-seq and RNA-seq Time Series Data to Study Dynamic Chromatin and Transcriptional Regulation Under Cadmium Stress

Master's Defense
Speaker Name
Trung Tran
Location
LSRC D344
Date and Time
-

Though the sequence of the genome is essentially fixed, within each cell it exists in a complex and changing state, determined in part by the dynamic binding of proteins. These proteins—including nucleosomes, transcription factors (TFs), polymerases, and other complexes—define the living chromatin state of the genome. Understanding genome-wide how the dynamics of chromatin interact with the dynamics of transcriptional regulation remains a fundamental research problem.

Improving Distributed Transactional Storage Performance through Remote Direct Memory Access

Master's Defense
Speaker Name
Richard (Siyu) Chen
Location
LSRC D309
Date and Time
-

The rapid development of cloud computing and enterprise IT demands that modern data centers be less expensive and more efficient. Alongside improvements in software design such as concurrency control, caching, and leases, we also need to exploit emerging network connection technologies to reduce overhead. Remote Direct Memory Access (RDMA), a networking technology originally used in High-Performance Computing (HPC), is a trending technology for inter-node connection.

NanoMine Data Analysis

Master's Defense
Speaker Name
Zhao Chen
Location
North 311
Date and Time
-

This project attempted to uncover hidden patterns in nanocomposite data. Natural language processing techniques were used to analyze materials science papers in order to discover the relationships between different materials science terms. The project also tried to find the relationship between the glass-transition temperature (Tg) and other available features, and the relationship between the shape parameter and other variables such as matrix type. Several models were used, including LASSO regression, a Support Vector Machine with a Gaussian kernel, and a Decision Tree.

Is Multi-task Learning (MTL) Always Helpful to Improve the Original Model’s Performance?

Master's Defense
Speaker Name
Cheng Chen
Location
North 311
Date and Time
-

Recently, convolutional neural networks (CNNs) have become popular for image recognition tasks due to their excellent performance compared to earlier approaches. One limitation of CNNs, however, is that they require substantial quantities of hand-labeled training imagery compared to other models before they achieve their performance advantage. To address this, multi-task learning (MTL) has been proposed, in which a single CNN is trained to perform several recognition tasks simultaneously.

Parallelizing Factlet Mining from Duke Basketball Game Statistics using Apache Spark

Master's Defense
Speaker Name
Wenqian Tong
Location
LSRC D344
Date and Time
-

Given statistics about a basketball game, we would like to generate interesting factlets about players' performances, e.g., "in NCAA tournament game on March 29, Rowan Barrett became the first player to have at least 11 assists in a game against Virginia Tech in Duke history." Such factlets are often used in media reporting and for fan engagement. Time is of the essence for this application, yet finding all such factlets is a time-consuming task. In this project, we use Apache Spark on Google Cloud to parallelize the analysis so it can be completed quickly and economically.
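
As a rough illustration of the kind of check behind one such factlet, the sketch below tests whether a new game line is the first in the recorded history to reach a given assist threshold against a particular opponent. It is plain single-machine Python rather than Spark, and the `GameLine` record and the `is_first_with_at_least` function are hypothetical names introduced only for this example; the project's actual data model and mining strategy are not described in the abstract.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GameLine:
    """Hypothetical per-player, per-game record (illustrative fields only)."""
    player: str
    opponent: str
    assists: int


def is_first_with_at_least(
    history: Iterable[GameLine], new_line: GameLine, threshold: int
) -> bool:
    """Return True if new_line reaches `threshold` assists and no earlier
    line against the same opponent did."""
    if new_line.assists < threshold:
        return False
    return not any(
        g.opponent == new_line.opponent and g.assists >= threshold
        for g in history
    )


# Made-up history for demonstration purposes.
history = [
    GameLine("Player A", "Virginia Tech", 9),
    GameLine("Player B", "North Carolina", 12),
]
candidate = GameLine("Player C", "Virginia Tech", 11)
print(is_first_with_at_least(history, candidate, threshold=11))
```

Each candidate factlet requires a scan over the history, and many thresholds, statistics, and opponents can be tested independently, which is one reason this kind of workload lends itself to parallel execution.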

Optimization of Factlet Mining from Duke Basketball Game Statistics

Master's Defense
Speaker Name
Bruce (Qian) Wang
Location
LSRC D344
Date and Time
-

In this project, we optimize a system for automatically mining interesting "factlets" from basketball game statistics. After each Duke Men's basketball game, we examine individual players' performance in the context of all historical data and generate noteworthy statements such as "in the ACC tournament game vs.

Student Paths in CS1: Case Studies of Initial Poor Performers

Master's Defense
Speaker Name
Ji Yeon Kim
Location
LSRC D344
Date and Time
-

With the surge in computer science enrollment at universities in the last decade, improving pedagogy in the field has increasing value and wide-reaching effects. This improvement is especially useful in introductory computer science courses (CS1). Student experience in the first programming course is known to heavily influence students' desire to stay in the field.

Neural Knowledge Representation and Reasoning

Duke Computer Science Colloquium
Speaker Name
Patrick Verga
Location
LSRC D106
Date and Time
-

Making complex decisions in areas like science, government policy, finance, and clinical treatment requires integrating and reasoning over disparate data sources. While some decisions can be made from a single source of information, others require considering multiple pieces of evidence and how they relate to one another.

Interpretable Almost-Matching Exactly with Instrumental Variables

Master's Defense
Speaker Name
Yameng Liu
Location
LSRC D344
Date and Time
-

We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. The method proposed in this work aims to match units on a weighted Hamming distance, taking into account the relative importance of the covariates. To match units on as many relevant variables as possible, the algorithm creates a hierarchy of covariate combinations on which to match (similar to downward closure), in the process solving an optimization problem for each unit in order to construct the optimal matches.
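
To make the distance concrete, here is a minimal sketch of a weighted Hamming distance between two units with categorical covariates, followed by a brute-force search for the closest control unit. The function names and the example weights are assumptions made for illustration. The brute-force search is a simplification and is not the hierarchy-based optimization the abstract describes.

```python
from typing import Hashable, Sequence


def weighted_hamming(
    a: Sequence[Hashable], b: Sequence[Hashable], weights: Sequence[float]
) -> float:
    """Sum the weights of the covariates on which units a and b disagree."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("units and weights must have the same length")
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def closest_control(
    treated: Sequence[Hashable],
    controls: Sequence[Sequence[Hashable]],
    weights: Sequence[float],
) -> int:
    """Return the index of the control unit nearest to `treated`
    (naive linear scan; ties go to the earliest index)."""
    return min(
        range(len(controls)),
        key=lambda i: weighted_hamming(treated, controls[i], weights),
    )


# Illustrative data: three categorical covariates, the first weighted most.
weights = (3.0, 2.0, 1.0)
treated = ("a", "b", "c")
controls = [("z", "b", "c"), ("a", "b", "y")]
print(closest_control(treated, controls, weights))
```

With these weights, a mismatch on the first covariate costs more than a mismatch on the third, so the second control unit is preferred even though both differ from the treated unit on exactly one covariate.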

Providing Secure Internet Services with Insecure Infrastructure

Duke Computer Science/Electrical Computer Engineering Colloquium
Speaker Name
Yixin Sun
Location
Fitzpatrick Center Schiciano Auditorium Side B
Date and Time
-

The insecurity of Internet services can lead to disastrous consequences – confidential communications can be monitored, financial information can be stolen, and our critical Internet infrastructure can be crippled. However, many prior works on Internet services only focus on the security of an individual network layer in isolation, whereas the adversaries do quite the opposite – they look for opportunities to exploit the interactions across heterogeneous components and layers to compromise the system security.

Discrete Optimization Meets Machine Learning

Duke Computer Science Colloquium
Speaker Name
Elias Khalil
Location
LSRC D106
Date and Time
-

Discrete optimization algorithms underlie intelligent decision-making in a wide variety of domains. From airline fleet scheduling to kidney exchanges and data center resource management, decisions are often modeled with binary on/off variables that are subject to operational and financial constraints.