Summer Undergraduate Projects in Computer Science

Interested in summer research? Like participating in team-based projects? Want to stay in Durham? Enjoy computer science and want to explore it in more depth?

If you answered yes to any of the above questions, you should apply to be a part of our full-time, ten-week summer experience. The program is designed for Computer Science majors at all levels, and we encourage applications from students early in their careers.

Students join small project teams (3-5 undergraduates per team), collaborating alongside other teams in a communal environment in the Computer Science department. Each team will have a graduate student project mentor and a faculty advisor. The projects (see below) come from a number of different areas spanning Computer Science. In addition to regular meetings with their mentor and advisor, students will participate in activities designed to support their introduction to research, often alongside students from the Data+ and Code+ programs.

Participants will receive a $5,000 stipend, out of which they must arrange their own housing and travel. Funding and infrastructure support are provided by the Dept. of Computer Science in partnership with the Office of Information Technology and the Information Initiative at Duke. Participants may not accept employment or take classes during the program; this requirement is strictly enforced and non-negotiable.

The program runs from May 28th until August 2nd, 2019.

Application materials to be submitted on the application form:

  1. Resume
  2. (Unofficial) transcript
  3. One paragraph, per project choice, about your interest in the project (up to 3 projects).
  4. Contact information for two references. (No letters required!)

Application deadline: February 25, 2019. Applications should be submitted via the shared application with the Code+ program.

Apply Now

Project Offerings for Summer 2019:

Kamesh Munagala
Automated Agenda Management

In collaboration with graduate students, political scientists, and computer scientists at Duke and Stanford, a team of students will build a video chat platform for groups of people to converse on a topic of civic interest, such as immigration policy or electoral reform at a national level, or more narrowly defined topics at a local or corporate level. The ultimate goal is to deploy the platform on a large scale to support civic discourse and help with societal decision making. The main challenge in scaling such a platform is moderating the conversation and making sure it follows an agenda.

The prototype of the platform builds on Twilio and dynamically generates transcripts of the conversation using the Google Speech-to-Text API. Students will build software that uses these transcripts to perform dynamic moderation, specifically including AI tools that detect whether the conversation is following the agenda. The project is fast-moving, and students may go on to use the transcripts for other moderation tasks. Students will collaborate with a team of graduate students at Duke and Stanford to think through the machine learning tools, JavaScript, and API integration needed, and how to implement them.
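
As a rough illustration of what checking a transcript against an agenda could look like, the sketch below scores one transcribed utterance against one agenda item using bag-of-words cosine similarity. This is not the project's method: the function names, the example text, and the 0.2 threshold are all assumptions made for illustration, and a real system would likely use a learned model instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def agenda_score(utterance, agenda_item):
    """Cosine similarity between the word counts of two strings (0.0 to 1.0)."""
    a = Counter(tokenize(utterance))
    b = Counter(tokenize(agenda_item))
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def is_on_agenda(utterance, agenda_item, threshold=0.2):
    """Hypothetical rule: on-topic if similarity reaches the threshold."""
    return agenda_score(utterance, agenda_item) >= threshold


if __name__ == "__main__":
    agenda = "immigration policy and border security"
    for line in [
        "I think immigration policy should change",
        "Did anyone watch the game last night",
    ]:
        print(f"{is_on_agenda(line, agenda)!s:5}  {line}")
```

Word overlap is only a baseline; it misses paraphrases and synonyms, which is one reason the project description points to machine learning tools.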

Jun Yang
Scaling Up Live "Pop-Up" Fact Checking

Our society is struggling with an unprecedented amount of falsehoods, hyperboles, and half-truths that do harm to democracy, health, economy, and national security. Fact-checking is a vital defense against this onslaught, perhaps now more than ever. Despite the rise of fact-checking efforts globally, fact-checkers find themselves increasingly overwhelmed and find it difficult to get their messages to some segments of the public. Our overall project seeks to leverage the power of data and computing to help make fact-checking and dissemination of fact-checks to the public more effective, scalable, and sustainable.

In particular, this summer team will support scalable “pop-up” fact-checking, which identifies previously checked claims in real time while media streams, social network feeds, and website contents are being consumed. The team will tackle the challenges that arise in deploying the system for live events and scaling it up to a large number of concurrent users. Some specific ideas include audio fingerprinting to identify live events being watched, aggregating live usage data to identify new check-worthy claims, and a subscription service to notify users as soon as a previously encountered claim is checked.
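
To make the idea of identifying previously checked claims concrete, here is a minimal sketch that matches an incoming sentence against a small table of checked claims using Jaccard similarity over word sets. It is an assumption-laden toy, not the project's system: the function names, the 0.5 threshold, and the placeholder claim and verdict are invented for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric word tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_checked_claim(sentence, checked_claims, threshold=0.5):
    """Return (claim, verdict) for the best match, or None if nothing is close.

    checked_claims maps claim text to a verdict string (hypothetical schema).
    """
    sentence_tokens = tokens(sentence)
    best, best_score = None, 0.0
    for claim, verdict in checked_claims.items():
        score = jaccard(sentence_tokens, tokens(claim))
        if score > best_score:
            best, best_score = (claim, verdict), score
    return best if best_score >= threshold else None


if __name__ == "__main__":
    # Placeholder data for illustration only, not a real fact-check.
    checked = {"the unemployment rate is at a record low": "placeholder verdict"}
    print(match_checked_claim("The unemployment rate is at a record low today", checked))
    print(match_checked_claim("Welcome everyone to tonight's debate", checked))
```

A linear scan like this would not scale to many concurrent users or a large claim database, which is part of the deployment challenge described above.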

Kristin Stephens-Martinez and Jeff Forbes
Breadcrumbs: Analyzing Classroom Data

Most classes use one or more tools (Gradescope, My Digital Hand, Piazza, Sakai, etc.) to keep the course running smoothly. When students use these tools, they create data at many levels (breadcrumbs) that could be collected and mined for insights on how to make the class better.

Students will be involved in formulating questions, analyzing data, and drawing conclusions. Some specific questions we currently seek to answer using this data include: How do students use office hours? Are there predictable trends that could change how UTAs are allocated to reduce office hour wait times? How do students seek help across all the available support resources? Do some forms of support help students more than others? Could we use those insights to create a recommender system for students, so they get better help more quickly? What are the predictors of success in the class? Could those predictors be influenced, so more students are successful in the class?
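
As one small example of the kind of analysis the office-hours questions imply, the sketch below averages wait times by hour of day. The record format (hour of day, minutes waited) and the sample values are assumptions made up for illustration; they do not reflect the actual export format of any of the tools named above.

```python
from collections import defaultdict
from statistics import mean


def mean_wait_by_hour(requests):
    """Group (hour_of_day, wait_minutes) records and average the waits per hour."""
    waits_by_hour = defaultdict(list)
    for hour, wait_minutes in requests:
        waits_by_hour[hour].append(wait_minutes)
    return {hour: mean(waits) for hour, waits in sorted(waits_by_hour.items())}


if __name__ == "__main__":
    # Made-up sample records: (hour of day, minutes waited).
    sample = [(14, 10), (14, 20), (15, 5)]
    for hour, wait in mean_wait_by_hour(sample).items():
        print(f"{hour}:00  average wait {wait} min")
```

Hours with consistently long average waits would be natural candidates for additional staffing.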

Cynthia Rudin
Duke Human/ML Decision Making

In this project, students will develop interpretable machine learning tools that can help with human tasks spanning everything from decision making to art. They will study multiple tasks in interpretable machine learning using methods that can incorporate domain-based constraints and other types of domain knowledge, drawing on efficient discrete optimization techniques and Bayesian hierarchical modeling. Students will create open-source software derived from theoretical insight into specific domain problems.

The tasks for this summer are to:

  1. Create computer-generated poetry
  2. Denoise low-resolution images
  3. Provide interpretable image recognition

In the first task, students will attempt to incorporate semantic meaning into computer-generated poetry. In the second task, students will explore how to upsample a low-resolution image while maintaining a high signal-to-noise ratio. In the third task, students will use neural nets to dissect images into prototypical parts for interpretable image classification, for applications in medicine and beyond.

Ashwin Machanavajjhala
Protecting Individual Privacy using Differential Privacy

Data scientists in a number of fields, including medicine, the Internet of Things, and social science, routinely gather and analyze individual-level data. These data span every aspect of our lives and, thus, could breach our privacy by revealing medical diagnoses, sexual orientation, race, and other sensitive properties about us. Differential privacy is a principled approach to data analysis with provable guarantees of privacy for individuals.

In this summer project, student teams will build usable interfaces to sensitive datasets (medical data and location trajectories) through which users can query and analyze the data while satisfying differential privacy, using state-of-the-art differentially private algorithms.
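
For readers new to the topic, the sketch below shows the classic Laplace mechanism applied to a counting query, one of the simplest differentially private algorithms. It is a textbook illustration rather than the project's code: the function names and the toy records are assumptions, and it relies on the standard fact that a counting query has sensitivity 1, so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy the predicate.

    Adding or removing one record changes a count by at most 1 (sensitivity 1),
    so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Toy records for illustration: ages of hypothetical individuals.
    ages = [23, 35, 41, 29, 62, 57, 33]
    noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5)
    print(f"noisy count of people aged 40 or over: {noisy:.2f}")
```

Smaller values of epsilon give stronger privacy but noisier answers. This floating-point sampler is for illustration only and should not be treated as a hardened implementation.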