IDM in Statistics+CS on Data Science

The Departments of Statistical Science and Computer Science have collaboratively mapped out a data science pathway for an interdepartmental major (IDM) between the two departments. This pathway makes it easier for you to identify courses relevant to a career in data science, and to plan and optimize your program of study accordingly.

Note that this IDM is intended for students interested in data science, particularly the statistical techniques that underpin it, but not necessarily its lower-level computational aspects. Depending on your interests, other options include:

  • The Data Science Concentration within the COMPSCI BS major, which requires fewer courses on the mathematical and statistical foundations, but focuses more on the computational aspects of data science and the practical issues that arise in applying it.
  • The IDM in MATH+CS on Data Science, which covers more of the mathematical foundations of data science.

Note also that some STAT and COMPSCI courses required below need Calculus, Multivariable Calculus, Linear Algebra, and Introduction to Computer Science as prerequisites. More specifically:

  • Introduction to Computer Science: one of COMPSCI 101, 102, 116, or their AP or IB or pre-college equivalents
  • Calculus: MATH 111L and MATH 112L, or their AP or IB or pre-college equivalents
  • Multivariable Calculus: one of MATH 202, 212, or 222, taken at Duke or transferred
  • Linear Algebra: one of MATH 216, 218, or 221, taken at Duke or transferred

From Computer Science:

  • COMPSCI 201 - Data Structures and Algorithms
  • One of COMPSCI 316 - Introduction to Databases or COMPSCI 516 - Data-Intensive Systems
    • NOTE: We are in the process of changing this requirement to COMPSCI 210 or COMPSCI 250, because one of these courses is now a prerequisite for COMPSCI 316. COMPSCI 316 will instead be one of the elective choices. Students who matriculated before Fall 2022 may use COMPSCI 316, COMPSCI 210, or COMPSCI 250 for this requirement.
  • COMPSCI 330 - Design and Analysis of Algorithms
  • One of COMPSCI 371 - Elements of Machine Learning, COMPSCI 370 - Intro. Artificial Intelligence, COMPSCI 570 - Artificial Intelligence, or COMPSCI 671 - Machine Learning
    • NOTE: COMPSCI 370 was re-numbered from 270 in Fall 2019, and COMPSCI 671 from 571 in Spring 2019.
    • NOTE: COMPSCI 571 (not listed here) is cross-listed as STA 561, and can be used as an elective toward the Statistics requirements.
  • 3 Electives from the following (or others approved by the Director of Undergraduate Studies):
    • COMPSCI 216 - Everything Data
    • COMPSCI 230 - Discrete Math for CS or COMPSCI 232 - Discrete Mathematics and Proofs
    • COMPSCI 210 - Intro to Computer Systems or COMPSCI 250 - Computer Architecture
    • COMPSCI 226 - User Research Methods in Human-Centered Computing
    • COMPSCI 260 - Computational Genomics
    • COMPSCI 316 - Introduction to Databases
      • NOTE: COMPSCI 316 can be an elective if you take COMPSCI 210 or 250 in place of the COMPSCI 316 requirement above.
    • COMPSCI 321/521 - Graph-Matrix Analysis
    • COMPSCI 333 - Algorithms in the Real World (previously offered as a COMPSCI 290)
    • COMPSCI 474 - Data Science Competition
    • COMPSCI 526 - Data Science
    • COMPSCI 527 - Computer Vision
    • COMPSCI 290/590 (Topics) on the following subjects (some may not be offered regularly):
      • Algorithmic Aspects of Machine Learning 
      • Algorithms for Big Data 
      • Algorithmic Foundations of Data Science 
      • Reinforcement Learning

From Statistics:

  • STA 199 - Intro to Data Science
  • STA 210 - Regression
  • STA 230 - Probability
  • STA 250 - Mathematical Statistics or STA 432 - Stat Learning and Inference
  • STA 360 - Bayesian Modeling
  • 2 Electives from the following (or others approved by the Director of Undergraduate Studies):
    • STA 310 - Generalized Linear Models
    • STA 313 - Advanced Data Visualization
    • STA 323 - Statistical Computing
    • STA 325 - Machine Learning and Data Mining
    • STA 440 - Capstone
    • STA 444 - Spatio-Temporal Modeling
    • STA 450 - Social Network Analysis
    • STA 465 - High Dimensional Data Analysis
    • STA 561 - Machine Learning