This concentration in data science is intended for COMPSCI majors interested in studying data science in depth, with a distinctively computational focus. If you are interested in data science but not necessarily in becoming a COMPSCI major, there are other options that are less concerned with the lower-level computational aspects:
- The IDM (interdepartmental major) in Stat+CS on Data Science covers more topics in statistical data analysis.
- The IDM in Math+CS on Data Science focuses more on the mathematical foundations of data science.
Prerequisites
- One of the following introductory COMPSCI courses or equivalent:
  - COMPSCI 101L (Introduction to Computer Science)
  - COMPSCI 102 (Interdisciplinary Introduction to Computer Science)
  - COMPSCI 116 (Foundations of Data Science)
- MATH 111L (Introductory Calculus I) or equivalent
- MATH 112L (Introductory Calculus II) or equivalent
- Probability: STA 230, STA 231, or STA 240
Requirements
- COMPSCI 201 (Data Structures and Algorithms)
- COMPSCI 216 (Everything Data)
- COMPSCI 230 (Discrete Math for Computer Science); see substitutions
- COMPSCI 210D (Introduction to Computer Systems) or COMPSCI 250D (Computer Architecture)
- COMPSCI 316 (Introduction to Databases) or COMPSCI 516 (Database Systems)
- COMPSCI 330 (Design & Analysis of Algorithms)
- Two courses in MATH/STA:
  - Linear Algebra: MATH 218 or MATH 221
  - Statistics: STA 250*, STA 360**, STA 432, or MATH 342
    - *ECE 480 is an approved substitution for STA 250. [NOTE: As of Fall 2020, STA 250 is no longer offered.]
    - **You cannot count STA 360 as an elective if you are using it to satisfy this requirement.
- One of the following courses:
  - COMPSCI 370* (Introduction to Artificial Intelligence)
  - COMPSCI 371 (Elements of Machine Learning)
  - COMPSCI 570 (Artificial Intelligence)
  - COMPSCI 571 (Probabilistic Machine Learning)
  - COMPSCI 671* (Machine Learning)
  - *NOTE: COMPSCI 370 was renumbered from 270 in Fall 2019, and COMPSCI 671 was renumbered from 571 in Spring 2019.
- Three electives at the 200 level or higher. One of the three electives must be a COMPSCI course.
  - One elective (independent study possible) in COMPSCI, MATH (must be QS), STA (must be QS), or a related area approved by the Director of Undergraduate Studies.
  - Two additional courses must be drawn from either the above list (COMPSCI 370, 371, 570, 571, 671) or the list below:
    - STA 325 (Machine Learning and Data Mining)
    - STA 360 (Bayesian Inference); you cannot count STA 360 as an elective if you are using it for the Statistics requirement above
    - COMPSCI 260 (Computational Genomics)
    - COMPSCI 321/521 (Graph-Matrix Analysis)
    - COMPSCI 333 (Algorithms in the Real World)
    - COMPSCI 390 (Special Topics) on the following subjects (some may not be offered regularly):
      - Computational Approaches to Language Processing (Spring 2023)
    - COMPSCI 445/MATH 465 (Introduction to High Dimensional Data Analysis)
    - COMPSCI 474 (Data Science Competition)
    - COMPSCI 527 (Computer Vision)
    - COMPSCI 590 (Topics) on the following subjects (some may not be offered regularly):
      - Reinforcement Learning
      - Algorithmic Foundations of Data Science
      - Focus on SARS-CoV-2 and COVID-19, cross-listed as CBB 590-01 (Spring 2021)
      - Causality and Fairness for Data Analysis (Spring 2023)
      - Data Science Concepts and Applications (Spring 2023)
      - Elements of Deep Learning (Spring 2023)