Department of Statistics and Data Science

Rebecca Nugent, Department Head

Peter Freeman, Director of Undergraduate Studies

Samantha Nielsen, Associate Director of Academic Programs

Glenn Clune, Academic Program Manager

Amanda Mitchell, Academic Program Manager

Sylvie Aubin, Undergraduate Academic Advisor

Email: statadvising@andrew.cmu.edu
Location: Baker Hall 129
www.stat.cmu.edu/

Overview

Uncertainty is inescapable: randomness, measurement error, deception, and incomplete or missing information all complicate our lives. Statistics is the science and art of making predictions and decisions in the face of uncertainty. Statistical issues are central to big questions in public policy, law, medicine, industry, computing, technology, finance, and science. Indeed, the tools of statistics apply to problems in almost every area of human activity where data are collected.

Statisticians have diverse skills in computing, mathematics, decision making, designing experiments, forecasting, and interpreting and communicating analysis results. Moreover, effective statisticians actively collaborate with people in other fields and, in the process, learn about other fields. Statistics & Data Science students who master core concepts and collaboration are highly sought after in the marketplace.

Recent statistics majors at Carnegie Mellon have taken jobs at leading companies in many fields, including the National Economic Research Association, Boeing, Morgan Stanley, Deloitte, Rosetta Marketing Group, Nielsen, Procter & Gamble, Accenture, and Goldman Sachs. Others have taken research positions at the National Security Agency, the U.S. Census Bureau, and the Science and Technology Policy Institute, or worked for Teach for America. Many of our students also go on to graduate study at some of the top programs in the country, including Carnegie Mellon, Harvard, MIT, Yale, NYU, Penn, Johns Hopkins, Duke, Michigan, Chicago, Northwestern, Washington, Stanford, and California.

The Department and Faculty

The Department of Statistics & Data Science at Carnegie Mellon University is world-renowned for its contributions to statistical theory and practice. Research in the department spans the gamut from pure mathematics to the hottest frontiers of science. Current research projects are helping make fundamental advances in neuroscience, cosmology, public policy, finance, and genetics.

The faculty members are recognized around the world for their expertise and have garnered many prestigious awards and honors. (For example, three members of the faculty have been awarded the COPSS medal, the highest honor given by professional statistical societies.) At the same time, the faculty is firmly dedicated to undergraduate education. The entire faculty, junior and senior, teach courses at all levels. The faculty are accessible and are committed to involving undergraduates in research.

The Department augments all these strengths with a friendly, energetic working environment and exceptional computing resources. Talented graduate students join the department from around the world, and add a unique dimension to the department's intellectual life. Faculty, graduate students, and undergraduates interact regularly.
 

How to Take Part

There are many ways to get involved in statistics at Carnegie Mellon:

  • The Bachelor of Science in Statistics in the Dietrich College of Humanities and Social Sciences (DC) is a broad-based, flexible program that helps you master both the theory and practice of statistics. The program can be tailored to prepare you for later graduate study in statistics or to complement your interests in almost any field, including psychology, physics, biology, history, business, information systems, and computer science.
  • The Minor (or Additional Major) in Statistics is a useful complement to a (primary) major in another department or college. Almost every field of inquiry must grapple with statistical problems, and the tools of statistical theory and data analysis you will develop in the Statistics minor (or Additional Major) will give you a critical edge.
  • The Bachelor of Science in Economics and Statistics provides an interdisciplinary course of study aimed at students with a strong interest in the empirical analysis of economic data. Jointly administered by the Department of Statistics & Data Science and the Undergraduate Economics Program, the major's curriculum provides students with a solid foundation in the theories and methods of both fields. (See also Dietrich College Interdepartmental Majors later in this section.)
  • The Bachelor of Science in Statistics and Machine Learning is a program housed in the Department of Statistics & Data Science and is jointly administered with the Department of Machine Learning. In this major students take courses focused on skills in computing, mathematics, statistical theory, and the interpretation and display of complex data. The program is geared toward students interested in statistical computation, data science, and "big data" problems.
  • The Statistics Concentration and the Operations Research and Statistics Concentration in the Mathematical Sciences Major (see Department of Mathematical Sciences) are administered by the Department of Mathematical Sciences with input from the Department of Statistics & Data Science.
  • Non-majors are eligible to take most of our courses, and indeed, they are required to do so by many programs on campus. Such courses offer one way to learn more about the Department of Statistics & Data Science and the field in general.

Curriculum

Statistics consists of two intertwined threads of inquiry: statistical theory and data analysis. The former uses probability theory to build and analyze mathematical models of data in order to devise methods for making effective predictions and decisions in the face of uncertainty. The latter involves techniques for extracting insights from complicated data, designs for accurate measurement and comparison, and methods for checking the validity of theoretical assumptions. Statistical theory informs data analysis and vice versa. The Department of Statistics & Data Science curriculum follows both of these threads and helps students develop required skills.

Throughout the sections of this catalog, we describe the requirements for the Major in Statistics (the core major as well as the Mathematics and Neuroscience tracks), followed by the requirements for the Major in Economics and Statistics, the Major in Statistics and Machine Learning, and the Minor in Statistics.

Note: We recommend that you use the information provided below as a general guideline, and then schedule a meeting with a Statistics Undergraduate Advisor (statadvising@stat.cmu.edu) to discuss the requirements in more detail, and build a program that is tailored to your strengths and interests.

B.S. in Statistics

Glenn Clune, Academic Program Manager
Location: Baker Hall 129
statadvising@andrew.cmu.edu

Students in the Bachelor of Science program develop and master a wide array of skills in computing, mathematics, statistical theory, and the interpretation and display of complex data. In addition, Statistics majors gain experience in applying statistical tools to real problems in other fields and learn the nuances of interdisciplinary collaboration. The requirements for the B.S. in Statistics are detailed below and are organized by categories #1-7.

Curriculum

1. Mathematical Foundations (Prerequisites): 29–42 units

Mathematics is the language in which statistical models are described and analyzed, so some experience with basic calculus and linear algebra is an important component for anyone pursuing a program of study in Statistics & Data Science.

Calculus*

Complete one of the two following sequences of mathematics courses at Carnegie Mellon, each of which provides sufficient preparation in calculus:

Sequence 1
21-111 Calculus I (10 units)
21-112 Calculus II (10 units)
And one of the following three courses:
21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)
Sequence 2
21-120 Differential and Integral Calculus (10 units)
And one of the following three courses:
21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

Notes:

  • Passing the MSC 21-120 assessment test is an acceptable alternative to completing 21-120.

Linear Algebra**

Complete one of the following three courses:

21-240 Matrix Algebra with Applications (10 units)
21-241 Matrices and Linear Transformations (11 units)
21-242 Matrix Theory (11 units)

* It is recommended that students complete the calculus requirement during their freshman year.

**The linear algebra requirement needs to be completed before taking 36-401 Modern Regression

21-241 and 21-242 are intended only for students with a very strong mathematical background.
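
The 29–42 unit range quoted for Mathematical Foundations can be checked by enumerating the course choices listed above. The following is a minimal sketch in Python, assuming the unit values shown in this section; the variable and function names are illustrative and not part of any official system.

```python
# Unit values as listed in this section (illustrative names, not an official source).
MULTIVARIATE = {"21-256": 9, "21-259": 10, "21-268": 11}
LINEAR_ALGEBRA = {"21-240": 10, "21-241": 11, "21-242": 11}

# Units fixed by each calculus sequence before the multivariate choice.
SEQUENCE_FIXED = {
    "Sequence 1": 10 + 10,  # 21-111 and 21-112
    "Sequence 2": 10,       # 21-120
}


def foundations_range():
    """Return (min, max) total units over every allowed course combination."""
    totals = [
        fixed + multivariate + linear_algebra
        for fixed in SEQUENCE_FIXED.values()
        for multivariate in MULTIVARIATE.values()
        for linear_algebra in LINEAR_ALGEBRA.values()
    ]
    return min(totals), max(totals)


if __name__ == "__main__":
    low, high = foundations_range()
    print(f"Mathematical Foundations: {low}-{high} units")
```

The minimum corresponds to Sequence 2 with 21-256 and 21-240; the maximum corresponds to Sequence 1 with 21-268 and either 21-241 or 21-242.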

2. Data Analysis: 36-45 units

Data analysis is the art and science of extracting insight from data. The art lies in knowing which displays or techniques will reveal the most interesting features of a complicated data set. The science lies in understanding the various techniques and the assumptions on which they rely. Both aspects require practice to master.

The Beginning Data Analysis courses give a hands-on introduction to the art and science of data analysis. The courses cover similar topics but differ slightly in the examples they emphasize. 36-200 draws examples from many fields and satisfies the DC College Core Requirement in Statistical Reasoning. This course is therefore recommended for students in the college. (Note: a score of 5 on the Advanced Placement [AP] Exam in Statistics may be used to waive this requirement.) 36-220 emphasizes examples in engineering.

The Intermediate Data Analysis courses build on the principles and methods covered in the introductory course and explore specific types of data analysis methods in more depth.

The Advanced Data Analysis courses draw on students' previous experience with data analysis and understanding of statistical theory to develop advanced, more sophisticated methods. These core courses involve extensive analysis of real data with emphasis on developing the oral and writing skills needed for communicating results.

Sequence 1 (For students beginning their freshman or sophomore year)
Beginning*

Choose one of the following courses:

36-200 Reasoning with Data* (9 units)
36-220 Engineering Statistics and Quality Control (9 units)

*A score of 5 on the Advanced Placement (AP) Exam in Statistics may be used to waive this requirement. 36-220 emphasizes examples in engineering and architecture.

Note: Students who enter the program with credit for 36-235 or 36-236 should discuss options with an advisor.  

Intermediate*

Choose one of the following courses:

36-202 Methods for Statistics & Data Science** (9 units)
36-309 Experimental Design for Behavioral & Social Sciences (9 units)
36-290 Introduction to Statistical Research Methodology (9 units)
* Or an extra Advanced Data Analysis Elective
** Must be taken prior to 36-401; if not, an additional Advanced Data Analysis Elective is required
Advanced Data Analysis Elective

Choose one of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Students can also take a second 36-46x (see section #5).

and take the following two courses:

36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)
Sequence 2 (For students beginning later in their college career)
Advanced Data Analysis Electives

Choose two of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

**Not all Special Topics are offered every semester, and new Special Topics are regularly added. See section 5 for details.

and take the following two courses:

36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)

3. Probability Theory and Statistical Theory: 18 units

The theory of probability gives a mathematical description of the randomness inherent in our observations. It is the language in which statistical models are stated, so an understanding of probability is essential for the study of statistical theory. Statistical theory provides a mathematical framework for making inferences about unknown quantities from data. The theory reduces statistical problems to their essential ingredients to help devise and evaluate inferential procedures. It provides a powerful and wide-ranging set of tools for dealing with uncertainty.

To satisfy the theory requirement take the following two courses:

Take one of the following courses:
36-235 Probability and Statistical Inference I* (9 units)
36-225 Introduction to Probability Theory (9 units)
And one of the following three courses:
36-236 Probability and Statistical Inference II** (9 units)
36-226 Introduction to Statistical Inference (9 units)
36-326 Mathematical Statistics (Honors) (9 units)


*It is possible to substitute 36-218, 36-219, 36-225, or 21-325 for 36-235. 36-235 is the standard (and recommended) introduction to probability, 36-219 is tailored for engineers and computer scientists, 36-218 is a more mathematically rigorous class for Computer Science students and more mathematically advanced students (students need advisor approval to enroll), and 21-325 is a rigorous probability theory course offered by the Department of Mathematical Sciences.

**It is possible to substitute 36-226 or 36-326 (honors course) for 36-236. 36-236 is the standard (and recommended) introduction to statistical inference.

Please note that students who complete 36-235 are expected to take 36-236 to complete their theory requirements. Students who choose to take 36-225 instead will be required to take 36-226 afterward; they will not be eligible to take 36-236.

Comment:

(i) In order to meet the prerequisite requirements, a grade of at least a C is required in 36-235 (or equivalent), 36-236  (or equivalent), and 36-401.

4. Statistical Computing: 19 to 21 units

Fundamental to the practice of statistics and data science is the ability to effectively code data processing and analysis tasks. Within the domain of statistics, the use of the programming language R is ubiquitous, and thus we expose students to it throughout the curriculum (and in depth in Statistical Computing). Within the larger domain of data science, the use of the programming language Python is also ubiquitous, and thus we require all majors to gain, at a minimum, basic competency in the language by taking either Principles of Computing, or Fundamentals of Programming and Computer Science. We would advise those students who are considering receiving course credit for one of these two courses given their score on the AP Computer Science A exam to actually take one (or both) of them at Carnegie Mellon instead, as within data science as a whole Python is far more widely used than Java.

Take one of the following two courses:
15-110 Principles of Computing (10 units)
15-112 Fundamentals of Programming and Computer Science (12 units)
Complete the following course:
36-350 Statistical Computing (9 units)

5. Special Topics: 9 units

The Department of Statistics & Data Science offers advanced courses that focus on specific statistical applications or advanced statistical methods. These courses are numbered 36-46x (36-461, 36-462, etc.). The objective of these courses is to expose students to important topics in statistics and/or interesting applications that are not part of the standard undergraduate curriculum. Note that not all Special Topics are offered every semester, and new Special Topics are regularly added.

To satisfy the Special Topics requirement choose one of the 36-46x courses (which are 9 units).

Note: All 36-46x courses require 36-401 as a prerequisite or corequisite.

6. Statistical Elective: 9–12 units

Students are required to take one elective which can be within or outside the Department of Statistics & Data Science. Courses within Statistics & Data Science can be any 300 or 400 level course (that is not used to satisfy any other requirement for the statistics major). 

The following is a partial list of courses outside Statistics & Data Science that qualify as electives as they provide the intellectual infrastructure that will advance the student's understanding of statistics and its applications. Other courses may qualify as well; consult with the Statistics Undergraduate Advisor.

15-121 Introduction to Data Structures (10 units)
15-122 Principles of Imperative Computation (12 units)
10-301 Introduction to Machine Learning (Undergrad) (12 units)
10-315 Introduction to Machine Learning (SCS Majors) (12 units)
15-388 Practical Data Science (9 units)
21-127 Concepts of Mathematics (12 units)
21-260 Differential Equations (9 units)
21-292 Operations Research I (9 units)
21-301 Combinatorics (9 units)
21-355 Principles of Real Analysis I (9 units)
80-220 Philosophy of Science (9 units)
80-221 Philosophy of Social Science (9 units)
80-310 Formal Logic (9 units)
85-310 Research Methods in Cognitive Psychology (9 units)
85-320 Research Methods in Developmental Psychology (9 units)
85-340 Research Methods in Social Psychology (9 units)
88-223 Decision Analysis (12 units)
88-302 Behavioral Decision Making (9 units)

Note: Additional prerequisites are required for some of these courses. Students should carefully check the course descriptions to determine if additional prerequisites are necessary.

7. Concentration Area

Self-Defined Concentration Area (with advisor's approval): 36 units

The power of statistics, and much of the fun, is that it can be applied to answer such a wide variety of questions in so many different fields. A critical part of statistical practice is understanding the questions being asked so that appropriate methods of analysis can be used. Hence, a critical part of statistical training is to gain experience applying abstract tools to real problems.

The Concentration Area is a set of four related courses outside of Statistics & Data Science that prepares the student to deal with statistical aspects of problems that arise in another field. These courses are usually drawn from a single discipline of interest to the student and must be approved by the Statistics Undergraduate Advisor. While these courses are not in Statistics & Data Science, the concentration area must complement the overall degree.

For example, students intending to pursue careers in the health or biomedical sciences could take further courses in biology or chemistry, or students intending to pursue graduate work in statistics could take further courses in advanced mathematics.

The concentration area can be fulfilled with a minor or additional major, but not all minors and additional majors fulfill this requirement. Please make sure to consult the Undergraduate Statistics Advisor prior to pursuing courses for the concentration area. Once the concentration area is approved, any changes made to the previously agreed upon coursework require re-approval by the Undergraduate Advisor.

Concentration Approval Process

  • Submit the below materials to the Undergraduate Statistics Advisor
    • List of possible coursework to fulfill the concentration*
    • 150-200 word essay describing how the proposed courses complement the B.S. in Statistics degree.

* These courses can be amended later but must be re-approved by the Statistics Undergraduate Advisor if amended.

* Note: The concentration/track requirement applies only to students whose primary major is statistics and who have no other additional major or minor. The requirement does not apply to students who pursue an additional major in statistics.

Total number of units for the major: 156-183* units
Total number of units for the degree: 360 units

*Note: This number can vary depending on the courses chosen for the concentration area that a student takes. Speak with an academic advisor for more details.
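
The 156-183 range follows from summing the unit ranges listed for categories 1 through 7. The following is a minimal sketch in Python of that arithmetic, assuming the per-category ranges quoted in this section and a 36-unit concentration area; the dictionary name and layout are illustrative only.

```python
# Unit ranges (min, max) per requirement category, as listed in this section.
# CATEGORY_UNITS is an illustrative name, not an official data source.
CATEGORY_UNITS = {
    "1. Mathematical Foundations": (29, 42),
    "2. Data Analysis": (36, 45),
    "3. Probability Theory and Statistical Theory": (18, 18),
    "4. Statistical Computing": (19, 21),
    "5. Special Topics": (9, 9),
    "6. Statistical Elective": (9, 12),
    "7. Concentration Area": (36, 36),
}


def total_range(units):
    """Return (minimum, maximum) total units across all categories."""
    low = sum(lo for lo, _ in units.values())
    high = sum(hi for _, hi in units.values())
    return low, high


if __name__ == "__main__":
    low, high = total_range(CATEGORY_UNITS)
    print(f"Total units for the major: {low}-{high}")
```

A concentration area whose four courses total something other than 36 units would shift both ends of the range accordingly.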

Recommendations

Students in the Dietrich College of Humanities and Social Sciences who wish to major or minor in Statistics are advised to complete both the calculus requirement (one Mathematical Foundations calculus sequence) and the Beginning Data Analysis course 36-200 by the end of their freshman year.

The linear algebra requirement is a prerequisite for the course 36-401. It is therefore essential that students complete this requirement by their junior year at the latest.

Recommendations for Prospective Ph.D. Students

Students interested in pursuing a Ph.D. in statistics or biostatistics (or related programs) after completing their undergraduate degree are strongly encouraged to pursue the Mathematical Statistics Track or to take additional Mathematics courses. Although 21-240 Matrix Algebra with Applications is recommended for Statistics majors, students interested in Ph.D. programs should consider taking 21-241 Matrices and Linear Transformations or 21-242 Matrix Theory instead. Additional courses to consider are 21-228 Discrete Mathematics, 21-341 Linear Algebra, 21-355 Principles of Real Analysis I, and 21-356 Principles of Real Analysis II.

Additional Major in Statistics

Students who elect the B.S. in Statistics as a second or third major must fulfill all Statistics degree requirements except for the Concentration Area requirement.  Majors in many other programs would naturally complement a statistics major, including Tepper's undergraduate business program, Social and Decision Sciences, Policy and Management, and Psychology.

With respect to double-counting courses, it is departmental policy that students must have at least five statistics courses that do not count for their primary major. If students do not have at least five, they will need to take additional advanced data analysis electives.

Students are advised to begin planning their curriculum (with appropriate advisors) as soon as possible. This is particularly true if the other major has a complex set of requirements and prerequisites or when many of the other major's requirements overlap with the requirements for the B.S. in Statistics.

Substitutions and Waivers

Many departments require Statistics & Data Science courses as part of their Major or Minor programs. Students seeking transfer credit for those requirements from substitute courses (at Carnegie Mellon or elsewhere) should seek permission from their advisor in the department setting the requirement. The final authority in such decisions rests there. The Department of Statistics & Data Science does not provide approval or permission for substitution or waiver of another department's requirements.

If a waiver or substitution is made in the home department, it is not automatically approved in the Department of Statistics & Data Science. In many of these cases, the student will need to take additional courses to satisfy major requirements. Students should discuss this with a Statistics advisor when deciding whether to add an additional major in Statistics.

Research

The Statistics & Data Science program encourages students to gain research experience. Opportunities within the department include Summer Undergraduate Research Apprenticeships (SURA), run in association with the university's Office of Undergraduate Research and Scholar Development, and the departmental capstone courses 36-490 Undergraduate Research, 36-493 Sports Analytics Capstone, or 36-497 Corporate Capstone Project. (Note that these courses require an application.) Additionally, students can pursue independent study. For those students who maintain a quality point average of 3.25 overall or above, there is also the Dietrich College Senior Honors Program.

The faculty in the Statistics & Data Science department largely work within the domains of statistical theory and methodological development, areas that require advanced mathematical training. Thus we encourage students to search broadly for research opportunities: faculty, post-doctoral researchers, and graduate students in many departments throughout the university have data to analyze and would welcome the help of undergraduate statistics students.

Sample Programs

The following sample programs illustrate two (of many) ways to satisfy the requirements for the B.S. in Statistics. However, keep in mind that the program is flexible enough to support many other possible schedules and to emphasize a wide variety of interests.

The first schedule uses calculus sequence 1.

The second schedule is an example of the case when a student enters the program through 36-235 and 36-236 (and therefore skips the beginning data analysis sequence). This schedule has more emphasis on statistical theory and probability.

Schedule 1

First-Year, Fall
  • 36-200 Reasoning with Data
  • 21-111 Calculus I

First-Year, Spring
  • 36-202 Methods for Statistics & Data Science
  • 21-112 Calculus II
  • One of the following two courses: 15-110 Principles of Computing or 15-112 Fundamentals of Programming and Computer Science

Second-Year, Fall
  • 36-235 Probability and Statistical Inference I
  • 21-256 Multivariate Analysis
  • Course toward concentration

Second-Year, Spring
  • 36-236 Probability and Statistical Inference II
  • 36-350 Statistical Computing
  • 21-240 Matrix Algebra with Applications

Third-Year, Fall
  • 36-401 Modern Regression
  • 36-3xx or 36-4xx Advanced Data Analysis Elective
  • Course toward concentration

Third-Year, Spring
  • 36-402 Advanced Methods for Data Analysis
  • 36-46x Special Topics course
  • Course toward concentration

Fourth-Year, Fall
  • Course toward concentration

Fourth-Year, Spring
  • Course toward concentration

Schedule 2

First-Year, Fall
  • 21-120 Differential and Integral Calculus
  • 36-200 Reasoning with Data

First-Year, Spring
  • 21-256 Multivariate Analysis
  • One of the following two courses: 15-110 Principles of Computing or 15-112 Fundamentals of Programming and Computer Science

Second-Year, Fall
  • 36-235 Probability and Statistical Inference I

Second-Year, Spring
  • 36-236 Probability and Statistical Inference II
  • 21-240 Matrix Algebra with Applications

Third-Year, Fall
  • 36-350 Statistical Computing
  • 36-401 Modern Regression
  • 36-3xx or 36-4xx Advanced Data Analysis Elective
  • Course toward concentration

Third-Year, Spring
  • 36-402 Advanced Methods for Data Analysis
  • Course toward concentration

Fourth-Year, Fall
  • 36-46x Special Topics
  • Course toward concentration

Fourth-Year, Spring
  • Course toward concentration
  • 36-3xx or 36-4xx Advanced Data Analysis Elective

B.S. in Statistics (Mathematical Sciences Track)

Glenn Clune, Academic Program Manager
Location: Baker Hall 129
statadvising@andrew.cmu.edu

Students in the Bachelor of Science program develop and master a wide array of skills in computing, mathematics, statistical theory, and the interpretation and display of complex data. In addition, Statistics majors gain experience in applying statistical tools to real problems in other fields and learn the nuances of interdisciplinary collaboration. The requirements for the B.S. in Statistics (Mathematical Sciences Track) are detailed below and are organized by categories #1-#7.

Curriculum

1. Mathematical Foundations (Prerequisites): 29–42 units

Mathematics is the language in which statistical models are described and analyzed, so some experience with basic calculus and linear algebra is an important component for anyone pursuing a program of study in Statistics & Data Science.

Calculus*

Complete one of the two following sequences of mathematics courses at Carnegie Mellon, each of which provides sufficient preparation in calculus:

Sequence 1
21-111 Calculus I (10 units)
21-112 Calculus II (10 units)
And one of the following three courses:
21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)
Sequence 2
21-120 Differential and Integral Calculus (10 units)
And one of the following three courses:
21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

Notes:

  • Passing the MSC 21-120 assessment test is an acceptable alternative to completing 21-120.

Linear Algebra**

Complete one of the following three courses:

21-240 Matrix Algebra with Applications (10 units)
21-241 Matrices and Linear Transformations (11 units)
21-242 Matrix Theory (11 units)

* It is recommended that students complete the calculus requirement during their freshman year.

**The linear algebra requirement needs to be completed before taking 36-401 Modern Regression

21-241 and 21-242 are intended only for students with a very strong mathematical background.

2. Data Analysis: 36-45 units

Data analysis is the art and science of extracting insight from data. The art lies in knowing which displays or techniques will reveal the most interesting features of a complicated data set. The science lies in understanding the various techniques and the assumptions on which they rely. Both aspects require practice to master.

The Beginning Data Analysis courses give a hands-on introduction to the art and science of data analysis. The courses cover similar topics but differ slightly in the examples they emphasize. 36-200 draws examples from many fields and satisfies the DC College Core Requirement in Statistical Reasoning. This course is therefore recommended for students in the college. (Note: a score of 5 on the Advanced Placement [AP] Exam in Statistics may be used to waive this requirement.) 36-220 emphasizes examples in engineering.

The Intermediate Data Analysis courses build on the principles and methods covered in the introductory course and explore specific types of data analysis methods in more depth.

The Advanced Data Analysis courses draw on students' previous experience with data analysis and understanding of statistical theory to develop advanced, more sophisticated methods. These core courses involve extensive analysis of real data with emphasis on developing the oral and writing skills needed for communicating results.

Sequence 1 (For students beginning their freshman or sophomore year)
Beginning*

Choose one of the following courses:

36-200 Reasoning with Data* (9 units)
36-220 Engineering Statistics and Quality Control (9 units)

*A score of 5 on the Advanced Placement (AP) Exam in Statistics may be used to waive this requirement. 36-220 emphasizes examples in engineering and architecture.

Note: Students who enter the program with 36-235 or 36-236 should discuss options with an advisor.  

Intermediate*

Choose one of the following courses:

36-202 Methods for Statistics & Data Science** (9 units)
36-309 Experimental Design for Behavioral & Social Sciences (9 units)
36-290 Introduction to Statistical Research Methodology (9 units)
* Or an extra Advanced Data Analysis Elective
** Must be taken prior to 36-401; if not, an additional Advanced Data Analysis Elective is required
Advanced Data Analysis Elective

Choose one of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Students can also take a second 36-46x (see section #5).

and take the following two courses:

36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)

Sequence 2 (For students beginning later in their college career)
Advanced

Choose two of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Note: Not all Special Topics are offered every semester, and new Special Topics are regularly added. See section 5 for details.

and take the following two courses:

36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)

3. Probability Theory and Statistical Theory: 18 units

The theory of probability gives a mathematical description of the randomness inherent in our observations. It is the language in which statistical models are stated, so an understanding of probability is essential for the study of statistical theory. Statistical theory provides a mathematical framework for making inferences about unknown quantities from data. The theory reduces statistical problems to their essential ingredients to help devise and evaluate inferential procedures. It provides a powerful and wide-ranging set of tools for dealing with uncertainty.

To satisfy the theory requirement, take two courses, one from each of the following groups:

Take one of the following courses:
36-235 Probability and Statistical Inference I * (9 units)
36-225 Introduction to Probability Theory (9 units)
And one of the following three courses:
36-226 Introduction to Statistical Inference (9 units)
36-236 Probability and Statistical Inference II ** (9 units)
36-326 Mathematical Statistics (Honors) (9 units)


*It is possible to substitute 36-218, 36-219, 36-225, or 21-325 for 36-235. 36-235 is the standard (and recommended) introduction to probability, 36-219 is tailored for engineers and computer scientists, 36-218 is a more mathematically rigorous class for Computer Science students and more mathematically advanced students (students need prior approval to enroll), and 21-325 is a rigorous probability theory course offered by the Department of Mathematics.

**It is possible to substitute 36-226 or 36-326 (honors course) for 36-236. 36-236 is the standard (and recommended) introduction to statistical inference.

Please note that students who complete 36-235 are expected to take 36-236 to complete their theory requirements. Students who choose to take 36-225 will be required to take 36-226 afterward; they will not be eligible to take 36-236.

Comment:

(i) In order to meet the prerequisite requirements, a grade of at least a C is required in 36-235 (or equivalent), 36-236 (or equivalent), and 36-401.

4. Statistical Computing: 19 to 21 units

Fundamental to the practice of statistics and data science is the ability to effectively code data processing and analysis tasks. Within the domain of statistics, the use of the programming language R is ubiquitous, and thus we expose students to it throughout the curriculum (and in depth in Statistical Computing). Within the larger domain of data science, the use of the programming language Python is also ubiquitous, and thus we require all majors to gain, at a minimum, basic competency in the language by taking either Principles of Computing, or Fundamentals of Programming and Computer Science. We would advise those students who are considering receiving course credit for one of these two courses given their score on the AP Computer Science A exam to actually take one (or both) of them at Carnegie Mellon instead, as within data science as a whole Python is far more widely used than Java. 

Take one of the following two courses:
15-110 Principles of Computing (10 units)
15-112 Fundamentals of Programming and Computer Science (12 units)
Complete the following course:
36-350 Statistical Computing (9 units)

5. Special Topics: 9 units

The Department of Statistics & Data Science offers advanced courses that focus on specific statistical applications or advanced statistical methods. These courses are numbered 36-46x (36-461, 36-462, etc.). The objective of these courses is to expose students to important topics in statistics and/or interesting applications that are not part of the standard undergraduate curriculum. Note that not all Special Topics are offered every semester, and new Special Topics are regularly added.

To satisfy the Special Topics requirement, choose one of the 36-46x courses (each of which is 9 units).

Note: All 36-46x courses require 36-401 as a prerequisite or corequisite.

6. Statistical Elective: 9–12 units

Students are required to take one elective which can be within or outside the Department of Statistics & Data Science. Courses within Statistics & Data Science can be any 300 or 400 level course (that is not used to satisfy any other requirement for the statistics major). 

The following is a partial list of courses outside Statistics & Data Science that qualify as electives as they provide the intellectual infrastructure that will advance the student's understanding of statistics and its applications. Other courses may qualify as well; consult with the Statistics Undergraduate Advisor.

15-121 Introduction to Data Structures (10 units)
15-122 Principles of Imperative Computation (12 units)
10-301 Introduction to Machine Learning (Undergrad) (12 units)
10-315 Introduction to Machine Learning (SCS Majors) (12 units)
15-388 Practical Data Science (9 units)
21-260 Differential Equations (9 units)
21-292 Operations Research I (9 units)
21-301 Combinatorics (9 units)
21-355 Principles of Real Analysis I (9 units)
80-220 Philosophy of Science (9 units)
80-221 Philosophy of Social Science (9 units)
80-310 Formal Logic (9 units)
85-310 Research Methods in Cognitive Psychology (9 units)
85-320 Research Methods in Developmental Psychology (9 units)
85-340 Research Methods in Social Psychology (9 units)
88-223 Decision Analysis (12 units)
88-302 Behavioral Decision Making (9 units)

Note: Additional prerequisites are required for some of these courses. Students should carefully check the course descriptions to determine if additional prerequisites are necessary.

Mathematical Statistics Track: 46–52 units
21-127 Concepts of Mathematics (12 units)
21-355 Principles of Real Analysis I (9 units)
36-410 Introduction to Probability Modeling (9 units)

And two of the following:

21-228 Discrete Mathematics (9 units)
21-257 Models and Methods for Optimization (9 units)
or 21-292 Operations Research I (9 units)
21-301 Combinatorics (9 units)
21-344 Numerical Linear Algebra (9 units)
21-356 Principles of Real Analysis II (9 units)
21-373 Algebraic Structures (9 units)
36-700 Probability and Mathematical Statistics (12 units)

Total number of units for the major: 167-199 Units*
Total number of units for the degree: 360 Units

*Note: This number can vary depending on the courses chosen for the concentration area that a student takes. Speak with an academic advisor for more details.

Recommendations

Students in the Dietrich College of Humanities and Social Sciences who wish to major or minor in Statistics are advised to complete both the calculus requirement (one Mathematical Foundations calculus sequence) and the Beginning Data Analysis course 36-200 by the end of their freshman year.

The linear algebra requirement is a prerequisite for the course 36-401. It is therefore essential that students complete this requirement by their junior year at the latest.

Recommendations for Prospective Ph.D. Students

Students interested in pursuing a Ph.D. in Statistics or Biostatistics (or related programs) after completing their undergraduate degree are strongly encouraged to pursue the Mathematical Statistics Track.

Additional Major in Statistics (Mathematical Science Track)

Students who elect the B.S. in Statistics (Mathematical Science Track) as an additional major must fulfill all Statistics (Mathematical Science Track) degree requirements. With respect to double-counting courses, it is departmental policy that students must have at least five statistics courses that do not count for their primary major. If students do not have at least five, they typically take additional advanced data analysis electives.

Students are advised to begin planning their curriculum (with appropriate advisors) as soon as possible. This is particularly true if the other major has a complex set of requirements and prerequisites or when many of the other major's requirements overlap with the requirements for a B.S. in Statistics (Mathematical Science Track).

Substitutions and Waivers

Many departments require Statistics & Data Science courses as part of their Major or Minor programs. Students seeking transfer credit for those requirements from substitute courses (at Carnegie Mellon or elsewhere) should seek permission from their advisor in the department setting the requirement. The final authority in such decisions rests there. The Department of Statistics & Data Science does not provide approval or permission for substitution or waiver of another department's requirements.

If a waiver or substitution is made in the home department, it is not automatically approved in the Department of Statistics & Data Science. In many of these cases, the student will need to take additional courses to satisfy major requirements. Students should discuss this with a Statistics advisor when deciding whether to add an additional major in Statistics.

Research

The Statistics & Data Science program encourages students to gain research experience. Opportunities within the department include Summer Undergraduate Research Apprenticeships (SURA), run in association with the university's Office of Undergraduate Research and Scholar Development, and the departmental capstone courses 36-490 Undergraduate Research, 36-493 Sports Analytics Capstone, or 36-497 Corporate Capstone Project. (Note that these courses require an application.) Additionally, students can pursue independent study. For those students who maintain a quality point average of 3.25 overall or above, there is also the Dietrich College Senior Honors Program.

The faculty in the Statistics & Data Science department largely work within the domains of statistical theory and methodological development, areas that require advanced mathematical training. Thus we encourage students to search broadly for research opportunities: faculty, post-doctoral researchers, and graduate students in many departments throughout the university have data to analyze and would welcome the help of undergraduate statistics students.

Sample Programs

The following sample programs illustrate two (of many) ways to satisfy the requirements for the B.S. in Statistics (Mathematical Sciences Track). However, keep in mind that the program is flexible enough to support many other possible schedules and to emphasize a wide variety of interests.

The first schedule uses calculus sequence 1.

The second schedule is an example of the case when a student enters the program through 36-235 and 36-236 (and therefore skips the intermediate data analysis course). This schedule has more emphasis on statistical theory and probability.

Schedule 1

First-Year Fall
36-200 Reasoning with Data
21-111 Calculus I

First-Year Spring
36-202 Methods for Statistics & Data Science
21-256 Multivariate Analysis
21-112 Calculus II

Second-Year Fall
36-235 Probability and Statistical Inference I
21-127 Concepts of Mathematics
One of the two following courses: 15-110 Principles of Computing or 15-112 Fundamentals of Programming and Computer Science

Second-Year Spring
36-236 Probability and Statistical Inference II
36-350 Statistical Computing
21-240 Matrix Algebra with Applications

Third-Year Fall
36-401 Modern Regression
Math Track Elective

Third-Year Spring
36-402 Advanced Methods for Data Analysis
36-3xx or 36-4xx Advanced Data Analysis Elective

Fourth-Year Fall
36-46x Special Topics
21-355 Principles of Real Analysis I

Fourth-Year Spring
36-410 Introduction to Probability Modeling
Math Track Elective

Schedule 2

First-Year Fall
36-200 Reasoning with Data
21-120 Differential and Integral Calculus

First-Year Spring
21-256 Multivariate Analysis
One of the two following courses: 15-110 Principles of Computing or 15-112 Fundamentals of Programming and Computer Science

Second-Year Fall
36-235 Probability and Statistical Inference I
21-127 Concepts of Mathematics

Second-Year Spring
36-236 Probability and Statistical Inference II
21-241 Matrices and Linear Transformations
36-3xx or 36-4xx Advanced Data Analysis Elective

Third-Year Fall
36-350 Statistical Computing
36-401 Modern Regression
Math Track Elective

Third-Year Spring
36-402 Advanced Methods for Data Analysis
36-3xx or 36-4xx Advanced Data Analysis Elective

Fourth-Year Fall
36-46x Special Topics
21-355 Principles of Real Analysis I

Fourth-Year Spring
36-410 Introduction to Probability Modeling
Math Track Elective

B.S. in Statistics (Statistics and Neuroscience Track)

Glenn Clune, Academic Program Manager
Location: Baker Hall 129
statadvising@andrew.cmu.edu

Students in the Bachelor of Science program develop and master a wide array of skills in computing, mathematics, statistical theory, and the interpretation and display of complex data. In addition, Statistics majors gain experience in applying statistical tools to real problems in other fields and learn the nuances of interdisciplinary collaboration. The requirements for the B.S. in Statistics (Neuroscience Track) are detailed below and are organized by categories #1-#7.

Curriculum

1. Mathematical Foundations (Prerequisites): 29–42 units

Mathematics is the language in which statistical models are described and analyzed, so some experience with basic calculus and linear algebra is an important component for anyone pursuing a program of study in Statistics & Data Science.

Calculus*

Complete one of the two following sequences of mathematics courses at Carnegie Mellon, each of which provides sufficient preparation in calculus:

Sequence 1
21-111 Calculus I (10 units)
21-112 Calculus II (10 units)
And one of the following three courses:
21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

Sequence 2
21-120 Differential and Integral Calculus (10 units)
And one of the following three courses:
21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

Notes:

  • Passing the MSC 21-120 assessment test is an acceptable alternative to completing 21-120.

Linear Algebra**

Complete one of the following three courses:

21-240 Matrix Algebra with Applications (10 units)
21-241 Matrices and Linear Transformations (11 units)
21-242 Matrix Theory (11 units)

* It is recommended that students complete the calculus requirement during their freshman year.

**The linear algebra requirement must be completed before taking 36-401 Modern Regression.

21-241 and 21-242 are intended only for students with a very strong mathematical background.

2. Data Analysis: 36–45 units

Data analysis is the art and science of extracting insight from data. The art lies in knowing which displays or techniques will reveal the most interesting features of a complicated data set. The science lies in understanding the various techniques and the assumptions on which they rely. Both aspects require practice to master.

The Beginning Data Analysis courses give a hands-on introduction to the art and science of data analysis. The courses cover similar topics but differ slightly in the examples they emphasize. 36-200 draws examples from many fields and satisfies the DC College Core Requirement in Statistical Reasoning. This course is therefore recommended for students in the college. (Note: a score of 5 on the Advanced Placement [AP] Exam in Statistics may be used to waive this requirement.) 36-220 emphasizes examples in engineering and architecture.

The Intermediate Data Analysis courses build on the principles and methods covered in the introductory course and explore specific types of data analysis methods in more depth.

The Advanced Data Analysis courses draw on students' previous experience with data analysis and understanding of statistical theory to develop advanced, more sophisticated methods. These core courses involve extensive analysis of real data with emphasis on developing the oral and writing skills needed for communicating results.

Sequence 1 (For students beginning their freshman or sophomore year)
Beginning*

Choose one of the following courses:

36-200 Reasoning with Data * (9 units)
36-220 Engineering Statistics and Quality Control (9 units)

*A score of 5 on the Advanced Placement (AP) Exam in Statistics may be used to waive this requirement. 36-220 emphasizes examples in engineering and architecture.

Note: Students who enter the program with 36-235 or 36-236 should discuss options with an advisor.  

Intermediate*

Choose one of the following courses:

36-202 Methods for Statistics & Data Science ** (9 units)
36-309 Experimental Design for Behavioral & Social Sciences (9 units)
36-290 Introduction to Statistical Research Methodology (9 units)

*Or an extra Advanced Data Analysis Elective
**Must be taken prior to 36-401; if not, an additional Advanced Data Analysis Elective is required.

Advanced Data Analysis Electives

Choose one of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Students can also take a second 36-46x (see section #5).

and take the following two courses:

36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)

Sequence 2 (For students beginning later in their college career)
Advanced Data Analysis Electives

Choose two of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Note: Not all Special Topics are offered every semester, and new Special Topics are regularly added. See section 5 for details.

and take the following two courses:

36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)

3. Probability Theory and Statistical Theory: 18 units

The theory of probability gives a mathematical description of the randomness inherent in our observations. It is the language in which statistical models are stated, so an understanding of probability is essential for the study of statistical theory. Statistical theory provides a mathematical framework for making inferences about unknown quantities from data. The theory reduces statistical problems to their essential ingredients to help devise and evaluate inferential procedures. It provides a powerful and wide-ranging set of tools for dealing with uncertainty.

To satisfy the theory requirement, take two courses, one from each of the following groups:

Take one of the following courses:
36-235 Probability and Statistical Inference I * (9 units)
36-225 Introduction to Probability Theory (9 units)
And one of the following three courses:
36-226 Introduction to Statistical Inference (9 units)
36-236 Probability and Statistical Inference II ** (9 units)
36-326 Mathematical Statistics (Honors) (9 units)


*It is possible to substitute 36-218, 36-219, 36-225, or 21-325 for 36-235. 36-235 is the standard (and recommended) introduction to probability, 36-219 is tailored for engineers and computer scientists, 36-218 is a more mathematically rigorous class for Computer Science students and more mathematically advanced students (students need advisor approval to enroll), and 21-325 is a rigorous probability theory course offered by the Department of Mathematics.

**It is possible to substitute 36-226 or 36-326 (honors course) in place of 36-236. 36-236 is the standard (and recommended) introduction to statistical inference.

Please note that students who complete 36-235 are expected to take 36-236 to complete their theory requirements. Students who choose to take 36-225 instead will be required to take 36-226 afterward; they will not be eligible to take 36-236.

Comment:

(i) In order to meet the prerequisite requirements, a grade of at least a C is required in 36-235 (or equivalent), 36-236 (or equivalent), and 36-401.

4. Statistical Computing: 19 to 21 units

Fundamental to the practice of statistics and data science is the ability to effectively code data processing and analysis tasks. Within the domain of statistics, the use of the programming language R is ubiquitous, and thus we expose students to it throughout the curriculum (and in depth in Statistical Computing). Within the larger domain of data science, the use of the programming language Python is also ubiquitous, and thus we require all majors to gain, at a minimum, basic competency in the language by taking either Principles of Computing, or Fundamentals of Programming and Computer Science. We would advise those students who are considering receiving course credit for one of these two courses given their score on the AP Computer Science A exam to actually take one (or both) of them at Carnegie Mellon instead, as within data science as a whole Python is far more widely used than Java. 

Take one of the two following courses:
15-110 Principles of Computing (10 units)
15-112 Fundamentals of Programming and Computer Science (12 units)
Complete the following course:
36-350 Statistical Computing (9 units)

5. Special Topics: 9 units

The Department of Statistics & Data Science offers advanced courses that focus on specific statistical applications or advanced statistical methods. These courses are numbered 36-46x (36-461, 36-462, etc.). The objective of these courses is to expose students to important topics in statistics and/or interesting applications that are not part of the standard undergraduate curriculum. Note that not all Special Topics are offered every semester, and new Special Topics are regularly added.

To satisfy the Special Topics requirement, choose one of the 36-46x courses (each of which is 9 units).

Note: All 36-46x courses require 36-401 as a prerequisite or corequisite.

6. Statistical Elective: 9–12 units

Students are required to take one elective which can be within or outside the Department of Statistics & Data Science. Courses within Statistics & Data Science can be any 300 or 400 level course (that is not used to satisfy any other requirement for the statistics major). 

The following is a partial list of courses outside Statistics & Data Science that qualify as electives as they provide the intellectual infrastructure that will advance the student's understanding of statistics and its applications. Other courses may qualify as well; consult with the Statistics Undergraduate Advisor.

15-121 Introduction to Data Structures (10 units)
15-122 Principles of Imperative Computation (12 units)
10-301 Introduction to Machine Learning (Undergrad) (12 units)
10-315 Introduction to Machine Learning (SCS Majors) (12 units)
15-388 Practical Data Science (9 units)
21-127 Concepts of Mathematics (12 units)
21-260 Differential Equations (9 units)
21-292 Operations Research I (9 units)
21-301 Combinatorics (9 units)
21-355 Principles of Real Analysis I (9 units)
80-220 Philosophy of Science (9 units)
80-221 Philosophy of Social Science (9 units)
80-310 Formal Logic (9 units)
85-310 Research Methods in Cognitive Psychology (9 units)
85-320 Research Methods in Developmental Psychology (9 units)
85-340 Research Methods in Social Psychology (9 units)
88-223 Decision Analysis (12 units)
88-302 Behavioral Decision Making (9 units)

Statistics and Neuroscience Track: 45–54 units
85-211 Cognitive Psychology (9 units)
85-219 Foundations of Brain and Behavior (9 units)

And three electives (at least one from Methodology and Analysis and at least one within the Neuroscience Background listed below):

Methodology and Analysis
10-301 Introduction to Machine Learning (Undergrad) (12 units)
18-290 Signals and Systems (12 units)
36-700 Probability and Mathematical Statistics (12 units)
42/86-631 Neural Data Analysis (12 units)
85-314 Cognitive Neuroscience Research Methods (9 units)

Neuroscience Background
03-362 Cellular Neuroscience (9 units)
03-363 Systems Neuroscience (9 units)
15-386 Neural Computation (9 units)
85-414 Cognitive Neuropsychology (9 units)
85-419 Introduction to Parallel Distributed Processing (9 units)

Total Number of Units for the Major: 165-201* Units
Total Number of Units for the Degree: 360 Units

*Note: This number can vary depending on the courses chosen for the concentration area that a student takes. Speak with an academic advisor for more details.

Recommendations

Students in the Dietrich College of Humanities and Social Sciences who wish to major or minor in Statistics are advised to complete both the calculus requirement (one Mathematical Foundations calculus sequence) and the Beginning Data Analysis course 36-200 by the end of their freshman year.

The linear algebra requirement is a prerequisite for the course 36-401. It is therefore essential that students complete this requirement by their junior year at the latest.

Recommendations for Prospective Ph.D. Students

Students interested in pursuing a Ph.D. in Statistics or Biostatistics (or related programs) after completing their undergraduate degree are strongly encouraged to pursue the Mathematical Statistics Track or to take additional Mathematics courses. Although 21-240 Matrix Algebra with Applications is recommended for Statistics majors, students interested in Ph.D. programs should consider taking 21-241 Matrices and Linear Transformations or 21-242 Matrix Theory instead. Additional courses to consider are 21-228 Discrete Mathematics, 21-341 Linear Algebra, 21-355 Principles of Real Analysis I, and 21-356 Principles of Real Analysis II.

Additional Major in Statistics (Neuroscience Track)

Students who elect the B.S. in Statistics (Neuroscience Track) as an additional major must fulfill all Statistics (Neuroscience Track) degree requirements. With respect to double-counting courses, it is departmental policy that students must have at least five statistics courses that do not count for their primary major. If students do not have at least five, they take additional advanced data analysis electives.

Students are advised to begin planning their curriculum (with appropriate advisors) as soon as possible. This is particularly true if the other major has a complex set of requirements and prerequisites or when many of the other major's requirements overlap with the requirements for the B.S. in Statistics (Neuroscience Track).

Substitutions and Waivers

Many departments require Statistics & Data Science courses as part of their Major or Minor programs. Students seeking transfer credit for those requirements from substitute courses (at Carnegie Mellon or elsewhere) should seek permission from their advisor in the department setting the requirement. The final authority in such decisions rests there. The Department of Statistics & Data Science does not provide approval or permission for substitution or waiver of another department's requirements.

If a waiver or substitution is made in the home department, it is not automatically approved in the Department of Statistics & Data Science. In many of these cases, the student will need to take additional courses to satisfy major requirements. Students should discuss this with a Statistics advisor when deciding whether to add an additional major in Statistics.

Research

The Statistics & Data Science program encourages students to gain research experience. Opportunities within the department include Summer Undergraduate Research Apprenticeships (SURA), run in association with the university's Office of Undergraduate Research and Scholar Development, and the departmental capstone courses 36-490 Undergraduate Research, 36-493 Sports Analytics Capstone, or 36-497 Corporate Capstone Project. (Note that these courses require an application.) Additionally, students can pursue independent study. For those students who maintain a quality point average of 3.25 overall or above, there is also the Dietrich College Senior Honors Program.

The faculty in the Statistics & Data Science department largely work within the domains of statistical theory and methodological development, areas that require advanced mathematical training. Thus we encourage students to search broadly for research opportunities: faculty, post-doctoral researchers, and graduate students in many departments throughout the university have data to analyze and would welcome the help of undergraduate statistics students.

Sample Programs

The following sample programs illustrate two (of many) ways to satisfy the requirements for the B.S. in Statistics (Neuroscience Track). However, keep in mind that the program is flexible enough to support many other possible schedules and to emphasize a wide variety of interests.

The first schedule uses calculus sequence 2.

The second schedule is an example of the case when a student enters the program through 36-235 and 36-236 (and therefore skips the intermediate data analysis course). This schedule has more emphasis on statistical theory and probability.

schedule 1

First-YearSecond-Year
FallSpringFallSpring
36-200 Reasoning with Data36-202 Methods for Statistics & Data Science36-235 Probability and Statistical Inference I36-236 Probability and Statistical Inference II
21-120 Differential and Integral Calculus21-256 Multivariate Analysis85-219 Foundations of Brain and Behavior36-350 Statistical Computing
85-211 Cognitive PsychologyAnd one of the following two courses:----- 21-240 Matrix Algebra with Applications
----- 15-110 Principles of Computing----- -----
15-112 Fundamentals of Programming and Computer Science

Third-Year | Fourth-Year
Fall | Spring | Fall | Spring
36-401 Modern Regression | 36-402 Advanced Methods for Data Analysis | 36-46x Special Topics | 36-3xx or 36-4xx Advanced Data Analysis Elective
Neuroscience Track Elective | Neuroscience Track Elective | Neuroscience Track Elective | -----
----- | ----- | ----- | -----
----- | ----- | ----- | -----

Schedule 2

First-Year | Second-Year
Fall | Spring | Fall | Spring
36-200 Reasoning with Data | 36-202 Methods for Statistics & Data Science | 21-256 Multivariate Analysis | 21-240 Matrix Algebra with Applications
21-111 Calculus I | 21-112 Calculus II | 85-211 Cognitive Psychology | 36-3xx or 36-4xx Advanced Data Analysis Elective
----- | Take one of the following two courses: | ----- | -----
----- | 15-110 Principles of Computing or 15-112 Fundamentals of Programming and Computer Science | ----- | -----

Third-Year | Fourth-Year
Fall | Spring | Fall | Spring
36-235 Probability and Statistical Inference I | 36-236 Probability and Statistical Inference II | 36-401 Modern Regression | 36-402 Advanced Methods for Data Analysis
85-219 Foundations of Brain and Behavior | Neuroscience Track Elective | 36-350 Statistical Computing | 36-46x - Special Topics
----- | Neuroscience Track Elective | Neuroscience Track Elective
----- | 36-3xx or 36-4xx Advanced Data Analysis Elective | -----

B.S. in Economics and Statistics

Amanda Mitchell, Statistics & Data Science Academic Program Manager
Stephen Pajewski, Economics Senior Academic Advisor and Program Manager

Statistics & Data Science Location: Baker Hall 129
statadvising@andrew.cmu.edu

Economics Location: Tepper 2400
econprog@andrew.cmu.edu


The B.S. in Economics and Statistics is jointly advised by the Department of Statistics and Data Science and the Undergraduate Economics Program.

The Major in Economics and Statistics provides an interdisciplinary course of study aimed at students with a strong interest in the empirical analysis of economic data. With joint curriculum from the Department of Statistics and Data Science and the Undergraduate Economics Program, the major provides students with a solid foundation in the theories and methods of both fields. Students in this major are trained to advance the understanding of economic issues through the analysis, synthesis and reporting of data using the advanced empirical research methods of statistics and econometrics. Graduates are well positioned for admission to competitive graduate programs, including those in statistics, economics and management, as well as for employment in positions requiring strong analytic and conceptual skills - especially those in economics, finance, education, and public policy.

All economics courses counting towards an economics degree must be completed with a grade of "C" or higher. 

Curriculum

The requirements for the B.S. in Economics and Statistics are the following:

1. MATHEMATICAL FOUNDATIONS (PREREQUISITES): 29-42 UNITS

Mathematics is the language in which statistical models are described and analyzed, so some experience with basic calculus and linear algebra is an important component for anyone pursuing a program of study in Economics and Statistics.

CALCULUS

Complete one of the two following sequences of mathematics courses at Carnegie Mellon, each of which provides sufficient preparation in calculus:

SEQUENCE 1

21-111 Calculus I (10 units)
21-112 Calculus II (10 units)

and one of the following:

21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

SEQUENCE 2

21-120 Differential and Integral Calculus (10 units)

and one of the following:

21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

NOTES:


  • Passing the MSC 21-120 assessment test is an acceptable alternative to completing 21-120.

Note: Taking/having credit for both 21-111 and 21-112 is equivalent to 21-120. The Mathematical Foundations total is then 48-49 units. The Economics and Statistics major would then total 201-211 units.

Linear Algebra

One of the following three courses:

21-240 Matrix Algebra with Applications (10 units)
21-241 Matrices and Linear Transformations (11 units)
21-242 Matrix Theory (11 units)

Note: 21-241 and 21-242 are intended only for students with a very strong mathematical background.

II. Foundations: 54 units

2. Economics Foundations: 18 UNITS
Take one of the following courses:
73-102 Principles of Microeconomics * (9 units)
73-104 Principles of Microeconomics Accelerated ** (9 units)
Take the following course:
73-103 Principles of Macroeconomics (9 units)

*Students who place out of 73-102 based on the economics placement exam will receive a prerequisite waiver for 73-102 and do not need to take the course.

**This course requires a score of 4 or 5 on the AP Microeconomics exam or a qualifying score on the IB/Cambridge exams. 73-104 will substitute for any 73-102 prerequisite requirement in other courses. 73-104 is a more rigorous introduction to microeconomics, is taught at a faster pace than 73-102, and dives a bit deeper into key topics. It is designed for students who have prior knowledge of fundamental economic concepts through AP/IB/Cambridge coursework. Enrollment in 73-104 requires special permission. Students who wish to take this course should add themselves to the 73-104 waitlist once registration opens. The Tepper School will verify the advanced placement scores and will enroll students in 73-104.

3. Statistical Foundations: 36 UNITS
DATA ANALYSIS

Data analysis is the art and science of extracting insight from data. The art lies in knowing which displays or techniques will reveal the most interesting features of a complicated data set. The science lies in understanding the various techniques and the assumptions on which they rely. Both aspects require practice to master.

The Beginning Data Analysis courses give a hands-on introduction to the art and science of data analysis. The courses cover similar topics but differ slightly in the examples they emphasize. 36-200 draws examples from many fields and satisfies the Dietrich College Core Requirement in Statistical Reasoning. This course is therefore recommended for students in the college. (Note: a score of 5 on the Advanced Placement [AP] Exam in Statistics may be used to waive this requirement.) 36-220 emphasizes examples in engineering.

The Intermediate Data Analysis courses build on the principles and methods covered in the introductory course and explore specific types of data analysis methods in more depth.

The Advanced Data Analysis courses draw on students' previous experience with data analysis and understanding of statistical theory to develop advanced, more sophisticated methods. These core courses involve extensive analysis of real data with emphasis on developing the oral and writing skills needed for communicating results.

Sequence 1 (For students beginning their freshman or sophomore year)
Beginning*

Choose one of the following courses:

36-200 Reasoning with Data * (9 units)
36-220 Engineering Statistics and Quality Control (9 units)

*A score of 5 on the Advanced Placement (AP) Exam in Statistics may be used to waive this requirement. 36-220 emphasizes examples in engineering and architecture.

Note: Students who enter the program with 36-235 or 36-236 should discuss options with an advisor.  Any 36-300 or 36-400 level course in Data Analysis that does not satisfy any other requirement for the Economics and Statistics Major may be counted as a Statistical Elective.

Intermediate*

Choose one of the following courses:

36-202 Methods for Statistics & Data Science ** (9 units)
36-290 Introduction to Statistical Research Methodology (9 units)
36-309 Experimental Design for Behavioral & Social Sciences (9 units)

*Or an extra data analysis course in Statistics.

**Must be taken prior to 36-401 Modern Regression; if not, an additional Advanced Statistics Elective is required.

Advanced Statistics Elective
Choose two of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Sequence 2 (For students beginning later in their college career)

Advanced Statistics Electives
Choose three of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

**Not all Special Topics are offered every semester, and new Special Topics are regularly added. See section 5 for details.

III. Disciplinary Core: 136-139 units

1. Economics Core: 45 UNITS
73-230 Intermediate Microeconomics (9 units)
73-240 Intermediate Macroeconomics (9 units)
73-270 Professional Communication for Economists (9 units)
73-265 Economics and Data Science (9 units)
73-274 Econometrics I (9 units)
73-374 Econometrics II (9 units)

2. Statistics Core: 36 UNITS
Take one of the following courses:
36-235 Probability and Statistical Inference I *# (9 units)
36-225 Introduction to Probability Theory (9 units)
Take one of the following courses:
36-236 Probability and Statistical Inference II ** (9 units)
36-226 Introduction to Statistical Inference (9 units)
36-326 Mathematical Statistics (Honors) (9 units)
Take both of the following courses:
36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)

*In order to meet the prerequisite requirements for the major, a grade of C or better is required in 36-235 (or equivalents), 36-236 or 36-326, and 36-401.

#It is possible to substitute 36-218, 36-219, 36-225, or 21-325 for 36-235. 36-235 is the standard introduction to probability, 36-219 is tailored for engineers and computer scientists, 36-218 is a more mathematically rigorous class for Computer Science students and more mathematically advanced Statistics students (Statistics students need advisor approval to enroll), and 21-325 is a rigorous Probability Theory course offered by the Department of Mathematics.

**It is possible to substitute 36-226 or 36-326 for 36-236. 36-236 is the standard introduction to statistical inference.

Please note that students who complete 36-235 are expected to take 36-236 to fulfill their theory requirements. Students who choose to take 36-225 instead will be required to take 36-226 afterward; they will not be eligible to take 36-236.

3. Statistical Computing: 19-21 UNITS
Take one of the following two courses:
15-110 Principles of Computing (10 units)
15-112 Fundamentals of Programming and Computer Science (12 units)
Complete the following course:
36-350 Statistical Computing (9 units)

4. Advanced Electives: 36 units

Students must take two advanced Economics elective courses (numbered 73-300 through 73-495, excluding 73-374) and two (or three, depending on previous coursework; see Section 3) advanced Statistics elective courses (numbered 36-303, 36-311, 36-313, 36-315, 36-318, 36-46x, 36-490, 36-493, or 36-497).

Total number of units for the major: 219-235 Units
Total number of units for the degree: 360 Units

Professional Development

While not required, students are strongly encouraged to take advantage of professional development opportunities and/or coursework. One option is  , a fall-only course that provides information about careers in Economics, job search strategies, and research opportunities. The Department of Statistics and Data Science also offers a series of workshops pertaining to resume preparation, graduate school applications, careers in the field, among other topics. Students should also take advantage of the Career and Professional Development Center. 


Additional Major in Economics and Statistics

Students who elect Economics and Statistics as an additional major must fulfill all Economics and Statistics degree requirements. Majors in many other programs would naturally complement an Economics and Statistics Major, including Tepper's undergraduate business program, Social and Decision Sciences, Policy and Management, and Psychology.

With respect to double-counting courses, it is departmental policy that students must have at least six courses [three Economics (73-xxx) and three Statistics (36-xxx)] that do not count for their primary major. If students do not have at least three ECON and three STA classes, they will need to take additional advanced data analysis or economics electives, depending on where the double-counting issue is.

Students are advised to begin planning their curriculum (with appropriate advisors) as soon as possible. This is particularly true if the other major has a complex set of requirements and prerequisites or when many of the other major's requirements overlap with the requirements for a Major in Economics and Statistics.

Substitutions and Waivers

Many departments require Statistics courses as part of their Major or Minor programs. Students seeking transfer credit for those requirements from substitute courses (at Carnegie Mellon or elsewhere) should seek permission from their advisor in the department setting the requirement. The final authority in such decisions rests there. The Department of Statistics and Data Science does not provide approval or permission for substitution or waiver of another department's requirements.

If a waiver or substitution is made in the home department, it is not automatically approved in the Department of Statistics and Data Science. In many of these cases, the student will need to take additional courses to satisfy the Economics and Statistics major requirements. Students should discuss this with a Statistics advisor when deciding whether to add an additional major in Economics and Statistics.

Sample Program

The following sample program illustrates one way to satisfy the requirements of the Economics and Statistics Major.  Keep in mind that the program is flexible and can support other possible schedules (see footnotes below the schedule).

First-Year | Second-Year
Fall | Spring | Fall | Spring
21-120 Differential and Integral Calculus | 36-202 Methods for Statistics & Data Science | 36-235 Probability and Statistical Inference I | 36-236 Probability and Statistical Inference II
36-200 Reasoning with Data | 21-256 Multivariate Analysis | 73-230 Intermediate Microeconomics | 21-240 Matrix Algebra with Applications
73-102 Principles of Microeconomics | 73-103 Principles of Macroeconomics | 73-240 Intermediate Macroeconomics
15-110 Principles of Computing | ----- | 73-265 Economics and Data Science | 73-274 Econometrics I
----- | ----- | -----
-----

Third-Year | Fourth-Year
Fall | Spring | Fall | Spring
36-350 Statistical Computing | 36-402 Advanced Methods for Data Analysis | 36-3xx or 36-4xx Advanced Data Analysis Elective | 36-3xx or 36-4xx Advanced Data Analysis Elective
36-401 Modern Regression | 73-270 Professional Communication for Economists | Economics Elective | Economics Elective
73-374 Econometrics II | ----- | ----- | -----
----- | ----- | ----- | -----
----- | ----- | -----

*In each semester, ----- represents other courses (not related to the major) which are needed in order to complete the 360 units that the degree requires.

Prospective PhD students are advised to add 21-127 in the fall of sophomore year, replace 21-240 with 21-241, and add 21-260 in the spring of junior year and 21-355 in the fall of senior year.

B.S. in Statistics and Machine Learning

Amanda Mitchell, Academic Program Manager

Location: Baker Hall 129
statadvising@andrew.cmu.edu

Students in the Statistics and Machine Learning program develop and master a wide array of skills in computing, mathematics, statistical theory, and the interpretation and display of complex data. In addition, Statistics and Machine Learning majors gain experience in applying statistical tools to real problems in other fields and learn the nuances of interdisciplinary collaboration. This program is geared towards students interested in statistical computation, data science, or “Big Data” problems.  The requirements for the B.S. in Statistics and Machine Learning are detailed below and are organized by categories.

Curriculum

1. Mathematical Foundations (Prerequisites): 51–64 units

Mathematics is the language in which statistical models are described and analyzed, so some experience with basic calculus and linear algebra is an important component for anyone pursuing a program of study in Statistics and Machine Learning.

Calculus*

Complete one of the following sequences of mathematics courses at Carnegie Mellon, each of which provides sufficient preparation in calculus:

Sequence 1
21-111 Calculus I (10 units)
21-112 Calculus II (10 units)

and one of the following:

21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

Sequence 2
21-120 Differential and Integral Calculus (10 units)

and one of the following:

21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

Notes:

  • Passing the Mathematical Sciences 21-120 assessment test is an acceptable alternative to completing 21-120.

Integration and Approximation
21-122 Integration and Approximation (10 units)

Linear Algebra**

Complete one of the following three courses:

21-240 Matrix Algebra with Applications (10 units)
21-241 Matrices and Linear Transformations (11 units)
21-242 Matrix Theory (11 units)

* It is recommended that students complete the calculus requirement during their freshman year.

**The linear algebra requirement needs to be completed before taking 36-401 Modern Regression.

21-241 and 21-242 are intended only for students with a very strong mathematical background.

Mathematical Theory
21-127 Concepts of Mathematics (12 units)

2. Data Analysis: 45–54 units

Data analysis is the art and science of extracting insight from data. The art lies in knowing which displays or techniques will reveal the most interesting features of a complicated data set. The science lies in understanding the various techniques and the assumptions on which they rely. Both aspects require practice to master.

The Beginning Data Analysis courses give a hands-on introduction to the art and science of data analysis. The courses cover similar topics but differ slightly in the examples they emphasize. 36-200 draws examples from many fields and satisfies the Dietrich College Core Requirement in Statistical Reasoning. One of these courses is therefore recommended for students in the college. (Note: a score of 5 on the Advanced Placement [AP] Exam in Statistics may be used to waive this requirement.) 36-220 emphasizes examples in engineering and architecture.

The Intermediate Data Analysis courses build on the principles and methods covered in the introductory course and explore specific types of data analysis methods in more depth.

The Advanced Data Analysis courses draw on students' previous experience with data analysis and understanding of statistical theory to develop advanced, more sophisticated methods. These core courses involve extensive analysis of real data with emphasis on developing the oral and writing skills needed for communicating results.

Sequence 1 (For students beginning their freshman or sophomore year)
Beginning*

Choose one of the following courses:

36-200 Reasoning with Data * (9 units)
36-220 Engineering Statistics and Quality Control (9 units)

*A score of 5 on the Advanced Placement (AP) Exam in Statistics may be used to waive this requirement. 36-220 emphasizes examples in engineering and architecture.

Note: Students who enter the program with 36-235 or 36-236 should discuss options with an advisor. 

Intermediate*

Choose one of the following courses:

36-202 Methods for Statistics & Data Science ** (9 units)
36-309 Experimental Design for Behavioral & Social Sciences (9 units)
36-290 Introduction to Statistical Research Methodology (9 units)

*Or an extra Advanced Data Analysis Elective.

**Must be taken prior to 36-401; otherwise an additional Advanced Data Analysis Elective is required.

Advanced Data Analysis Electives

Choose two of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Not all Special Topics are offered every semester. They are on a rotation and new Special Topics are regularly added.

and take the following two courses:

36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)

Sequence 2 (For students beginning later in their college career)
Advanced Data Analysis Electives

Choose three of the following courses:

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Not all Special Topics are offered every semester. They are on a rotation and new Special Topics are regularly added.

and take the following two courses:

36-401 Modern Regression (9 units)
36-402 Advanced Methods for Data Analysis (9 units)

3. Probability Theory and Statistical Theory: 18 units

The theory of probability gives a mathematical description of the randomness inherent in our observations. It is the language in which statistical models are stated, so an understanding of probability is essential for the study of statistical theory. Statistical theory provides a mathematical framework for making inferences about unknown quantities from data. The theory reduces statistical problems to their essential ingredients to help devise and evaluate inferential procedures. It provides a powerful and wide-ranging set of tools for dealing with uncertainty.

To satisfy the theory requirement take the following two courses**:

Take one of the following courses:
36-235 Probability and Statistical Inference I * (9 units)
36-225 Introduction to Probability Theory (9 units)
And one of the three following courses:
36-226 Introduction to Statistical Inference (9 units)
36-236 Probability and Statistical Inference II ** (9 units)
36-326 Mathematical Statistics (Honors) (9 units)


*It is possible to substitute 36-218, 36-219, 36-225, or 21-325 for 36-235. 36-235 is the standard (and recommended) introduction to probability, 36-219 is tailored for engineers and computer scientists, 36-218 is a more mathematically rigorous class for Computer Science students and more mathematically advanced Statistics students (students need advisor approval to enroll), and 21-325 is a rigorous Probability Theory course offered by the Department of Mathematics.

**It is possible to substitute 36-226 or 36-326 (honors course) for 36-236. 36-236 is the standard (and recommended) introduction to statistical inference.

Please note that students who take 36-235 are expected to take 36-236 to complete their theory requirements. Students who choose to take 36-225 instead will be required to take 36-226 afterward; they will not be eligible to take 36-236.

Comments:

(i) In order to meet the prerequisite requirements, a grade of at least a C is required in 36-235 (or equivalent), 36-236 (or equivalent), and 36-401.

4. Statistical Computing: 9 units

Fundamental to the practice of statistics and data science is the ability to effectively code data processing and analysis tasks. Within the domain of statistics, the use of the programming language R is ubiquitous, and thus we expose students to it throughout the curriculum (and in depth in Statistical Computing). 

36-350 Statistical Computing (9 units)

5. Machine Learning/Computer Science: 57-60 units

Statistical modeling in practice nearly always requires computation in one way or another. Computational algorithms are sometimes treated as “black boxes,” whose innards the statistician need not pay attention to. But this attitude is becoming less and less prevalent, and today there is much to be gained from a strong working knowledge of computational tools. Understanding the strengths and weaknesses of various methods allows the data analyst to select the right tool for the job; understanding how they can be adapted to work in new settings greatly extends the realm of problems that he/she can solve. While all majors in Statistics & Data Science are given solid grounding in computation, extensive computational training is really what sets the B.S. in Statistics and Machine Learning program apart. Note that we would advise those students who are considering receiving course credit for Fundamentals of Programming and Computer Science given their score on the AP Computer Science A exam to actually take the course at Carnegie Mellon instead, as within data science as a whole Python is far more widely used than Java.

15-112 Fundamentals of Programming and Computer Science (12 units)
15-122 Principles of Imperative Computation (12 units)
15-351 Algorithms and Advanced Data Structures (12 units)
or 15-451 Algorithm Design and Analysis
10-301 Introduction to Machine Learning (Undergrad) (12 units)
or 10-315 Introduction to Machine Learning (SCS Majors)

and take one of the following Machine Learning Advanced Electives:

05-434 Machine Learning in Practice (12 units)
10-403 Deep Reinforcement Learning & Control (12 units)
10-703 Deep Reinforcement Learning & Control (12 units)
10-405 Machine Learning with Large Datasets (Undergraduate) (12 units)
10-605 Machine Learning with Large Datasets (12 units)
10-417 Intermediate Deep Learning (12 units)
10-418 Machine Learning for Structured Data (12 units)
10-707 Advanced Deep Learning (12 units)
11-344 Machine Learning in Practice (12 units)
11-411 Natural Language Processing (12 units)
11-441 Machine Learning for Text and Graph-based Mining (9 units)
11-485 Introduction to Deep Learning (9 units)
11-661 Language and Statistics (12 units)
11-761 Language and Statistics (12 units)
15-281 Artificial Intelligence: Representation and Problem Solving (12 units)
15-386 Neural Computation (9 units)
15-387 Computational Perception (9 units)
16-311 Introduction to Robotics (12 units)
16-385 Computer Vision (12 units)
16-720 Computer Vision (12 units)
*PhD level ML course as approved by Statistics advisor
** Independent research with an ML faculty member as approved by Statistics Advisor
***This is not an exhaustive list. Please contact your Academic Advisor if there is a course you are considering taking that is not on this list.
Total number of units for the major: 180–205 Units
Total number of units for the degree: 360 Units

Recommendations

Students in the Dietrich College of Humanities and Social Sciences who wish to major or minor in Statistics are advised to complete both the calculus requirement (one Mathematical Foundations calculus sequence) and the Beginning Data Analysis course 36-200 Reasoning with Data by the end of their Freshman year.

The linear algebra requirement is a prerequisite for the course 36-401. It is therefore essential that students complete this requirement by their junior year at the latest.

Recommendations for Prospective Ph.D. Students

Students interested in pursuing a Ph.D. in Statistics or Machine Learning (or related programs) after completing their undergraduate degree are strongly recommended to take additional Mathematics courses. Although 21-240 Matrix Algebra with Applications is recommended for Statistics majors, students interested in PhD programs should consider taking 21-241 Matrices and Linear Transformations or 21-242 Matrix Theory instead. Additional courses to consider are 21-228 Discrete Mathematics, 21-341 Linear Algebra, 21-355 Principles of Real Analysis I, and 21-356 Principles of Real Analysis II.

Additional experience in programming and computational modeling is also recommended. Students should consider taking more than one course from the list of Machine Learning electives provided under the Computing section.

Additional Major in Statistics and Machine Learning

Students who elect Statistics and Machine Learning as a second or third major must fulfill all degree requirements. 

With respect to double-counting courses, it is departmental policy that students must have at least six courses (three Computer Science/Machine Learning and three Statistics) that do not count for their primary major. If students do not have at least six, they will need to take additional advanced data analysis or ML electives, depending on where the double counting issue is.

Students are advised to begin planning their curriculum (with appropriate advisors) as soon as possible. This is particularly true if the other major has a complex set of requirements and prerequisites or when many of the other major's requirements overlap with the requirements for the B.S. in Statistics and Machine Learning.

Substitutions and Waivers

Many departments require Statistics & Data Science courses as part of their Major or Minor programs. Students seeking transfer credit for those requirements from substitute courses (at Carnegie Mellon or elsewhere) should seek permission from their advisor in the department setting the requirement. The final authority in such decisions rests there. The Department of Statistics & Data Science does not provide approval or permission for substitution or waiver of another department's requirements.

If a waiver or substitution is made in the home department, it is not automatically approved in the Department of Statistics & Data Science. In many of these cases, the student will need to take additional courses to satisfy major requirements. Students should discuss this with a Statistics advisor when deciding whether to add an additional major in Statistics and Machine Learning.

Research

The Statistics & Data Science program encourages students to gain research experience. Opportunities within the department include Summer Undergraduate Research Apprenticeships (SURA), run in association with the university's Office of Undergraduate Research and Scholar Development, and the departmental capstone courses 36-490 Undergraduate Research, 36-493 Sports Analytics Capstone, or 36-497 Corporate Capstone Project. (Note that these courses require an application.) Additionally, students can pursue independent study. For those students who maintain a quality point average of 3.25 overall or above, there is also the Dietrich College Senior Honors Program.

The faculty in the Statistics & Data Science department largely work within the domains of statistical theory and methodological development, areas that require advanced mathematical training. Thus we encourage students to search broadly for research opportunities: faculty, post-doctoral researchers, and graduate students in many departments throughout the university have data to analyze and would welcome the help of undergraduate statistics students.

Sample Programs

The following sample programs illustrate two ways to satisfy the requirements for the B.S. in Statistics and Machine Learning. Keep in mind that the program is flexible and can support other possible schedules (see footnotes below each schedule). Sample program 1 is for students who have not satisfied the basic calculus requirements. Sample program 2 is for students who have satisfied the basic calculus requirements and choose option 2 for their data analysis courses (see section #2).

Schedule 1

First-Year Fall | First-Year Spring | Second-Year Fall | Second-Year Spring
36-200 Reasoning with Data | 36-202 Methods for Statistics & Data Science | 36-235 Probability and Statistical Inference I | 36-236 Probability and Statistical Inference II
21-120 Differential and Integral Calculus | 21-256 Multivariate Analysis | 21-122 Integration and Approximation | 21-241 Matrices and Linear Transformations
----- | 15-112 Fundamentals of Programming and Computer Science | 21-127 Concepts of Mathematics | 15-122 Principles of Imperative Computation
----- | ----- | ----- | 36-350 Statistical Computing
----- | ----- | ----- | -----

Third-Year Fall | Third-Year Spring | Fourth-Year Fall | Fourth-Year Spring
36-401 Modern Regression | 36-402 Advanced Methods for Data Analysis | 10-301 Introduction to Machine Learning (Undergrad) | Machine Learning Advanced Elective
----- | 15-351 Algorithms and Advanced Data Structures | 36-3xx or 36-4xx Advanced Data Analysis Elective | 36-3xx or 36-4xx Advanced Data Analysis Elective
----- | ----- | ----- | -----
----- | ----- | ----- | -----
----- | ----- | ----- | -----

*In each semester, ----- represents other courses (not related to the major) which are needed in order to complete the 360 units that the degree requires.

Schedule 2

First-Year Fall | First-Year Spring | Second-Year Fall | Second-Year Spring
36-200 Reasoning with Data | 21-127 Concepts of Mathematics | 36-235 Probability and Statistical Inference I | 36-236 Probability and Statistical Inference II
21-256 Multivariate Analysis | ----- | 15-122 Principles of Imperative Computation | 21-241 Matrices and Linear Transformations
15-112 Fundamentals of Programming and Computer Science | ----- | ----- | 36-3xx or 36-4xx Advanced Data Analysis Elective
----- | ----- | ----- | -----
----- | ----- | ----- | -----

Third-Year Fall | Third-Year Spring | Fourth-Year Fall | Fourth-Year Spring
36-350 Statistical Computing | 36-402 Advanced Methods for Data Analysis | 10-301 Introduction to Machine Learning (Undergrad) | Machine Learning Advanced Elective
36-401 Modern Regression | 15-351 Algorithms and Advanced Data Structures | 36-3xx or 36-4xx Advanced Data Analysis Elective | 36-3xx or 36-4xx Advanced Data Analysis Elective
----- | ----- | ----- | -----
----- | ----- | ----- | -----
----- | ----- | ----- | -----

*In each semester, "-----" represents other courses (not related to the major) which are needed in order to complete the 360 units that the degree requires.
 

The Minor in Statistics

Sylvie Aubin, Undergraduate Academic Advisor
Location: Baker Hall 129
statadvising@stat.cmu.edu

The Minor in Statistics develops skills that complement major study in other disciplines. The program helps the student master the basics of statistical theory and advanced techniques in data analysis. This is a good choice for deepening understanding of statistical ideas and for strengthening research skills.

In order to get a minor in Statistics a student must satisfy all of the following requirements:

1. Mathematical Foundations (Prerequisites): 29–41 units

Calculus*:

Complete one of the following two sequences of mathematics courses at Carnegie Mellon, each of which provides sufficient preparation in calculus:

Sequence 1

21-111 Calculus I (10 units)
21-112 Calculus II (10 units)

and one of the following:

21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

Sequence 2

21-120 Differential and Integral Calculus (10 units)

and one of the following:

21-256 Multivariate Analysis (9 units)
21-259 Calculus in Three Dimensions (10 units)
21-268 Multidimensional Calculus (11 units)

Note: Passing the Mathematical Sciences 21-120 assessment test is an acceptable alternative to completing 21-120.

Linear Algebra**:

Complete one of the following three courses:

21-240 Matrix Algebra with Applications (10 units)
21-241 Matrices and Linear Transformations (11 units)
21-242 Matrix Theory (11 units)

*It is recommended that students complete the calculus requirement during their freshman year. 

**The linear algebra requirement needs to be complete before taking 36-401 Modern Regression or 36-46X Special Topics.

21-241 and 21-242 are intended only for students with a very strong mathematical background.

2. Data Analysis: 36 units

Data analysis is the art and science of extracting insight from data. The art lies in knowing which displays or techniques will reveal the most interesting features of a complicated data set. The science lies in understanding the various techniques and the assumptions on which they rely. Both aspects require practice to master.

The Beginning Data Analysis courses give a hands-on introduction to the art and science of data analysis. The courses cover similar topics but differ slightly in the examples they emphasize. 36-200 draws examples from many fields and satisfies the Dietrich College Core Requirement in Statistical Reasoning. This course is therefore recommended for students in the College. (Note: A score of 5 on the Advanced Placement (AP) Exam in Statistics may be used to waive this requirement.) Other courses emphasize examples in engineering and architecture (36-220) and the laboratory sciences (36-247).

The Intermediate Data Analysis courses build on the principles and methods covered in the introductory course and explore specific types of data analysis methods in more depth.

The Advanced Data Analysis and Methodology courses draw on students' previous experience with data analysis and understanding of statistical theory to develop advanced, more sophisticated methods. These core courses involve extensive analysis of real data with emphasis on developing the oral and writing skills needed for communicating results.

Sequence 1 (For students beginning their freshman or sophomore year)
Beginning Data Analysis*

Choose one of the following courses: 

36-200 Reasoning with Data* (9 units)
36-220 Engineering Statistics and Quality Control (9 units)
36-247 Statistics for Lab Sciences (9 units)

*A score of 5 on the Advanced Placement (AP) Exam in Statistics may be used to waive this requirement. Other courses emphasize examples in engineering and architecture (36-220) and the laboratory sciences (36-247).

Intermediate Data Analysis*

Choose one of the following courses:

36-202 Methods for Statistics & Data Science** (9 units)
36-290 Introduction to Statistical Research Methodology (9 units)
36-309 Experimental Design for Behavioral & Social Sciences (9 units)

*The Beginning and Intermediate Data Analysis sequence (i.e., 36-200 and 36-202, or equivalents as listed above) can be replaced with an additional Advanced Analysis and Methodology course, shown below in Sequence 2.

**The Intermediate Data Analysis requirement must be completed prior to 36-401; if it is not, an additional Advanced Analysis and Methodology course is required.

Advanced Data Analysis and Methodology

Take the following course:

36-401 Modern Regression (9 units)

and one of the following courses:

36-402 Advanced Methods for Data Analysis (9 units)
36-410 Introduction to Probability Modeling (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Special Topics rotate and new ones are regularly added.

Sequence 2 (For students beginning later in their college career)
Advanced Data Analysis and Methodology

Take the following course:

36-401 Modern Regression (9 units)

and take two of the following courses (one of which must be 400-level):

36-303 Sampling, Survey and Society (9 units)
36-311 Statistical Analysis of Networks (9 units)
36-313 Statistics of Inequality and Discrimination (9 units)
36-315 Statistical Graphics and Visualization (9 units)
36-318 Introduction to Causal Inference (9 units)
36-402 Advanced Methods for Data Analysis (9 units)
36-410 Introduction to Probability Modeling (9 units)
36-460 Special Topics: Sports Analytics (9 units)
36-461 Special Topics: Statistical Methods in Epidemiology (9 units)
36-462 Special Topics: Methods of Statistical Learning (9 units)
36-463 Special Topics: Multilevel and Hierarchical Models (9 units)
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach (9 units)
36-465 Special Topics: Conceptual Foundations of Statistical Learning (9 units)
36-466 Special Topics: Statistical Methods in Finance (9 units)
36-467 Special Topics: Data over Space & Time (9 units)
36-468 Special Topics: Text Analysis (9 units)
36-469 Special Topics: Statistical Genomics and High Dimensional Inference (9 units)
36-490 Undergraduate Research (9 units)
36-493 Sports Analytics Capstone (9 units)
36-497 Corporate Capstone Project (9 units)

Special Topics rotate and new ones are regularly added.

3. Probability Theory and Statistical Theory: 18 units

The theory of probability gives a mathematical description of the randomness inherent in our observations. It is the language in which statistical models are stated, so an understanding of probability is essential for the study of statistical theory. Statistical theory provides a mathematical framework for making inferences about unknown quantities from data. The theory reduces statistical problems to their essential ingredients to help devise and evaluate inferential procedures. It provides a powerful and wide-ranging set of tools for dealing with uncertainty.

To satisfy the theory requirement take the following two courses:

Take one of the following courses:

36-235 Probability and Statistical Inference I* (9 units)
36-225 Introduction to Probability Theory (9 units)

And one of the following three courses:

36-236 Probability and Statistical Inference II** (9 units)
36-226 Introduction to Statistical Inference (9 units)
36-326 Mathematical Statistics (Honors) (9 units)

*It is possible to substitute 36-218, 36-219, 36-225, or 21-325 for 36-235. (36-235 is the standard (and recommended) introduction to probability; 36-219 is tailored for engineers and computer scientists; 36-218 is a more mathematically rigorous class for Computer Science students and more mathematically advanced students (students need advisor approval to enroll); and 21-325 is a rigorous Probability Theory course offered by the Department of Mathematics.) 36-326 is not offered every semester/year but can be substituted for 36-226 and is considered an honors course.

**It is possible to substitute 36-226 or 36-326 (honors course) for 36-236. 36-236 is the standard (and recommended) introduction to statistical inference.

Please note that students who complete 36-235 are expected to take 36-236 to fulfill their theory requirements. Students who choose to take 36-225 instead will be required to take 36-226 afterward; they will not be eligible to take 36-236.

Comments:

(i) In order to be in good standing and to continue with the minor, a grade of at least a C is required in 36-235 (or equivalent), 36-236 (or equivalent), and 36-401.

Total number of units required for the minor: 83 units

Double Counting

With respect to double-counting courses, it is departmental policy that students must have at least three statistics courses (36-xxx) that do not count for their primary major. If students do not have at least three, they need to take additional advanced electives. Make sure to consult your Statistics Minor advisor regarding double counting. 

Sample Programs for the Minor

The following two sample programs illustrate two (of many) ways to satisfy the requirements of the Statistics Minor. Keep in mind that the program is flexible and can support many other possible schedules.

The first schedule uses calculus sequence 1, and 36-202 to satisfy the intermediate data analysis requirement. The second schedule is an example of the case when a student enters the Minor through 36-235 and 36-236 (and therefore skips the beginning data analysis course). The schedule uses calculus sequence 2, and an advanced data analysis elective (to replace the beginning data analysis course).

Schedule 1

First-Year Fall | First-Year Spring | Second-Year Fall | Second-Year Spring
21-111 Calculus I | 21-112 Calculus II | 36-202 Methods for Statistics & Data Science | 21-240 Matrix Algebra with Applications
36-200 Reasoning with Data | 21-256 Multivariate Analysis

Third-Year Fall | Third-Year Spring | Fourth-Year Fall | Fourth-Year Spring
36-235 Probability and Statistical Inference I | 36-236 Probability and Statistical Inference II | 36-401 Modern Regression | Any 36-4xx level course

Schedule 2

First-Year Fall | First-Year Spring | Second-Year Fall | Second-Year Spring
21-120 Differential and Integral Calculus | 21-256 Multivariate Analysis | 36-235 Probability and Statistical Inference I | 36-236 Probability and Statistical Inference II
36-200 Reasoning with Data | 21-240 Matrix Algebra with Applications

Third-Year Fall | Third-Year Spring | Fourth-Year Fall
36-401 Modern Regression | 36-3xx or 36-4xx Advanced Data Analysis Elective | One 36-4xx Advanced Methodology Course

Statistics & Data Science Dietrich Senior Honors Thesis

Eligibility

Eligibility is determined by Dietrich College. Students who are eligible will be notified prior to their senior year.

Dietrich College Requirements:

  • Students must have a major in Dietrich College, either as a primary or an additional major; or be in the BHA program.
  • Cumulative QPA through the end of the junior year of at least 3.25 overall, and 3.50 in the Dietrich College major associated with the proposed project.
  • Departmental sponsorship in the form of an agreement by a faculty member to serve as advisor for the 2-semester/18 unit Honors project (graduate students may not serve as advisors; adjunct faculty may do so, but only in collaboration with a regular faculty member), and approval by the department head.

Statistics & Data Science Requirements Overview

The below guidelines apply to any Statistics & Data Science students who are doing an honors thesis that has been approved through the Statistics & Data Science department (i.e. our department signs off on the thesis paperwork). If you are a Stat & DS student pursuing a Dietrich senior honors thesis through another department (i.e. a different department than Stat & DS is signing off on it) then these guidelines do not apply to you.

In order to be approved for a thesis with the Stat & DS department, the project needs to have a significant statistical component. This will be discussed and confirmed during the proposal approval phase of applying.

Honors Thesis Timeline

Senior Year - Fall Semester

The Dietrich College senior honors thesis is a year-long project. As such, after the fall semester of a student’s senior year a progress report will be due to Undergraduate Program Director, Peter Freeman, for review.

Progress Paper Requirements:

  • Minimum length - 5 pages of text (not including graphs/figures/results)
  • This paper should build substantially on the proposal, and lay out what work has been done up to this point, as well as an action plan for the spring semester. 
  • Must be sent to Undergraduate Program Director, Peter Freeman, by the last day of classes for the fall semester (typically the first week of December).

Senior Year - Spring Semester

Final Thesis Requirements:

In alignment with a typical advanced data analysis (ADA) project in the field of Statistics, the final thesis must be at least 15 and no more than 18 single-spaced written pages in 12-point font. This does *not* include figures.

  • Figures can be embedded within the text (so long as the overall text length requirement is met) but can also be provided as appendices after the main body of the text.
  • Reports should be written in IMRaD format (Introduction, Methods, Results, and Discussion), where the "Introduction" can be a Background and Significance section followed by a Data section.
  • All theses are due to the Undergraduate Program Director, Peter Freeman, and Department Head, Rebecca Nugent, at the end of the 12th week of class in spring semester (roughly the first week of April). 
 

Substitutions and Waivers

Many departments require Statistics & Data Science courses as part of their major or minor programs. Students seeking transfer credit for those requirements from substitute courses (at Carnegie Mellon or elsewhere) should seek permission from their advisor in the department setting the requirement. The final authority in such decisions rests there. The Department of Statistics & Data Science does not provide approval or permission for substitution or waiver of another department's requirements.

However, the Statistics & Data Science department's Director of Undergraduate Studies can provide advice and information to the student's advisor about the viability of a proposed substitution. Students should make available as much information as possible concerning proposed substitutions. Students seeking waivers may be asked to demonstrate mastery of the material.

If a waiver or substitution is made in the home department, it is not automatically approved in the Department of Statistics & Data Science. In many of these cases, the student will need to take additional courses to satisfy the Statistics major requirements. Students should discuss this with a Statistics advisor when deciding whether to add an additional major in Statistics.

Statistics majors and minors seeking substitutions or waivers should speak to a departmental academic advisor.

Course Descriptions

About Course Numbers:

Each Carnegie Mellon course number begins with a two-digit prefix that designates the department offering the course (e.g., 76-xxx courses are offered by the Department of English). Although each department maintains its own course numbering practices, typically, the first digit after the prefix indicates the class level: xx-1xx courses are freshman-level, xx-2xx courses are sophomore-level, etc. Depending on the department, xx-6xx courses may be either undergraduate senior-level or graduate-level, and xx-7xx courses and higher are graduate-level. Consult the Schedule of Classes each semester for course offerings and for any necessary pre-requisites or co-requisites.


36-200 Reasoning with Data
All Semesters: 9 units
This course is an introduction to learning how to make statistical decisions and how to reason with data. The approach will emphasize the thinking-through of empirical problems from beginning to end and using statistical tools to look for evidence for/against explicit arguments/hypotheses. Types of data will include continuous and categorical variables, images, text, networks, and repeated measures over time. Applications will largely be drawn from interdisciplinary case studies spanning the humanities, social sciences, and related fields. Methodological topics will include basic exploratory data analysis, elementary probability, significance tests, and empirical research methods. There will be a once-weekly computer lab for additional hands-on practice using an interactive software platform that allows student-driven inquiry.
36-202 Methods for Statistics & Data Science
All Semesters: 9 units
This course builds on the principles and methods of statistical reasoning developed in 36-200 (or its equivalents). The course covers simple and multiple regression, basic analysis of variance methods, logistic regression, and introduction to data mining including classification and clustering. Students will also learn the principles of overfitting, training vs testing, ensemble methods, variable selection, and bootstrapping. Course objectives include applying the basic principles and methods that underlie statistical practice and empirical research to real data sets and interdisciplinary problems. Learning the Data Analysis Pipeline is strongly emphasized through structured coding and data analysis projects. In addition to three lectures a week, students attend a computer lab once a week for "hands-on" practice of the material covered in lecture. There is no programming language pre-requisite. Students will learn the basics of R Markdown and related analytics tools.
Prerequisites: 36-207 or 36-247 or 70-207 or 36-200 or 36-220
36-204 Discovering the Data Universe
Intermittent: 3 units
Every day we wake up in the data universe and use the information around us to make decisions. We are constantly evaluating and interpreting data from our environment, in everything from spreadsheets to Instagram posts. At the same time, our own personal data are being observed and recorded through websites we visit online, our smart devices, and even our interactions with other students and faculty at CMU. Navigating this data universe requires knowledge of what data is and how to use it responsibly. For example, can a plant be a data set? Discovering the truth behind a piece of data, including who made it, what it looks like, and what we can learn from it, is a critical skill. Understanding data can be the difference between being able to distinguish truth from lies, and the key to identifying your data footprint and succeeding in research and in your career. In this course, we will explore the data universe from multiple angles and across several types of data. We will define, find, and analyze data, and most importantly, identify narratives within data to tell stories about the world around us. We will examine data using the following questions: How can we tell multiple stories from the same dataset? What biases can exist in data? And, who creates or decides what data matters enough to collect, preserve, and share? NOTE: There will be one in-person and one virtual pre-recorded lecture each week.
36-217 Probability Theory and Random Processes
All Semesters: 9 units
This course provides an introduction to probability theory. It is designed for students in electrical and computer engineering. Topics include elementary probability theory, conditional probability and independence, random variables, distribution functions, joint and conditional distributions, limit theorems, and an introduction to random processes. Some elementary ideas in spectral analysis and information theory will be given. A grade of C or better is required in order to use this course as a pre-requisite for 36-226 and 36-410. Not open to students who have received credit for 36-225, or 36-625.
Prerequisites: 21-256 or 21-122 or 21-123 or 21-259 or 21-112
Course Website: http://www.stat.cmu.edu/academics/courselist
36-218 Probability Theory for Computer Scientists
Fall and Spring: 9 units
Probability theory is the mathematical foundation for the study of both statistics and random systems. This course is an intensive introduction to probability, from the foundations and mechanics to its application in statistical methods and modeling of random processes. Special topics and many examples are drawn from areas and problems that are of interest to computer scientists and that should prepare computer science students for the probabilistic and statistical ideas they encounter in downstream courses and research. A grade of C or better is required in order to use this course as a pre-requisite for 36-226, 36-326, and 36-410. If you hold a Statistics primary/additional major or minor you will be required to complete 36-226. For those who do not have a major or minor in Statistics, and receive at least a B in 36-218, you will be eligible to move directly onto 36-401.
Prerequisites: (21-112 and 21-111) or 21-120 or 21-256 or 21-259
Course Website: http://www.stat.cmu.edu/academics/courselist
36-219 Probability Theory and Random Processes
All Semesters: 9 units
This course provides an introduction to probability theory. It is designed for students in electrical and computer engineering. Topics include elementary probability theory, conditional probability and independence, random variables, distribution functions, joint and conditional distributions, limit theorems, and an introduction to random processes. Some elementary ideas in spectral analysis and information theory will be given. A grade of C or better is required in order to use this course as a pre-requisite for 36-226 and 36-410.
Prerequisites: (21-112 and 21-111) or 21-120 or 21-256 or 21-259
36-220 Engineering Statistics and Quality Control
Fall and Spring: 9 units
This is a course in introductory statistics for engineers with emphasis on modern product improvement techniques. Besides exploratory data analysis, basic probability, distribution theory and statistical inference, special topics include experimental design, regression, control charts and acceptance sampling.
Prerequisites: 21-112 or 21-120
36-225 Introduction to Probability Theory
Fall and Summer: 9 units
This course is the first half of a year-long course which provides an introduction to probability and mathematical statistics for students in the data sciences. Topics include elementary probability theory, conditional probability and independence, random variables, distribution functions, joint and conditional distributions, law of large numbers, and the central limit theorem.
Prerequisites: (21-112 and 21-111) or 21-120 or 21-256 or 21-259
Course Website: http://coursecatalog.web.cmu.edu/schools-colleges/dietrichcollegeofhumanitiesandsocialsciences/depar
36-226 Introduction to Statistical Inference
Spring and Summer: 9 units
This course is the second half of a year-long course in probability and mathematical statistics. Topics include maximum likelihood estimation, confidence intervals, hypothesis testing, and properties of estimators, such as unbiasedness and consistency. If time permits, there will also be a discussion of linear regression and the analysis of variance. A grade of C or better is required in order to advance to 36-401, 36-402 or any 36-46x course. Not open to students who have received credit for 36-626.
Prerequisites: 21-325 Min. grade C or 36-219 Min. grade C or 36-225 Min. grade C or 36-218 Min. grade C or 36-217 Min. grade C or 15-259 Min. grade C
36-235 Probability and Statistical Inference I
Fall: 9 units
This class is the first half of a two-semester, calculus-based course sequence that introduces theoretical aspects of probability and statistical inference to students. The material in this course and in 36-236 (Probability and Statistical Inference II) is organized so as to provide repeated exposure to essential concepts: the courses cover specific probability distributions and their inferential applications one after another, starting with the normal distribution and continuing with the binomial and Poisson distributions, etc. Topics specifically covered in 36-235 include basic probability, random variables, univariate and multivariate distribution functions, point and interval estimation, hypothesis testing, and regression, with the discussion being supplemented with computer-based examples and exercises (e.g., visualization and simulation). Given its organization, the course is only appropriate for those taking the full two-semester sequence, and thus it is currently open only to statistics majors (primary, additional, dual) and minors. (Check with the statistics advisors for the exact declaration deadline.) Non-majors/minors requiring a probability course are directed to take 36-225 or one of its analogues. A grade of C or better in 36-235 is required in order to advance to 36-236 (or 36-226) and/or 36-410. This course is not open to students who have received credit for 36-217, 36-218, 36-219, or 36-700, or for 21-325 or 15-259.
Prerequisites: (21-111 and 21-112) or 21-256 or 21-259 or 21-120
36-236 Probability and Statistical Inference II
Spring: 9 units
This class is the second half of a two-semester, calculus-based course sequence that introduces theoretical aspects of probability and statistical inference to students. The material in this course and in 36-235 (Probability and Statistical Inference I) is organized so as to provide repeated exposure to essential concepts: the courses cover specific probability distributions and their inferential applications one after another, starting with the normal distribution and continuing with the binomial and Poisson distributions, etc. Topics specifically covered in 36-236 include the binomial and related distributions, the Poisson and related distributions, and the uniform distribution, and how they are used in point and interval estimation, hypothesis testing, and regression. Also covered in 36-236 are topics related to multivariate distributions: marginal and conditional distributions, covariance, and conditional distribution moments. All discussion is supplemented with computer-based examples and exercises (e.g., visualization and simulation). Given its organization, the course is only appropriate for those who first take 36-235, and thus it is currently open only to statistics majors (primary, additional, dual) and minors, as well as to CS majors using both 36-235 and 36-236 to complete their probability requirement. All others are directed to take 36-226. A grade of C or better in 36-236 is required in order to advance to 36-401.
Prerequisite: 36-235 Min. grade C
36-247 Statistics for Lab Sciences
Fall and Spring: 9 units
This course is a single-semester comprehensive introduction to statistical analysis of data for students in biology and chemistry. Topics include exploratory data analysis, elements of computer programming for statistics, basic concepts of probability, statistical inference, and curve fitting. In addition to three lectures, students attend a computer lab each week. Not open to students who have received credit for 36-200, 36-207/70-207, 36-220, or 36-226.
36-290 Introduction to Statistical Research Methodology
Fall: 9 units
This is a first course in statistical practice, targeted to first-semester sophomores. It is designed as a high-level introduction to the ways by which statisticians go about approaching and analyzing quantitative observational data, thus preparing students for future work in capstone classes. Students in the course are taught the basic concepts of statistical learning (inference vs. prediction, supervised vs. unsupervised learning, regression vs. classification, etc.) and will reinforce this knowledge by applying, e.g., linear regression, random forest, principal components analysis, and/or hierarchical clustering and more to datasets provided by the instructor. Students will also practice disseminating the results of their analyses via oral presentations and posters. Analyses will be carried out using the R programming language.
Prerequisites: 36-200 or 36-247 or 70-207 or 36-220 or 36-207

Course Website: http://coursecatalog.web.cmu.edu/schools-colleges/dietrichcollegeofhumanitiesandsocialsciences/depar
36-297 Early Undergraduate Research
Fall and Spring: 6 units
This course is designed to give early undergraduate students (those who have not yet taken 36-401) experience navigating real data science research problems. Small groups of students are matched with clients and do supervised research for a semester. From an academic perspective, the course presents an opportunity for students to gain skills in, e.g., data acquisition and cleaning, exploratory data analysis, and basic statistical modeling; which skills are practiced is project-dependent. Additionally, the course will help students develop the professional skills necessary for successfully navigating team-based project delivery roles. Programming will be performed in R and/or Python; previous programming experience is not required.
36-300 Statistics & Data Science Internship
Summer: 3 units
The Department of Statistics & Data Science considers experiential learning an integral part of our program. One such option is through an internship. If a student has an internship, they do not have to register for this class unless they want it listed on their official transcripts. This process should be used by international students interested in Curricular Practical Training (CPT) and should also be authorized by the Office of International Education (OIE). More information regarding CPT is available on OIE's website. This course will be taken as Pass/Fail, and students will be charged tuition for 3 units. There is an approval process in order to register for this course. Please contact your advisor in the Department of Statistics & Data Science for more details.
36-301 Documenting Human Rights
Intermittent: 9 units
This course will teach students about the origins of modern human rights and the evolution of methods to document the extent to which these rights are being upheld or violated. The need to understand and document human rights issues is at the center of the most pressing current events. From threats to democracy and civil rights to work holding perpetrators of mass harm accountable in legal proceedings to efforts to quantify and advance economic, social, cultural, and environmental rights, making human rights violations visible is fundamental to achieving a more just world. We will begin with an overview of the history of human rights, the main philosophical and political debates in the field, and the most relevant organizations, institutions, and agreements. We will then delve into specific cases that highlight methodological opportunities and challenges, including: the identification of mass atrocity victims, the disappeared, and missing migrants; efforts to estimate civilian casualties in war; the documentation of police brutality and other human rights violations with smartphones; as well as the use of satellite imagery and drone footage for the documentation of genocide, environmental rights, and war crimes. We will critically assess the technical challenges that arise in each context and how the human rights and scientific communities have responded. After reviewing these cases, we will conclude by reflecting on why the documentation of human rights actually matters and what happens to evidence once it is gathered. Students will then take what they've learned and do two multidisciplinary group projects, one involving the documentation of a rights violation in Western Pennsylvania and the other involving an international situation. Assignments include an essay, a data analysis assignment, and a group project that includes a written component, quantitative and/or qualitative data analysis, and a presentation.
36-303 Sampling, Survey and Society
Spring: 9 units
This course will revolve around the role of sampling and sample surveys in the context of U.S. society and its institutions. We will examine the evolution of survey taking in the United States in the context of its economic, social and political uses. This will eventually lead to discussions about the accuracy and relevance of survey responses, especially in light of various kinds of nonsampling error. Students will be required to design, implement and analyze a survey sample.
Prerequisites: 36-208 or 36-202 or 36-309 or 36-220 or 36-226 or 36-326 or 70-208 or 36-236 or 36-218 Min. grade B
36-309 Experimental Design for Behavioral & Social Sciences
Fall and Summer: 9 units
This course focuses on the statistical aspects of the design and analysis stages of planned experiments. The design stage focuses on determining how experimental factors are allocated, the sample size necessary to achieve adequate statistical power, and how subjects/variables are measured. The analysis stage focuses on how data are collected and which statistical models are most appropriate to answer the research questions of interest. Although students will have to do some computer programming to implement these statistical techniques, the most important aspect of the course will be interpreting the results of analyses (e.g., whether a given analysis is appropriate, to what extent that analysis can answer research questions of interest, and the broader implications of an analysis within the context of the experiment). In addition to a weekly lecture, students will attend a computer lab once a week to get guidance and hands-on practice implementing statistical techniques we learn in class.
Prerequisites: 15-260 or 36-220 or 36-200 or 70-207 or 36-247 or 36-218 or 36-236 or 36-226 or 36-326
Course Website: http://www.stat.cmu.edu/academics/courselist
36-311 Statistical Analysis of Networks
Intermittent: 9 units
Networks are omnipresent. In this course, students will get an introduction to network science, mainly focusing on social network analysis. The course will start with some empirical background and an overview of concepts used when measuring and describing networks. We will also discuss network visualization. Most traditional models cannot be applied straightforwardly to social network data because of their complex dependence structure. We will discuss random graph models and statistical network models that have been developed for the study of network structure and growth. We will also cover models of how networks impact individual behavior.
Prerequisite: 36-226
36-313 Statistics of Inequality and Discrimination
Intermittent: 9 units
Many social questions about inequality, injustice and unfairness are, in part, questions about evidence, data, and statistics. This class lays out the statistical methods which let us answer questions like "Does this employer discriminate against members of that group?", "Is this standardized test biased against that group?", "Is this decision-making algorithm biased, and what does that even mean?" and "Did this policy which was supposed to reduce this inequality actually help?" We will also look at inequality within groups, and at different ideas about how to explain inequalities between groups. The class will interweave discussion of concrete social issues with the relevant statistical concepts.
Prerequisite: 36-202
36-315 Statistical Graphics and Visualization
All Semesters: 9 units
Graphical displays of quantitative information take on many forms, and they help us understand data and statistical methods by (hopefully) clearly communicating arguments, results, and ideas. This course introduces students to the most common forms of graphical displays and their uses and misuses. Ideally, graphs are designed according to three key elements: The data structure, the graph's audience, and the designer's intended message. Students will learn how to create well-designed graphs and understand them from a statistical perspective. Furthermore, the course will consider complex data structures that are becoming increasingly common in data visualizations (temporal, spatial, and text data); we will discuss common ways to process these data that make them easy to visualize. As time permits, we may also consider more advanced graphical methods (e.g., interactive graphics and computer-generated animations). In addition to two weekly lectures, there will be weekly computer labs and homework assignments where students use R to visualize and analyze real datasets. Along the way, students also make monthly Piazza posts discussing the strengths and weaknesses of a graph they found online, thereby critiquing real graphical designs found in the wild. The course culminates in a group final project, where students make public-facing data visualizations and analyses for a real dataset. All assignments will be in R; although this is not a programming class, using programming-based statistical software like R is essential to create modern-day graphics, and this class will give you practice using this kind of software. Throughout, communication skills (usually written or visual, but sometimes spoken) will play an important role. Indeed, if it's true that "a picture speaks a thousand words," then ideally the one thousand words you are communicating with your graphics are statistically correct, clear, and compelling.
Prerequisites: 36-225 or 36-309 or 15-259 or 36-202 or 36-235 or 36-208 or 36-219 or 36-218 or 70-208 or 21-325
36-318 Introduction to Causal Inference
Intermittent: 9 units
Many social science and scientific inquiries can be framed as causal questions. Does a new cancer treatment cause a reduction in mortality? Do financial grants cause students to do better in college? Does a new public policy cause an increase in voter turnout? When tackling these questions, we frequently come across the phrase "correlation does not imply causation." If that's the case, then what does imply causation? In this course, we will discuss causal inference methods for measuring causal effects of different interventions (e.g., drug treatments, financial grants, and public policies). First, we will discuss how experiments, where interventions are randomized among subjects, can imply causation when an appropriate experimental design and statistical analysis are used. Then, we will discuss how observational studies, where interventions are not randomized, can also imply causation when approaches like propensity score methods, matching, and doubly robust estimation are employed. Finally, we will discuss instrumental variables and regression discontinuity designs, which are frequently used in medicine and public policy for establishing causal inferences. Throughout we will use R to conduct causal analyses. A working knowledge of regression is encouraged, but regression will also be discussed and taught during much of the course.
Prerequisites: 36-219 Min. grade C or 36-225 Min. grade C or 36-235 Min. grade C or 36-218 Min. grade C or 15-259 Min. grade C or 21-325 Min. grade C
36-326 Mathematical Statistics (Honors)
Spring: 9 units
This course is a rigorous introduction to the mathematical theory of statistics. A good working knowledge of calculus and probability theory is required. Topics include maximum likelihood estimation, confidence intervals, hypothesis testing, Bayesian methods, and regression. A grade of C or better is required in order to advance to 36-401, 36-402 or any 36-46x course. Not open to students who have received credit for 36-625. Prerequisites: 15-359 or 21-325 or 36-217 or 36-225 with a grade of A AND advisor approval. Students interested in the course should add themselves to the waitlist pending review.
Prerequisites: 15-359 Min. grade A or 36-225 Min. grade A or 21-325 Min. grade A or 36-217 Min. grade A or 36-218 Min. grade A
36-350 Statistical Computing
All Semesters: 9 units
Statistical Computing is a one-semester course that will introduce you to the fundamentals of computational data analysis, as carried out in the R programming language, and to the fundamentals of working with relational databases, such as SQLite. No previous knowledge of either is required.
Prerequisites: 36-235 Min. grade C or 15-259 Min. grade C or 36-217 Min. grade C or 36-219 Min. grade C or 21-325 Min. grade C or 36-218 Min. grade C or 36-225 Min. grade C
36-400 Introduction to Statistical Modeling and Learning
Spring: 9 units
This course is a high-level introduction both to fundamental concepts of probability and statistics and to the ways by which statisticians go about approaching and analyzing data. The course will cover data processing, exploratory data analysis, parameter estimation and hypothesis testing, clustering, and common regression and classification models. Students will carry out work using the R and Python programming languages. This course is open only to students completing the Data Science in Society minor.
36-401 Modern Regression
Fall: 9 units
This course is an introduction to the real world of statistics and data analysis using linear regression modeling. We will explore real data sets, examine various models for the data, assess the validity of their assumptions, and determine which conclusions we can make (if any). We will use the R programming language to implement our analyses and produce graphs and tables of results. Data analysis is a bit of an art; there may be several valid approaches. We will strongly emphasize the importance of critical thinking about the data and the question of interest. Our overall goal is to use data and a basic set of modeling tools to answer substantive questions, and to present the results in a scientific report.
Prerequisites: (36-236 Min. grade C or 36-326 Min. grade C or 36-226 Min. grade C or 36-218 Min. grade B) and (21-241 or 21-240 or 21-242)
36-402 Advanced Methods for Data Analysis
Spring: 9 units
This course introduces modern methods of data analysis, building on the theory and application of linear models from 36-401. Topics include nonlinear regression, nonparametric smoothing, density estimation, generalized linear and generalized additive models, simulation and predictive model-checking, cross-validation, bootstrap uncertainty estimation, multivariate methods including factor analysis and mixture models, and graphical models and causal inference. Students will analyze real-world data from a range of fields, coding small programs and writing reports.
Prerequisite: 36-401 Min. grade C
36-410 Introduction to Probability Modeling
Spring: 9 units
An introductory-level course in stochastic processes. Topics typically include Poisson processes, Markov chains, birth and death processes, random walks, recurrent events, and renewal theory. Examples are drawn from reliability theory, queuing theory, inventory theory, and various applications in the social and physical sciences.
Prerequisites: 36-225 or 36-217 or 21-325 or 36-235
36-460 Special Topics: Sports Analytics
Spring: 9 units
This course introduces students to fundamental topics in sports analytics and the relevant statistical methods for tackling problems in this growing area. The first half of the course will cover foundational topics in sports analytics including models for the expected value of game states, win probability, team ratings, and hierarchical models for player evaluation. The second half of the course will focus on spatio-temporal methods appropriate for modeling complex player-tracking data. The focus is on understanding the foundations of the considered methods and introducing software for implementation. Students will develop their own sports analytics project using techniques covered in the course for their final assessment.
Prerequisite: 36-401 Min. grade C
36-461 Special Topics: Statistical Methods in Epidemiology
Intermittent: 9 units
Epidemiology is concerned with understanding factors that cause, prevent, and reduce diseases by studying associations between disease outcomes and their suspected determinants in human populations. Epidemiologic research requires an understanding of statistical methods and design. Epidemiologic data is typically discrete, i.e., data that arise whenever counts are made instead of measurements. In this course, methods for the analysis of categorical data are discussed with the purpose of learning how to apply them to data. The central statistical themes are building models, assessing fit and interpreting results. There is a special emphasis on generating and evaluating evidence from observational studies. Case studies and examples will be primarily from the public health sciences.
Prerequisite: 36-401 Min. grade C

Course Website: http://coursecatalog.web.cmu.edu/schools-colleges/dietrichcollegeofhumanitiesandsocialsciences/depar
36-462 Special Topics: Methods of Statistical Learning
Intermittent: 9 units
Data mining is the science of discovering patterns and learning structure in large data sets. Covered topics include information retrieval, clustering, dimension reduction, regression, classification, and decision trees.
Prerequisite: 36-401 Min. grade C

Course Website: http://www.stat.cmu.edu/academics/courselist
36-463 Special Topics: Multilevel and Hierarchical Models
Intermittent: 9 units
Multilevel and hierarchical models are among the most broadly applied "sophisticated" statistical models, especially in the social and biological sciences. They apply to situations in which the data "cluster" naturally into groups of units that are more related to each other than they are to the rest of the data. In the first part of the course we will review linear and generalized linear models. In the second part we will see how to generalize these to multilevel and hierarchical models and relate them to other areas of statistics, and in the third part of the course we will learn how Bayesian statistical methods can help us to build, estimate and diagnose problems with these models using a variety of data sets and examples.
Prerequisite: 36-401 Min. grade C

Course Website: http://www.stat.cmu.edu/academics/courselist
36-464 Special Topics: Psychometrics: A Statistical Modeling Approach
Intermittent: 9 units
Much of the social, educational, policy, and professional worlds involves measuring the skills, abilities, attitudes, decision-making, etc. of people, from SATs and GREs for school to 360-evaluations in business. This is the field of modern psychometrics, and it involves (at least) two kinds of craft: designing good sets of questions, and designing and fitting statistical models that extract the information we want from the responses to those questions. In this course we will touch on both kinds of craft, but we will concentrate on the second: what do statistical models for psychometric data look like, and how can we design, fit, and use them in practice? We will look at these models from a variety of statistical perspectives, but we will concentrate on the applied Bayesian point of view.
Prerequisite: 36-401 Min. grade C

Course Website: http://www.stat.cmu.edu/academics/courselist
36-465 Special Topics: Conceptual Foundations of Statistical Learning
Intermittent: 9 units
This class is an introduction to the foundations of statistical learning theory and its uses in designing and analyzing machine-learning systems. Statistical learning theory studies how to fit predictive models to training data, usually by solving an optimization problem, in such a way that the model will predict well, on average, on new data. The course will focus on the key concepts and theoretical tools, at a level accessible to students who have taken 36-401 and its prerequisites. The course will also illustrate those concepts and tools by applying them to carefully selected kinds of machine learning systems (such as kernel machines). Students wanting exposure to a broad range of algorithms and applications would be better served by 36-462/662 ("Data Mining"). This class is for those who want a deeper understanding of the principles underlying all machine learning methods.
Prerequisite: 36-401 Min. grade C
36-466 Special Topics: Statistical Methods in Finance
Intermittent: 9 units
Financial econometrics is the interdisciplinary area where we use statistical methods and economic theory to address a wide variety of quantitative problems in finance. These include building financial models, testing financial economics theory, simulating financial systems, volatility estimation, risk management, capital asset pricing, derivative pricing, portfolio allocation, proprietary trading, portfolio and derivative hedging, and so on. Financial econometrics is an active field that integrates finance, economics, probability, statistics, and applied mathematics. Financial activities generate many new problems and products, economics provides a useful theoretical foundation and guidance, and quantitative methods such as statistics, probability and applied mathematics are essential tools to solve quantitative problems in finance. Professionals in finance now routinely use sophisticated statistical techniques and modern computation power in portfolio management, proprietary trading, derivative pricing, financial consulting, securities regulation, and risk management.
Prerequisite: 36-401
36-467 Special Topics: Data over Space & Time
Intermittent: 9 units
This course is an introduction to the opportunities and challenges of analyzing data from processes unfolding over space and time. It will cover basic descriptive statistics for spatial and temporal patterns; linear methods for interpolating, extrapolating, and smoothing spatio-temporal data; basic nonlinear modeling; and statistical inference with dependent observations. Class work will combine practical exercises in R, a little mathematics on the underlying theory, and case studies analyzing real problems from various fields (economics, history, meteorology, ecology, etc.). Depending on available time and class interest, additional topics may include: statistics of Markov and hidden-Markov (state-space) models; statistics of point processes; simulation and simulation-based inference; agent-based modeling; dynamical systems theory.
Prerequisite: 36-401 Min. grade C

Course Website: http://coursecatalog.web.cmu.edu/schools-colleges/dietrichcollegeofhumanitiesandsocialsciences/depar
36-468 Special Topics: Text Analysis
Intermittent: 9 units
The analysis of language is concerned with how variables relate to people (their gender, age, and location, for example), how variables relate to use (such as writing in different academic disciplines), and how variables change over time. While we are surrounded by data that might potentially shed light on many of these questions, working with real-world linguistic data can present some unique challenges in sampling, in the distribution of features, and in their high dimensionality. In this course, we work through some of these issues, paying particular attention to the aligning of the statistical questions we want to investigate with the choice of statistical models, as well as focusing on the interpretation of results. Analysis will be carried out in R and students will develop a suite of tools as they work through their course projects.
Prerequisites: 36-218 Min. grade B or 36-236 Min. grade C or 36-226 Min. grade C
36-469 Special Topics: Statistical Genomics and High Dimensional Inference
Intermittent: 9 units
The field of computational and statistical genomics focuses on developing and applying computationally efficient and statistically robust methods to sort through increasingly rich and massive genome-wide data sets to identify complex genetic patterns, gene interactions, and disease associations. Because the genome is vast, analysis requires high-dimensional statistical approaches such as multiple testing, dimension reduction techniques, regularization and high-dimensional regression analysis, best linear unbiased prediction models, networks and graphical models. In this course, we will motivate these topics using data obtained from the human genetic and genomic literature. No prior knowledge in biology is required.
Prerequisite: 36-401 Min. grade C
36-471 Special Topics: Networks
Fall: 9 units
TBD
Prerequisite: 36-401 Min. grade C
36-490 Undergraduate Research
Fall and Spring: 9 units
This course is designed to give undergraduate students experience using statistics in real research problems. Small groups of students are matched with clients and do supervised research for a semester. From an academic perspective, the course presents an opportunity for students to gain skills in approaching a research problem, critical thinking, and statistical analyses. Additionally, the course will help students develop the professional skills necessary for successfully navigating team-based project delivery roles. Client-facing and collaborative skills will be emphasized within a team setting, and students will learn leading practices for engaging stakeholders as well as gain a conceptual understanding of leading practices for project delivery.
36-493 Sports Analytics Capstone
Intermittent: 9 units
This course is designed to give undergraduate students experience applying statistics & data science methodology to research problems in sports analytics. Small groups of students will be matched with clients in the Carnegie Mellon Athletics Department and do supervised projects for a semester. Students will gain skills in approaching a real-world problem, critical thinking, advanced statistical analysis, scientific writing, collaboration with clients, communicating results, and meeting expectations with respect to deliverables and timelines. The projects will change and rotate each semester. The course size is limited, and students will submit an application including their project preferences. Students with skill sets matching project needs will be given priority. We will also take into consideration whether or not a student has had a recent prior data science experience with the goal of providing experiences to a broad group of qualified students. Students do not need to be experts in sports analytics or have extensive knowledge in sports.
36-497 Corporate Capstone Project
Fall and Spring: 9 units
This course is designed to give undergraduate students experience applying statistics & data science methodology to real industry projects. Small groups of students will be matched with industry clients and do supervised projects for a semester. From an academic perspective, the course presents an opportunity for students to gain skills in approaching a research problem, critical thinking, and statistical analyses. Additionally, the course will help students develop the professional skills necessary for successfully navigating team-based project delivery roles. Client-facing and collaborative skills will be emphasized within a team setting, and students will learn leading practices for engaging stakeholders as well as gain a conceptual understanding of leading practices for project delivery. The industry clients will change and rotate each semester; available projects will be advertised prior to the first week of class. The course size is limited; students apply the previous semester and are placed on the course waitlist until project matching is performed. Students with skill sets matching project needs will be given priority. We will also take into consideration whether or not a student has had a recent prior corporate capstone experience with the goal of providing experiences to a broad group of qualified students. Note that there is no guarantee a waitlisted student will be matched to a project in any given semester.
36-498 Corporate Capstone II
Fall and Spring
This course allows students to continue work on projects begun as part of 36-497, Corporate Capstone Project. Enrollment is at the discretion of the external advisor for the 36-497 project and the Department of Statistics & Data Science.
36-700 Probability and Mathematical Statistics
Fall: 12 units
This is a one-semester course covering the basics of statistics. We will first provide a quick introduction to probability theory, and then cover fundamental topics in mathematical statistics such as point estimation, hypothesis testing, asymptotic theory, and Bayesian inference. If time permits, we will also cover more advanced and useful topics including nonparametric inference, regression and classification. Prerequisites: one- and two-variable calculus and matrix algebra. Graduate students in degree-seeking programs are given priority.

Faculty

SIVARAMAN BALAKRISHNAN, Associate Professor – Ph.D., Carnegie Mellon; Carnegie Mellon, 2015–

ELI BEN-MICHAEL, Assistant Professor, Joint With Heinz College – Ph.D., University of California; Carnegie Mellon, 2022–

ZACHARY BRANSON, Assistant Teaching Professor – Ph.D., Harvard University; Carnegie Mellon, 2019–

DAVID CHOI, Associate Professor of Statistics and Information Systems – Ph.D., Stanford University; Carnegie Mellon, 2004–

ALEXANDRA CHOULDECHOVA, Estella Loomis McCandless Assistant Professor of Statistics and Public Policy – Ph.D., Stanford University; Carnegie Mellon, 2014–

REBECCA DOERGE, Dean of Mellon College of Science, Professor of Statistics – Ph.D., North Carolina State University; Carnegie Mellon, 2016–

PETER E. FREEMAN, Associate Teaching Professor; Director of Undergraduate Studies – Ph.D., University of Chicago; Carnegie Mellon, 2004–

CHRISTOPHER R. GENOVESE, Professor – Ph.D., University of California; Carnegie Mellon, 1994–

JOEL B. GREENHOUSE, Professor – Ph.D., University of Michigan; Carnegie Mellon, 1982–

AMELIA HAVILAND, Anna Loomis McCandless Professor of Statistics and Public Policy – Ph.D., Carnegie Mellon University; Carnegie Mellon, 2003–

JIASHUN JIN, Professor – Ph.D., Stanford University; Carnegie Mellon, 2007–

ROBERT E. KASS, Maurice Falk Professor of Statistics & Computational Neuroscience – Ph.D., University of Chicago; Carnegie Mellon, 1981–

EDWARD KENNEDY, Associate Professor – Ph.D., University of Pennsylvania; Carnegie Mellon, 2016–

ARUN KUCHIBHOTLA, Assistant Professor – Ph.D., University of Pennsylvania; Carnegie Mellon, 2020–

MIKAEL KUUSELA, Assistant Professor – Ph.D., Ecole Polytechnique Federale de Lausanne; Carnegie Mellon, 2018–

ANN LEE, Professor, Co-Director of PhD program – Ph.D., Brown University; Carnegie Mellon, 2005–

JING LEI, Professor – Ph.D., University of California; Carnegie Mellon, 2011–

ROBIN MEJIA, Assistant Research Professor – Ph.D., University of California; Carnegie Mellon, 2018–

GONZALO E. MENA, Assistant Professor – Ph.D., Columbia University; Carnegie Mellon, 2023–

DANIEL NAGIN, Teresa and H. John Heinz III Professor of Public Policy – Ph.D., Carnegie Mellon University; Carnegie Mellon, 1976–

MATEY NEYKOV, Associate Professor – Ph.D., Harvard University; Carnegie Mellon, 2017–

NYNKE NIEZINK, Assistant Professor – Ph.D., University of Groningen; Carnegie Mellon, 2017–

REBECCA NUGENT, Department Head, Stephen E. and Joyce Fienberg Professor of Statistics & Data Science – Ph.D., University of Washington; Carnegie Mellon, 2006–

AADITYA RAMDAS, Assistant Professor – Ph.D., Carnegie Mellon; Carnegie Mellon, 2018–

ALEX REINHART, Assistant Teaching Faculty – Ph.D., Carnegie Mellon University; Carnegie Mellon, 2018–

KATHRYN ROEDER, UPMC Professor of Statistics and Life Sciences – Ph.D., Pennsylvania State University; Carnegie Mellon, 1994–

CHAD M. SCHAFER, Professor – Ph.D., University of California, Berkeley; Carnegie Mellon, 2004–

TEDDY SEIDENFELD, Herbert A. Simon Professor of Philosophy and Statistics – Ph.D., Columbia University; Carnegie Mellon, 1985–

COSMA SHALIZI, Associate Professor – Ph.D., University of Wisconsin, Madison; Carnegie Mellon, 2005–

WEIJING TANG, Assistant Professor – Ph.D., University of Michigan; Carnegie Mellon, 2023–

WILL TOWNES, Assistant Professor – Ph.D., Harvard University; Carnegie Mellon, 2022–

VALERIE VENTURA, Professor, Co-Director of PhD program – Ph.D., University of Oxford; Carnegie Mellon, 1997–

ISABELLA VERDINELLI, Professor in Residence – Ph.D., Carnegie Mellon University; Carnegie Mellon, 1991–

LARRY WASSERMAN, UPMC Professor of Statistics – Ph.D., University of Toronto; Carnegie Mellon, 1988–

RON YURKO, Assistant Teaching Professor – Ph.D., Carnegie Mellon; Carnegie Mellon, 2022–

Emeriti Faculty

GEORGE T. DUNCAN, Professor of Statistics and Public Policy – Ph.D., University of Minnesota; Carnegie Mellon, 1974–

WILLIAM F. EDDY, John C. Warner Professor of Statistics – Ph.D., Yale University; Carnegie Mellon, 1976–

BRIAN JUNKER, Professor – Ph.D., University of Illinois; Carnegie Mellon, 1990–

JOSEPH B. KADANE, Leonard J. Savage Professor of Statistics and Social Sciences – Ph.D., Stanford University; Carnegie Mellon, 1969–

JOHN P. LEHOCZKY, Thomas Lord Professor of Statistics – Ph.D., Stanford; Carnegie Mellon, 1969–

MARK J. SCHERVISH, Professor – Ph.D., University of Illinois; Carnegie Mellon, 1979–

DALENE STANGL, Teaching Professor – Ph.D., Carnegie Mellon University; Carnegie Mellon, 2017–

Special Faculty

PHILIPP BURCKHARDT, Director of e-Learning, Analytics, and Technology – Ph.D., Carnegie Mellon; Carnegie Mellon, 2022–

F. SPENCER KOERNER, Lecturer – Ph.D., Carnegie Mellon; Carnegie Mellon, 2022–

JAMIE MCGOVERN, Director: Master of Statistical Practice Program – B.A., Rice University; Carnegie Mellon, 2020–

GORDON WEINBERG, Senior Lecturer – M.A., University of Pittsburgh; Carnegie Mellon, 2004–

Affiliated Faculty

ANTHONY BROCKWELL – Ph.D., Melbourne University; Carnegie Mellon, 1999–

BERNIE DEVLIN – Ph.D., Pennsylvania State University; Carnegie Mellon, 1994–

TAEYONG PARK, Assistant Teaching Professor – Ph.D., Washington University in St. Louis; Carnegie Mellon, 2018–

ALESSANDRO RINALDO, Professor – Ph.D., Carnegie Mellon; Carnegie Mellon, 2005–

SAM VENTURA – Ph.D., Carnegie Mellon University; Carnegie Mellon, 2015–
