Department Head: Robert F. Murphy, PhD (GHC 7725)
Assistant Dept. Head for Education: Phillip Compeau, PhD (GHC 7403)
Academic Program Manager: Nicole Stenger (GHC 7414)
http://cbd.cmu.edu

Bachelor of Science in Computational Biology

Program Director: Dr. Phillip Compeau
Program Manager: Nicole Stenger

Success in computational biology requires significant technical knowledge of fundamental computer science as well as a broad biological intuition and general understanding of experimental biology.  However, most importantly, it requires students who can integrate their knowledge by making connections between the two fields.

There is significant industry demand for excellent computational biology students in biotech firms, biomedical research, and pharmaceutical research.  Both established companies and startups struggle to find employees with the right skill set. Our students will be able to take advantage of the fact that an undergraduate computational biology major provides the rigorous training required to handle the challenges of modern research, training that is not provided by any of our peer institutions.

Students completing the undergraduate program in computational biology will also be ideally prepared for Ph.D. programs in any of a range of biomedical areas, including Computational Biology, Systems Biology, or Quantitative Biology. Students who complete pre-medical requirements will be very well-prepared to attend medical school; after all, the next generation of physicians will need to better understand the computational approaches needed for automated medical testing, automated medical imaging, and the coming personalized medicine revolution.

Degree Requirements (students entering Fall 2017)

Students completing the Bachelor of Science in Computational Biology follow certain policies that apply to all SCS students; please consult the SCS policies page for a complete listing of these expectations.

Students must complete a minimum of 360 units for the degree in computational biology.

Mellon College of Science students interested in computational biology who matriculated at Carnegie Mellon before fall 2017 should consult Previous Catalogs for degree requirements.

Math/Stats Core

21-120 Differential and Integral Calculus, 10 units
21-122 Integration and Approximation, 10 units
15-151 Mathematical Foundations for Computer Science (or 21-127 if not offered), 10 units
36-217 Probability Theory and Random Processes (does not require 21-259 as a prerequisite), 9-12 units
    or 15-359 Probability and Computing
    or 36-225 Introduction to Probability Theory
    or 21-325 Probability
36-226 Introduction to Statistical Inference (students taking 15-359 should take 36-326 instead), 9 units
    or 36-326 Mathematical Statistics (Honors)
Total Units: 48-51

General Science Core

09-105 Introduction to Modern Chemistry I, 10 units
    or 09-107 Honors Chemistry: Fundamentals, Concepts and Applications
33-121 Physics I for Science Students, 12 units
    or 33-141 Physics I for Engineering Students
Total Units: 22

Biological Core

03-121 Modern Biology, 9 units
03-220 Genetics, 9 units
03-232 Biochemistry I, 9 units
    or 03-231 Biochemistry I
    (Students taking 03-231, including pre-med students, will take organic chemistry as a prerequisite, which will satisfy a biology elective requirement.)
03-320 Cell Biology, 9 units
Total Units: 36

Computer Science Core 

15-128 Freshman Immigration Course, 1 unit
    (This course may be replaced by 03-201 or 03-202 if and only if 15-128 is not offered.)
99-101 Computing @ Carnegie Mellon, 3 units
    or 99-102 Computing @ Carnegie Mellon
15-122 Principles of Imperative Computation, 10 units
15-251 Great Ideas in Theoretical Computer Science, 12 units
15-351 Algorithms and Advanced Data Structures, 12 units
    or 15-451 Algorithm Design and Analysis
    (Students taking 15-150 and 15-210 as prerequisites for 15-451 may apply these courses as CS electives.)
10-401 Introduction to Machine Learning (Undergrad), 12 units
Total Units: 50

Computational Biology Core 

02-250 Introduction to Computational Biology, 12 units
02-261 Quantitative Cell and Molecular Biology Laboratory, 9-12 units
    or 03-343 Experimental Techniques in Molecular Biology
02-402 Computational Biology Seminar, 3 units
02-510 Computational Genomics, 12 units
02-512 Computational Methods for Biological Modeling and Simulation, 9-12 units
    or 02-530 Cell and Systems Modeling
Total Units: 45-51

Major Electives 

02-3xx Computational Biology Electives at 300 level or above, 18-24 units
    (Includes a few courses outside of 02-xxx, such as 03-500 if research is computational; the list of acceptable courses is updated annually.)
03-3xx Biology Electives at 300 level or above, 9-12 units
15-xxx Computer Science or 10-xxx Machine Learning Electives, 18-24 units
Total Units: 45-60

General Education (Humanities & Arts)

Expectations for Humanities & Arts courses are shared between the Computer Science and Computational Biology undergraduate programs.  For specific courses that may be used to satisfy each elective, please see the SCS General Education Requirements page.

76-101 Interpretation and Argument, 9 units
Elective: Cognition, Choice and Behavior, 9 units
Elective: Economics, Political and Social Institutions, 9 units
Elective: Cultural Analysis, 9 units
Non-technical Electives (x 3), 27 units
Total Units: 63

Free Electives 

A free elective is any Carnegie Mellon course. However, a maximum of nine (9) units of Physical Education and/or Military Science (ROTC) and/or Student-Led (StuCo) courses may be used toward fulfilling graduation requirements.

Free Electives, 27-54 units
Total Units: 27-54
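
As a rough illustration of how the category totals listed above relate to the 360-unit minimum, the following Python sketch adds up the unit ranges from each "Total Units" row. The names `REQUIRED_CATEGORIES`, `required_range`, and `free_units_to_reach` are hypothetical and chosen only for this example; this is not an official degree audit, and individual course choices will change the actual numbers.

```python
# Unit ranges (minimum, maximum) copied from the "Total Units" rows above.
# Free electives are handled separately because they fill the remaining gap.
REQUIRED_CATEGORIES = {
    "Math/Stats Core": (48, 51),
    "General Science Core": (22, 22),
    "Biological Core": (36, 36),
    "Computer Science Core": (50, 50),
    "Computational Biology Core": (45, 51),
    "Major Electives": (45, 60),
    "General Education": (63, 63),
}
DEGREE_MINIMUM = 360  # minimum units for the degree


def required_range(categories):
    """Return (minimum, maximum) units summed over all listed categories."""
    low = sum(lo for lo, _ in categories.values())
    high = sum(hi for _, hi in categories.values())
    return low, high


def free_units_to_reach(minimum, categories):
    """Return (fewest, most) free-elective units needed to reach `minimum`.

    The fewest units apply when every category is taken at its maximum,
    the most units when every category is taken at its minimum.
    """
    low, high = required_range(categories)
    return max(0, minimum - high), max(0, minimum - low)


if __name__ == "__main__":
    print("Required categories:", required_range(REQUIRED_CATEGORIES))
    print("Free elective units needed:",
          free_units_to_reach(DEGREE_MINIMUM, REQUIRED_CATEGORIES))
```

Under these figures, a student at the top of every range would still need 27 free-elective units to reach 360, which is consistent with the lower bound of the free elective range listed above.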

Sample Course Sequence

Please note that the course sequence below is simply a suggested guide to which courses may be appropriate in each term for students completing the undergraduate program in computational biology.  Individual students will have individual paths based on their backgrounds and needs.

Freshman Fall
21-120 Differential and Integral Calculus
15-122 Principles of Imperative Computation
15-128 Freshman Immigration Course
15-131 Great Practical Ideas for Computer Scientists
15-151 Mathematical Foundations for Computer Science
99-101 Computing @ Carnegie Mellon (or 99-102)
76-101 Interpretation and Argument

Freshman Spring
02-250 Introduction to Computational Biology
21-122 Integration and Approximation
03-121 Modern Biology
09-105 Introduction to Modern Chemistry I
xx-xxx Humanities and Arts Elective

Sophomore Fall
02-261 Quantitative Cell and Molecular Biology Laboratory
03-220 Genetics
33-121 Physics I for Science Students
15-351 Algorithms and Advanced Data Structures
xx-xxx Humanities and Arts Elective

Sophomore Spring
15-251 Great Ideas in Theoretical Computer Science
03-232 Biochemistry I
xx-xxx Humanities and Arts Elective
xx-xxx Free Elective

Junior Fall
02-512 Computational Methods for Biological Modeling and Simulation
36-217 Probability Theory and Random Processes
03-320 Cell Biology
xx-xxx Humanities and Arts Elective
xx-xxx Free Elective

Junior Spring
02-510 Computational Genomics
36-226 Introduction to Statistical Inference
10-401 Introduction to Machine Learning (Undergrad)
xx-xxx Humanities and Arts Elective
xx-xxx Free Elective

Senior Fall
02-402 Computational Biology Seminar
02-xxx Computational Biology Elective
03-xxx Biology Elective
15-xxx/10-xxx Computer Science/Machine Learning Elective
xx-xxx Free Elective

Senior Spring
02-xxx Computational Biology Elective
15-xxx/10-xxx Computer Science/Machine Learning Elective
xx-xxx Humanities and Arts Elective
xx-xxx Free Elective

Computational Biology Minor

Director: Ziv Bar-Joseph, PhD
Advisor: Phillip Compeau, PhD
Program Manager: Nicole Stenger

The computational biology minor is open to students in any major of any college at Carnegie Mellon.  The curriculum and course requirements are designed to maximize the participation of students from diverse academic disciplines. The program seeks to produce students with both basic computational skills and knowledge in biological sciences that are central to computational biology.

Students are encouraged to declare the minor as early as possible in their undergraduate careers and in all cases before their final semester so that the minor advisor can provide advice on their curriculum.

Why Minor in Computational Biology?

Computational Biology is concerned with solving biological and biomedical problems using mathematical and computational methods. It is recognized as an essential element in modern biological and biomedical research. There have been fundamental changes in biology and medicine over the past two decades due to spectacular advances in high-throughput data collection for genomics, proteomics and biomedical imaging. The resulting availability of unprecedented amounts of biological data demands the application of advanced computational tools to build integrated models of biological systems, and to use them to devise methods to prevent or treat disease. Computational Biologists inhabit and expand the interface of computation and biology, making them integral to the future of biology and medicine.

Computational Biology is a growing field not only in academia, but also in industry. Major players in computation and medicine have invested heavily in computational biology, including Google, Microsoft, Roche and Merck.

Policy on Double Counting

No more than two courses may be double counted with your major's core requirements. Courses in the minor may not be counted towards another SCS minor. Consult the minor advisor for more information.

Curriculum Overview

The minor in computational biology requires a total of five courses: 3 core courses, 1 biology elective, and 1 computational biology elective, for a total of at least 48 units.

Prerequisites

Students must take both of the following courses as prerequisites:
03-121 Modern Biology, 9-10 units
    or 03-151 Honors Modern Biology
15-122 Principles of Imperative Computation, 10 units

Core Classes 

Students must take both of the following courses:
02-250 Introduction to Computational Biology, 12 units
02-261 Quantitative Cell and Molecular Biology Laboratory, 9 units
    (03-343 Experimental Techniques in Molecular Biology may be substituted for 02-261 with permission of the minor advisor; 03-115 and 03-116 may be used to replace 02-261 if and only if the latter is not offered.)

Students must take one of the following courses:
02-510 Computational Genomics, variable units
02-512 Computational Methods for Biological Modeling and Simulation, 9 units
02-530 Cell and Systems Modeling, 12 units

Biology Elective

Please select one of the following courses:
03-231 Biochemistry I, 9 units
03-232 Biochemistry I, 9 units
03-320 Cell Biology, 9 units
03-327 Phylogenetics, 9 units
03-330 Genetics, 9 units
03-362 Cellular Neuroscience, 9 units
03-363 Systems Neuroscience, 9 units
03-364 Developmental Neuroscience, 9 units
03-439 Introduction to Biophysics, 9 units
03-442 Molecular Biology, 9 units
03-534 Biological Imaging and Fluorescence Spectroscopy, 9 units
42-202 Physiology, 9 units

Computational Biology Elective

Please select one of the following courses:
02-xxx Any listed 02-xxx course numbered 02-300 or above, 9-12 units
09-560 Computational Chemistry, 12 units
15-386 Neural Computation, 9 units
15-883 Computational Models of Neural Systems, 12 units
16-725 Medical Image Analysis, 12 units
42-640/24-658 Computational Bio-Modeling and Visualization, 12 units

Computational Biology Courses

Note on Course Numbers

Each Carnegie Mellon course number begins with a two-digit prefix that designates the department offering the course (76-xxx courses are offered by the Department of English, etc.). Although each department maintains its own course numbering practices, the first digit after the prefix typically indicates the class level: xx-1xx courses are freshman-level, xx-2xx courses are sophomore-level, etc. xx-6xx courses may be either undergraduate senior-level or graduate-level, depending on the department. xx-7xx courses and higher are graduate-level. Please consult the Schedule of Classes each semester for course offerings and for any necessary prerequisites or corequisites.
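
The numbering convention described above can be sketched in a few lines of Python. This is only an illustration: the function name `describe_course` and the level labels are hypothetical, and treating every first digit below 6 as "undergraduate" is an assumption extrapolated from the "etc." in the note, since departments maintain their own practices.

```python
import re

# A course number is a two-digit department prefix, a hyphen, and three digits.
COURSE_NUMBER = re.compile(r"^(\d{2})-(\d)(\d{2})$")


def describe_course(number):
    """Split a course number into (prefix, level digit, rough level label)."""
    match = COURSE_NUMBER.match(number)
    if match is None:
        raise ValueError(f"not a course number of the form xx-xxx: {number!r}")
    prefix = match.group(1)
    level_digit = int(match.group(2))
    if level_digit >= 7:
        label = "graduate"
    elif level_digit == 6:
        # The note says 6xx depends on the department.
        label = "senior undergraduate or graduate"
    else:
        # Assumption: digits below 6 are treated as undergraduate levels.
        label = "undergraduate"
    return prefix, level_digit, label


if __name__ == "__main__":
    for number in ("02-250", "42-640", "15-883"):
        print(number, describe_course(number))
```

For example, `describe_course("02-250")` reports prefix `02` with level digit 2, matching the sophomore-level reading of xx-2xx given in the note.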

02-201 Programming for Scientists
Fall and Spring: 10 units
Provides a practical introduction to programming for students with little or no prior programming experience who are interested in science. Fundamental scientific algorithms will be introduced, and extensive programming assignments will be based on analytical tasks that might be faced by scientists, such as parsing, simulation, and optimization. Principles of good software engineering will also be stressed. The course will introduce students to the Go programming language, an industry-supported, modern programming language, the syntax of which will be covered in depth. Other assignments will be given in other programming languages such as Python and Java to highlight the commonalities and differences between languages. No prior programming experience is assumed, and no biology background is needed. Analytical skills and mathematical maturity are required. Course not open to CS majors.
02-223 Personalized Medicine: Understanding Your Own Genome
Fall: 9 units
Do you want to know how to discover the tendencies hidden in your genome? Since the first draft of a human genome sequence became available at the start of this century, the cost of genome sequencing has decreased dramatically. Personal genome sequencing will likely become a routine part of medical exams for patients for prognostic and diagnostic purposes. Personal genome information will also play an increasing role in lifestyle choices, as people take into account their own genetic tendencies. Commercial services such as 23andMe have already taken first steps in this direction. Computational methods for mining large-scale genome data are being developed to unravel the genetic basis of diseases and assist doctors in clinics. This course introduces students to biological, computational, and ethical issues concerning use of personal genome information in health maintenance, medical practice, biomedical research, and policymaking. We focus on practical issues, using individual genome sequences (such as that of Nobel prize winner James Watson) and other population-level genome data. Without requiring any background in biology or computer science, we begin with an overview of topics from genetics, molecular biology, statistics, and machine learning relevant to the modern personal genome era. We then cover scientific issues such as how to discover your genetic ancestry and how to learn from genomes about migration and evolution of human populations. We discuss medical aspects such as how to predict whether you will develop diseases such as diabetes based on your own genome, how to discover disease-causing genetic mutations, and how genetic information can be used to recommend clinical treatments.
02-250 Introduction to Computational Biology
Spring: 12 units
This 12-unit class provides a general introduction to computational tools for biology. The course is divided into two modules. Module 1 covers computational molecular biology/genomics. It examines important sources of biological data, how they are archived and made available to researchers, and what computational tools are available to use them effectively in research. In the process, it covers basic concepts in statistics, mathematics, and computer science needed to effectively use these resources and understand their results. Specific topics covered include sequence data, searching and alignment, structural data, genome sequencing, genome analysis, genetic variation, gene and protein expression, and biological networks and pathways. Module 2 covers computational cell biology, including biological modeling and image analysis. It includes homework assignments requiring modification of scripts to perform computational analyses. The modeling component includes computer models of population dynamics, biochemical kinetics, cell pathways, neuron behavior, and stochastic simulations. The imaging component includes basics of machine vision, morphological image analysis, image classification and image-derived models. Lectures and examinations are joint with 03-250, but recitations are separate. Recitations for this course are intended primarily for computational biology majors as well as computer science, statistics or engineering majors at the undergraduate or graduate level who have had significant prior experience with computer science or programming. Students may not take both 02-250 and 03-250 for credit.
Prerequisites: 15-110 or 15-112 or 02-201
02-261 Quantitative Cell and Molecular Biology Laboratory
Fall and Spring: 9 units
This is an introductory laboratory-based course designed to teach basic biological laboratory skills used in exploring the quantitative nature of biological systems and the reasoning required for performing research in computational biology. Over the course of the semester, students will design and perform multiple modern experiments and quantitatively analyze the results of these experiments. During this course students will also have an opportunity to use techniques learned during the course to experimentally answer an open question. Designing the experiments will require students to think critically about the biological context of the experiments as well as the necessary controls to ensure interpretable experimental results. During this course students will gain experience in many aspects of scientific research, including sequencing DNA, designing and performing PCR for a variety of analyses, maintaining cell cultures, taking brightfield and fluorescent microscopy images, developing methods for automated analysis of cell images, and communicating results to peers and colleagues. As space is limited, laboratory sections will be small. Additional sections will be added to accommodate all students on the waitlist. Course outline: one 3-hour lab and one 1-hour lecture per week.
02-317 Algorithms in Nature
Intermittent: 9 units
Computer systems and biological processes often rely on networks of interacting entities to reach joint decisions, coordinate and respond to inputs. There are many similarities in the goals and strategies of biological and computational systems which suggest that each can learn from the other. These include the distributed nature of the networks (in biology molecules, cells, or organisms often operate without central control), the ability to successfully handle failures and attacks on a subset of the nodes, modularity and the ability to reuse certain components or sub-networks in multiple applications and the use of stochasticity in biology and randomized algorithms in computer science. In this course we will start by discussing classic biologically motivated algorithms including neural networks (inspired by the brain), genetic algorithms (sequence evolution), non-negative matrix factorization (signal processing in the brain), and search optimization (ant colony formation). We will then continue to discuss more recent bi-directional studies that have relied on biological processes to solve routing and synchronization problems, discover Maximal Independent Sets (MIS), and design robust and fault tolerant networks. In the second part of the class students will read and present new research in this area. Students will also work in groups on a final project in which they develop and test a new biologically inspired algorithm. No prior biological knowledge required.
Prerequisites: 15-210 and 15-251
Course Website: http://www.algorithmsinnature.org
02-319 Genomics and Epigenetics of the Brain
Fall: 9 units
This course will provide an introduction to genomics, epigenetics, and their application to problems in neuroscience. The rapid advances in genomic technology are in the process of revolutionizing how we conduct molecular biology research. These new techniques have given us an appreciation for the role that epigenetic modifications of the genome play in gene regulation, development, and inheritance. In this course, we will cover the biological basis of genomics and epigenetics, the basic computational tools to analyze genomic data, and the application of those tools to neuroscience. Through programming assignments and reading primary literature, the material will also serve to demonstrate important concepts in neuroscience, including the diversity of neural cell types, neural plasticity, the role that epigenetics plays in behavior, and how the brain is influenced by neurological and psychiatric disorders. Although the course focuses on neuroscience, the material is accessible and applicable to a wide range of topics in biology. 02-250 is a suggested prerequisite.
Prerequisites: (03-151 or 03-121) and 03-220 and (15-110 or 02-201 or 15-121)
02-402 Computational Biology Seminar
Fall and Spring: 3 units
This course consists of weekly invited presentations on current computational biology research topics by leading scientists. Students will be expected to digest what they have learned in the seminar by writing short summaries on each speaker's topic.
02-403 Special Topics in Bioinformatics and Computational Biology
Intermittent: 6 units
A decade ago, mass spectrometry (MS) was merely a qualitative research technique allowing the analysis of samples regarding the presence of specific biomolecules. However, as MS has turned quantitative, more sophisticated experiments can be performed, such as the recording of signal transduction kinetics and the analysis of the composition of protein complexes and organelles. This makes MS-based proteomics a powerful method to study spatiotemporal protein dynamics. The development of relative quantification approaches, which generally use 2H, 13C or 15N isotope labels, has especially led to an increase in quantification accuracy and set off numerous new experimental approaches to study protein regulation. In this mini-course, we will cover mass spectrometry principles, discuss classical as well as current primary literature addressing method development and quantitative analysis, and highlight state-of-the-art biological studies that employ MS. A combination of lectures, student presentations, and written exercises will establish a thorough knowledge of current bio-analytical MS approaches.
Prerequisites: (02-250 Min. grade C or 03-250 Min. grade C) and 03-121 Min. grade C
02-421 Algorithms for Computational Structural Biology
Intermittent: 12 units
Some of the most interesting and difficult challenges in computational biology and bioinformatics arise from the determination, manipulation, or exploitation of molecular structures. This course will survey these challenges and present a variety of computational methods for addressing them. Topics will include: molecular dynamics simulations, computer-aided drug design, and computer-aided protein design. The course is appropriate for both students with backgrounds in computer science and those in the life sciences.
02-425 Computational Methods for Proteomics and Metabolomics
Spring: 9 units
Proteomics and metabolomics are the large-scale study of proteins and metabolites, respectively. In contrast to genomes, proteomes and metabolomes vary with time and the specific stress or conditions an organism is under. Applications of proteomics and metabolomics include determination of protein and metabolite functions (including in immunology and neurobiology) and discovery of biomarkers for disease. These applications require advanced computational methods to analyze experimental measurements, create models from them, and integrate with information from diverse sources. This course specifically covers computational mass spectrometry, structural proteomics, proteogenomics, metabolomics, genome mining and metagenomics.
Prerequisites: 02-250 or 02-604
02-450 Automation of Biological Research: Robotics and Machine Learning
Fall: 9 units
Biology is increasingly becoming a "big data" science, as biomedical research has been revolutionized by automated methods for generating large amounts of data on diverse biological processes. Integration of data from many types of experiments is required to construct detailed, predictive models of cell, tissue or organism behaviors, and the complexity of the systems suggests that these models need to be constructed automatically. This requires iterative cycles of acquisition, analysis, modeling, and experimental design, since it is not feasible to do all possible biological experiments. This course will cover a range of automated biological research methods and a range of computational methods for automating the acquisition and interpretation of the data (especially active learning, proactive learning, compressed sensing and model structure learning). Grading will be based on class participation, homeworks, and a final project. The course is designed for graduate and upper-level undergraduate students with a wide variety of backgrounds. The course is intended to be self-contained but students may need to do some additional work to gain fluency in core concepts. Students should have a basic knowledge of biology, statistics, and programming. Experience with Machine Learning is useful but not mandatory.
Prerequisites: 15-122 and 10-401
Course Website: https://sites.google.com/site/automationofbiologicalresearch/
02-499 Independent Study in Computational Biology
Fall and Spring
The student will, under the individual guidance of a faculty member, read and digest process papers or a textbook in an advanced area of computational biology not offered by an existing course at Carnegie Mellon. The student will demonstrate their mastery of the material by a combination of one or more of the following: oral discussions with the faculty member; exercises set by the faculty member accompanying the readings; and a written summary synthesizing the material that the student learned. Permission required.
02-500 Undergraduate Research in Computational Biology
Fall and Spring
This course is for undergraduate students who wish to do supervised research for academic credit with a Computational Biology faculty member. Interested students should first contact the professor with whom they would like to work. If there is mutual interest, the professor will direct the student to the Academic Programs Coordinator and the Assistant Department Head for Education.
02-510 Computational Genomics
Fall and Spring
Dramatic advances in experimental technology and computational analysis are fundamentally transforming the basic nature and goal of biological research. The emergence of new frontiers in biology, such as evolutionary genomics and systems biology, is demanding new methodologies that can confront quantitative issues of substantial computational and mathematical sophistication. In this course we will discuss classical approaches and the latest methodological advances in the context of the following biological problems: 1) sequence analysis, focusing on gene finding and motif detection, 2) analysis of high-throughput molecular data, such as gene expression data, including normalization, clustering, pattern recognition and classification, 3) molecular and regulatory evolution, focusing on phylogenetic inference and regulatory network evolution, 4) population genetics, focusing on how genomes within a population evolve through recombination, mutation, and selection to create various structures in modern genomes, and 5) systems biology, concerning how to combine diverse data types to make mechanistic inferences about biological processes. From the computational side, this course focuses on modern machine learning methodologies for computational problems in molecular biology and genetics, including probabilistic modeling, inference and learning algorithms, data integration, time series analysis, active learning, etc. This course may be taken for 12 units, which requires completion of a course project, or for 9 units, which does not.
Prerequisites: 15-122 Min. grade C and (15-151 Min. grade C or 21-127 Min. grade C or 21-128 Min. grade C)
02-512 Computational Methods for Biological Modeling and Simulation
Fall: 9 units
This course covers a variety of computational methods important for modeling and simulation of biological systems. It is intended for graduates and advanced undergraduates with either biological or computational backgrounds who are interested in developing computer models and simulations of biological systems. The course will emphasize practical algorithms and algorithm design methods drawn from various disciplines of computer science and applied mathematics that are useful in biological applications. The general topics covered will be models for optimization problems, simulation and sampling, and parameter tuning. Course work will include problem sets with significant programming components and independent or group final projects.
Prerequisites: 02-201 or 15-112 or 15-110
02-514 String Algorithms
Fall: 12 units
Provides an in-depth look at modern algorithms used to process string data, particularly those relevant to genomics. The course will cover the design and analysis of efficient algorithms for processing enormous collections of strings. Topics will include string search; inexact matching; string compression; string data structures such as suffix trees, suffix arrays, and searchable compressed indices; and the Burrows-Wheeler transform. Applications of these techniques in biology will be presented, including genome assembly, transcript assembly, whole-genome alignment, gene expression quantification, read mapping, and search of large sequence databases. No knowledge of biology is assumed, and the topics covered will be of use in other fields involving large collections of strings. Programming proficiency is required.
Prerequisite: 15-251
02-518 Computational Medicine
Spring: 12 units
Modern medical research increasingly relies on the analysis of large patient datasets to enhance our understanding of human diseases. This course will focus on the computational problems that arise from studies of human diseases and the translation of research to the bedside to improve human health. The topics to be covered include computational strategies for advancing personalized medicine, pharmacogenomics for predicting individual drug responses, metagenomics for learning the role of the microbiome in human health, mining electronic medical records to identify disease phenotypes, and case studies in complex human diseases such as cancer and asthma. We will discuss how machine learning methodologies such as regression, classification, clustering, semi-supervised learning, probabilistic modeling, and time-series modeling are being used to analyze a variety of datasets collected by clinicians. Class sessions will consist of lectures, discussions of papers from the literature, and guest presentations by clinicians and other domain experts. Grading will be based on homework assignments and a project. 02-250 is a suggested pre-requisite.
Prerequisites: 10-401 or 10-601
02-530 Cell and Systems Modeling
Fall: 12 units
This course will introduce students to the theory and practice of modeling biological systems from the molecular to the organism level with an emphasis on intracellular processes. Topics covered include kinetic and equilibrium descriptions of biological processes, systematic approaches to model building and parameter estimation, analysis of biochemical circuits modeled as differential equations, modeling the effects of noise using stochastic methods, modeling spatial effects, and modeling at higher levels of abstraction or scale using logical or agent-based approaches. A range of biological models and applications will be considered including gene regulatory networks, cell signaling, and cell cycle regulation. Weekly lab sessions will provide students hands-on experience with methods and models presented in class. Course requirements include regular class participation, bi-weekly homework assignments, a take-home exam, and a final project. Prerequisites: The course is designed for graduate and upper-level undergraduate students with a wide variety of backgrounds. The course is intended to be self-contained but students may need to do some additional work to gain fluency in core concepts. Students should have a basic knowledge of calculus, differential equations, and chemistry as well as some previous exposure to molecular biology and biochemistry. Experience with programming and numerical computation is useful but not mandatory. Laboratory exercises will use MATLAB as the primary modeling and computational tool augmented by additional software as needed.
Prerequisites: (03-121 or 03-151 or 33-121) and (03-231 or 03-232) and 21-112 and 09-105
02-601 Programming for Scientists
Fall and Spring: 12 units
Provides a practical introduction to programming for students with little or no prior programming experience who are interested in science. Fundamental scientific algorithms will be introduced, and extensive programming assignments will be based on analytical tasks that might be faced by scientists, such as parsing, simulation, and optimization. Principles of good software engineering will also be stressed, and students will have the opportunity to design their own programming project on a scientific topic of their choice. The course will introduce students to the Go programming language, an industry-supported, modern programming language, the syntax of which will be covered in depth. Other assignments will be given in other programming languages such as Python and Java to highlight the commonalities and differences between languages. No prior programming experience is assumed, and no biology background is needed. Analytical skills and mathematical maturity are required. Course not open to CS majors.
02-602 Professional Issues in Computational Biology
Fall and Spring: 1 unit
This course gives MS in Computational Biology students an opportunity to develop professional skills necessary for a successful career in computational biology. This course will include assistance with resume writing, interview preparation, presentation skills, and job search techniques. This course will also include opportunities to network with computational biology professionals and academic researchers. This course will meet once a week. This course is pass/fail only. Grading scheme will be discussed on first day of class.
02-604 Fundamentals of Bioinformatics
Spring: 12 units
How do we find potentially harmful mutations in your genome? How can we reconstruct the Tree of Life? How do we compare similar genes from different species? These are just three of the many central questions of modern biology that can only be answered using computational approaches. This 12-unit course will delve into some of the fundamental computational ideas used in biology and let students apply existing resources that are used in practice every day by thousands of biologists. The course offers an opportunity for students who possess an introductory programming background to become more experienced coders within a biological setting. As such, it presents a natural next course for students who have completed 02-601. 02-250 is a suggested pre-requisite for undergraduates.
02-613 Algorithms and Advanced Data Structures
Fall and Spring: 12 units
The objective of this course is to study algorithms for general computational problems, with a focus on the principles used to design those algorithms. Efficient data structures will be discussed to support these algorithmic concepts. Topics include: Run time analysis, divide-and-conquer algorithms, dynamic programming algorithms, network flow algorithms, linear and integer programming, large-scale search algorithms and heuristics, efficient data storage and query, and NP-completeness. Although this course may have a few programming assignments, it is primarily not a programming course. Instead, it will focus on the design and analysis of algorithms for general classes of problems. This course is not open to CS graduate students, who should consider taking 15-651 instead. 02-250 is a suggested prerequisite for undergraduates.
02-651 New Technologies and Future Markets
Fall: 12 units
This course focuses on technological trends and how these trends can help shape or disrupt new and existing markets. Students will learn to identify, analyze, and synthesize emerging trends and perform detailed research on how these trends can influence and create markets. By understanding the drivers behind these trends, students will be able to identify key market opportunity inflection points in biotechnology as well as the relationship between business processes and information technology (IT). Students will also learn to assess some information technologies and the potential of applying them to solve problems and create commercially viable solutions. The course is designed for the student interested in finding new venture opportunities on the cutting edge of technology and finding and evaluating the opportunities for further development. For MS Biotechnology Innovation and Computation students only.
Prerequisite: 11-695
02-654 Biotechnology Enterprise Development
Fall: 12 units
In this course, students learn how to develop a biotech start-up and create a Minimum Viable Product (MVP), along with a business model and strategy for the product. Students will learn about business modeling, customer development, customer validation, proposals, product branding, and marketing for their product. The course requires students to spend most of their time validating their start-up concept and prototypes with potential customers, adapting to critical feedback, and revising their value propositions accordingly. Students learn to balance technical product development with customer requirements, business strategy, and budget constraints. This course provides real-world, hands-on learning on what it is like to start a company. Different approaches to business modeling will be covered. Understanding customer discovery and validation concepts will help students effectively modify their original concepts to meet market demands. Student teams will learn how to revise and improve their prototype by the end of the term. This is a fast-paced course in which students are expected to spend most of their time outside of the classroom interacting with potential customers to validate, test, verify, and integrate essential elements for their start-up business proposal. Up to now, students have been learning technologies and methods for solving problems in the life science industry and building a prototype for their start-up. However, a new venture proposal is not a collection of isolated bits. It should be thoroughly validated against customer input and market needs to tell a single story of how the venture will reach its end goals. The final deliverable is the creation and presentation of a well-explicated business proposal in addition to a product prototype corresponding to the business proposal.
Prerequisites: 11-695 and 02-651
02-699 Independent Study in Computational Biology
Fall and Spring
The student will, under the individual guidance of a faculty member, read and digest process papers or a textbook in an advanced area of computational biology not offered by an existing course at Carnegie Mellon. The student will demonstrate their mastery of the material by a combination of one or more of the following: oral discussions with the faculty member; exercises set by the faculty member accompanying the readings; and a written summary synthesizing the material that the student learned. Permission required.
02-700 M.S. Thesis Research
Fall and Spring
This course is for M.S. students who wish to do supervised research for academic credit with a Computational Biology faculty member. Interested students should first contact the Professor with whom they would like to work. If there is mutual interest, the Professor will direct you to the Academic Programs Coordinator, who will enroll you in the course.
02-701 Current Topics in Computational Biology
Fall and Spring: 3 units
The course consists of weekly presentations by students and faculty on current topics in computational biology.
02-702 Computational Biology Seminar
Fall and Spring: 3 units
This course consists of weekly invited presentations on current computational biology research topics by leading scientists. Attendance is mandatory for a passing grade.
02-703 Special Topics in Bioinformatics and Computational Biology
Intermittent: 6 units
A decade ago, mass spectrometry (MS) was merely a qualitative research technique allowing the analysis of samples regarding the presence of specific biomolecules. However, as MS has turned quantitative, more sophisticated experiments can be performed, such as the recording of signal transduction kinetics and the analysis of the composition of protein complexes and organelles. This makes MS-based proteomics a powerful method to study spatiotemporal protein dynamics. The development of relative quantification approaches, which generally use 2H, 13C or 15N isotope labels, has especially led to an increase in quantification accuracy and set off numerous new experimental approaches to study protein regulation. In this mini-course, we will cover mass spectrometry principles, discuss classical as well as current primary literature addressing method development and quantitative analysis, and highlight state-of-the-art biological studies that employ MS. A combination of lectures, student presentations, and written exercises will establish a thorough knowledge of current bio-analytical MS approaches.
02-710 Computational Genomics
Spring: 12 units
Dramatic advances in experimental technology and computational analysis are fundamentally transforming the basic nature and goal of biological research. The emergence of new frontiers in biology, such as evolutionary genomics and systems biology, is demanding new methodologies that can confront quantitative issues of substantial computational and mathematical sophistication. In this course we will discuss classical approaches and the latest methodological advances in the context of the following biological problems: 1) sequence analysis, focusing on gene finding and motif detection, 2) analysis of high-throughput molecular data, such as gene expression data, including normalization, clustering, pattern recognition and classification, 3) molecular and regulatory evolution, focusing on phylogenetic inference and regulatory network evolution, 4) population genetics, focusing on how genomes within a population evolve through recombination, mutation, and selection to create various structures in modern genomes, and 5) systems biology, concerning how to combine diverse data types to make mechanistic inferences about biological processes. From the computational side, this course focuses on modern machine learning methodologies for computational problems in molecular biology and genetics, including probabilistic modeling, inference and learning algorithms, data integration, time series analysis, and active learning.
02-711 Computational Molecular Biology and Genomics
Spring: 12 units
An advanced introduction to computational molecular biology, using an applied algorithms approach. The first part of the course will cover established algorithmic methods, including pairwise sequence alignment and dynamic programming, multiple sequence alignment, fast database search heuristics, hidden Markov models for molecular motifs and phylogeny reconstruction. The second part of the course will explore emerging computational problems driven by the newest genomic research. Course work includes four to six problem sets, one midterm and final exam.
Prerequisites: (03-151 or 03-121) and 15-122
02-712 Computational Methods for Biological Modeling and Simulation
Fall: 12 units
This course covers a variety of computational methods important for modeling and simulation of biological systems. It is intended for graduates and advanced undergraduates with either biological or computational backgrounds who are interested in developing computer models and simulations of biological systems. The course will emphasize practical algorithms and algorithm design methods drawn from various disciplines of computer science and applied mathematics that are useful in biological applications. The general topics covered will be models for optimization problems, simulation and sampling, and parameter tuning. Course work will include problem sets with significant programming components and independent or group final projects.
Prerequisites: (15-110 or 15-112) and (02-613 or 02-201)
02-714 String Algorithms
Fall: 12 units
Provides an in-depth look at modern algorithms used to process string data, particularly those relevant to genomics. The course will cover the design and analysis of efficient algorithms for processing enormous collections of strings. Topics will include string search; inexact matching; string compression; string data structures such as suffix trees, suffix arrays, and searchable compressed indices; and the Burrows-Wheeler transform. Applications of these techniques in biology will be presented, including genome assembly, transcript assembly, whole-genome alignment, gene expression quantification, read mapping, and search of large sequence databases. No knowledge of biology is assumed, and the topics covered will be of use in other fields involving large collections of strings. Programming proficiency is required.
Prerequisite: 15-251
02-717 Algorithms in Nature
Fall: 12 units
Computer systems and biological processes often rely on networks of interacting entities to reach joint decisions, coordinate, and respond to inputs. There are many similarities in the goals and strategies of biological and computational systems, which suggest that each can learn from the other. These include the distributed nature of the networks (in biology, molecules, cells, or organisms often operate without central control), the ability to successfully handle failures and attacks on a subset of the nodes, modularity and the ability to reuse certain components or sub-networks in multiple applications, and the use of stochasticity in biology and randomized algorithms in computer science. In this course we will start by discussing classic biologically motivated algorithms including neural networks (inspired by the brain), genetic algorithms (sequence evolution), non-negative matrix factorization (signal processing in the brain), and search optimization (ant colony formation). We will then continue to discuss more recent bi-directional studies that have relied on biological processes to solve routing and synchronization problems, discover Maximal Independent Sets (MIS), and design robust and fault-tolerant networks. In the second part of the class students will read and present new research in this area. Students will also work in groups on a final project in which they develop and test a new biologically inspired algorithm. See also: www.algorithmsinnature.org. No prior biological knowledge is required.
02-718 Computational Medicine
Spring: 12 units
Modern medical research increasingly relies on the analysis of large patient datasets to enhance our understanding of human diseases. This course will focus on the computational problems that arise from studies of human diseases and the translation of research to the bedside to improve human health. The topics to be covered include computational strategies for advancing personalized medicine, pharmacogenomics for predicting individual drug responses, metagenomics for learning the role of the microbiome in human health, mining electronic medical records to identify disease phenotypes, and case studies in complex human diseases such as cancer and asthma. We will discuss how machine learning methodologies such as regression, classification, clustering, semi-supervised learning, probabilistic modeling, and time-series modeling are being used to analyze a variety of datasets collected by clinicians. Class sessions will consist of lectures, discussions of papers from the literature, and guest presentations by clinicians and other domain experts. Grading will be based on homework assignments and a project. 02-250 is a suggested pre-requisite.
Prerequisites: 10-401 or (10-601 and 10-701)
02-719 Genomics and Epigenetics of the Brain
Fall: 12 units
This course will provide an introduction to genomics, epigenetics, and their application to problems in neuroscience. The rapid advances in genomic technology are in the process of revolutionizing how we conduct molecular biology research. These new techniques have given us an appreciation for the role that epigenetic modifications of the genome play in gene regulation, development, and inheritance. In this course, we will cover the biological basis of genomics and epigenetics, the basic computational tools to analyze genomic data, and the application of those tools to neuroscience. Through programming assignments and reading primary literature, the material will also serve to demonstrate important concepts in neuroscience, including the diversity of neural cell types, neural plasticity, the role that epigenetics plays in behavior, and how the brain is influenced by neurological and psychiatric disorders. Although the course focuses on neuroscience, the material is accessible and applicable to a wide range of topics in biology. 02-250 is a suggested pre-requisite.
Prerequisites: (03-151 or 03-121) and 03-220 and (15-110 or 02-201 or 15-121)
02-721 Algorithms for Computational Structural Biology
Intermittent: 12 units
Some of the most interesting and difficult challenges in computational biology and bioinformatics arise from the determination, manipulation, or exploitation of molecular structures. This course will survey these challenges and present a variety of computational methods for addressing them. Topics will include: molecular dynamics simulations, computer-aided drug design, and computer-aided protein design. The course is appropriate for both students with backgrounds in computer science and those in the life sciences.
02-725 Computational Methods for Proteomics and Metabolomics
Spring: 12 units
Proteomics and metabolomics are the large-scale study of proteins and metabolites, respectively. In contrast to genomes, proteomes and metabolomes vary with time and the specific stress or conditions an organism is under. Applications of proteomics and metabolomics include determination of protein and metabolite functions (including in immunology and neurobiology) and discovery of biomarkers for disease. These applications require advanced computational methods to analyze experimental measurements, create models from them, and integrate with information from diverse sources. This course specifically covers computational mass spectrometry, structural proteomics, proteogenomics, metabolomics, genome mining, and metagenomics.
Prerequisites: 02-604 or 02-250
02-730 Cell and Systems Modeling
Fall: 12 units
This course will introduce students to the theory and practice of modeling biological systems from the molecular to the organism level with an emphasis on intracellular processes. Topics covered include kinetic and equilibrium descriptions of biological processes, systematic approaches to model building and parameter estimation, analysis of biochemical circuits modeled as differential equations, modeling the effects of noise using stochastic methods, modeling spatial effects, and modeling at higher levels of abstraction or scale using logical or agent-based approaches. A range of biological models and applications will be considered including gene regulatory networks, cell signaling, and cell cycle regulation. Weekly lab sessions will provide students hands-on experience with methods and models presented in class. Course requirements include regular class participation, bi-weekly homework assignments, a take-home exam, and a final project. Prerequisites: The course is designed for graduate and upper-level undergraduate students with a wide variety of backgrounds. The course is intended to be self-contained but students may need to do some additional work to gain fluency in core concepts. Students should have a basic knowledge of calculus, differential equations, and chemistry as well as some previous exposure to molecular biology and biochemistry. Experience with programming and numerical computation is useful but not mandatory. Laboratory exercises will use MATLAB as the primary modeling and computational tool augmented by additional software as needed.
Prerequisites: (33-121 or 03-151 or 03-121) and (03-232 or 03-231) and 21-112 and 09-105
02-740 Bioimage Informatics
Spring: 12 units
The goals of this course are to provide students with the following: the ability to use mathematical techniques such as linear algebra, Fourier theory, and sampling in more advanced signal processing settings; fundamentals of multiresolution and wavelet techniques; and in-depth coverage of some bioimaging applications such as compression and denoising. Upon successful completion of this course, the student will be able to: explain the importance and use of signal representations in building more sophisticated signal processing tools, such as wavelets; think in basic time-frequency terms; describe how Fourier theory fits in a bigger picture of signal representations; use basic multirate building blocks, such as a two-channel filter bank; characterize the discrete wavelet transform and its variations; construct a time-frequency decomposition to fit a given signal; explain how these tools are used in various applications; and apply these concepts to solve a practical bioimaging problem through an independent project. Pre-requisite: 18-791, or permission of instructor. (Also known as 18-799.) 02-250 is a suggested pre-requisite.
Prerequisites: 18-791 or 18-799
02-750 Automation of Biological Research: Robotics and Machine Learning
Fall: 12 units
Biology is increasingly becoming a "big data" science, as biomedical research has been revolutionized by automated methods for generating large amounts of data on diverse biological processes. Integration of data from many types of experiments is required to construct detailed, predictive models of cell, tissue or organism behaviors, and the complexity of the systems suggests that these models need to be constructed automatically. This requires iterative cycles of acquisition, analysis, modeling, and experimental design, since it is not feasible to do all possible biological experiments. This course will cover a range of automated biological research methods and a range of computational methods for automating the acquisition and interpretation of the data (especially active learning, proactive learning, compressed sensing and model structure learning). Grading will be based on class participation, homeworks, and a final project. The course is designed for graduate and upper-level undergraduate students with a wide variety of backgrounds. The course is intended to be self-contained but students may need to do some additional work to gain fluency in core concepts. Students should have a basic knowledge of biology, statistics, and programming.
Prerequisites: 10-601 or 10-701
Course Website: https://sites.google.com/site/automationofbiologicalresearch/?pli=1
02-760 Laboratory Methods for Computational Biologists
Spring: 6 units
Computational biologists frequently focus on analyzing and modeling large amounts of biological data, often from high-throughput assays or diverse sources. It is therefore critical that students training in computational biology be familiar with the paradigms and methods of experimentation and measurement that lead to the production of these data. This one-semester laboratory course gives students a deeper appreciation of the principles and challenges of biological experimentation. Students learn a range of topics, including experimental design, structural biology, next generation sequencing, genomics, proteomics, bioimaging, and high-content screening. Class sessions are primarily devoted to designing and performing experiments in the lab using the above techniques. Students are required to keep a detailed laboratory notebook of their experiments and summarize their resulting data in written abstracts and oral presentations given in class-hosted lab meetings. With an emphasis on the basics of experimentation and broad views of multiple cutting-edge and high-throughput techniques, this course is appropriate for students who have never taken a traditional undergraduate biology lab course, as well as those who have and are looking for introductory training in more advanced approaches. Grading: Letter grade based on class participation, laboratory notebooks, experimental design assignments, and written and oral presentations. 02-250 is a suggested pre-requisite.
02-801 Computational Biology Internship
Fall and Summer: 3 units
This course is for students participating in an internship or co-op.
02-900 Ph.D. Thesis Research
All Semesters
This course is for Ph.D. students doing supervised research for academic credit.