Bi/BE/CS 183, Winter 2024: Difference between revisions
Line 28: | Line 28: | ||
| '''Week 1'''<br> | | '''Week 1'''<br> | ||
3 Jan <br> 5 Jan <br> | 3 Jan <br> 5 Jan <br> | ||
| Course Introduction | | '''Course Introduction''' | ||
* Overview of computational biology | * Overview of computational biology | ||
* Logistics for the course | * Logistics for the course | ||
Line 49: | Line 49: | ||
| '''Week 2'''<br> | | '''Week 2'''<br> | ||
8 Jan <br> 10 Jan <br> 12 Jan | 8 Jan <br> 10 Jan <br> 12 Jan | ||
| Correlation and regression | | '''Correlation and regression''' | ||
* Linear and logistic regression, least squares | * Linear and logistic regression, least squares | ||
* Random variables, covariance, correlation | * Random variables, covariance, correlation | ||
Line 68: | Line 68: | ||
| '''Week 3'''<br> | | '''Week 3'''<br> | ||
<s>15 Jan</s> <br> 17 Jan <br> 19 Jan | <s>15 Jan</s> <br> 17 Jan <br> 19 Jan | ||
| Dimensionality reduction | | '''Dimensionality reduction''' | ||
* Singular value decomposition (SVD), principal components analysis (PCA) | * Singular value decomposition (SVD), principal components analysis (PCA) | ||
* Clustering and data visualization (PCA, t-SNE, UMAP) | * Clustering and data visualization (PCA, t-SNE, UMAP) | ||
Line 86: | Line 86: | ||
| '''Week 4'''<br> | | '''Week 4'''<br> | ||
22 Jan <br> 24 Jan <br> 26 Jan* | 22 Jan <br> 24 Jan <br> 26 Jan* | ||
| Expectation maximization (EM) | | '''Expectation maximization (EM)''' | ||
* Maximum likelihood estimation (MLE) | * Maximum likelihood estimation (MLE) | ||
* Clustering via EM | * Clustering via EM | ||
Line 106: | Line 106: | ||
| '''Week 5'''<br> | | '''Week 5'''<br> | ||
29 Jan <br> 31 Jan <br> 2 Feb | 29 Jan <br> 31 Jan <br> 2 Feb | ||
| Read alignment and modeling counts | | '''Read alignment and modeling counts''' | ||
* Read alignment via EM | * Read alignment via EM | ||
* String algorithms, suffix trees | * String algorithms, suffix trees | ||
Line 125: | Line 125: | ||
| '''Week 6'''<br> | | '''Week 6'''<br> | ||
5 Feb <br> 7 Feb <br> 9 Feb | 5 Feb <br> 7 Feb <br> 9 Feb | ||
| Variance stabilization | | '''Variance stabilization''' | ||
* Normalization, log1p transformations | * Normalization, log1p transformations | ||
| | | | ||
Line 142: | Line 142: | ||
| '''Week 7'''<br> | | '''Week 7'''<br> | ||
12 Feb <br> 14 Feb <br> 16 Feb | 12 Feb <br> 14 Feb <br> 16 Feb | ||
| Differential analysis | | '''Differential analysis''' | ||
* Hypothesis testing | * Hypothesis testing | ||
| | | | ||
Line 159: | Line 159: | ||
| '''Week 8'''<br> | | '''Week 8'''<br> | ||
<s>19 Feb</s> <br> 21 Feb <br> 23 Feb* | <s>19 Feb</s> <br> 21 Feb <br> 23 Feb* | ||
| Hidden Markov models | | '''Hidden Markov models''' | ||
* Global and local alignment (Needleman-Wunsch, Smith-Waterman) | * Global and local alignment (Needleman-Wunsch, Smith-Waterman) | ||
* Dynamic programming | * Dynamic programming | ||
Line 177: | Line 177: | ||
| '''Week 9'''<br> | | '''Week 9'''<br> | ||
26 Feb <br> 28 Feb <br> 1 Mar | 26 Feb <br> 28 Feb <br> 1 Mar | ||
| | | '''Markov processes''' | ||
* Continuous-time Markov chains | * Continuous-time Markov chains | ||
* Stochastic simulation algorithm | * Stochastic simulation algorithm | ||
Line 196: | Line 196: | ||
| '''Week 10'''<br> | | '''Week 10'''<br> | ||
4 Mar <br> 6 Mar <br> 8 Mar | 4 Mar <br> 6 Mar <br> 8 Mar | ||
| Machine learning | | '''Machine learning''' | ||
* Overview of AlphaFold, ESM2 | * Overview of AlphaFold, ESM2 | ||
| | | |
Revision as of 15:56, 4 December 2023
Introduction to Computational Biology and Bioinformatics | |
Instructors
|
Teaching Assistants
|
This is the course homepage for Bi/BE/CS 183, Winter 2024. This course closely follows the Winter 2023 course.
Catalog Description
Bi/BE/CS 183. Introduction to Computational Biology and Bioinformatics. 9 units (3-0-6): second term. Prerequisites: Bi 8, CS 2, Ma 3; or BE/Bi 103 a; or instructor's permission. Biology is becoming an increasingly data-intensive science. Many of the data challenges in the biological sciences are distinct from those in other scientific disciplines because of the complexity involved. This course will introduce key computational, probabilistic, and statistical methods that are common in computational biology and bioinformatics. We will integrate these theoretical aspects to discuss solutions to common challenges that recur throughout bioinformatics, including algorithms and heuristics for tackling DNA sequence alignments, phylogenetic reconstructions, evolutionary analysis, and population and human genetics. We will discuss these topics in conjunction with common applications, including the analysis of high-throughput DNA sequencing data sets and analysis of gene expression from RNA-Seq data sets.
Lecture Schedule
Date | Topic | Reading | Homework |
Week 1 3 Jan |
Course Introduction
|
HW #1
Out: 3 Jan | |
Week 2 8 Jan |
Correlation and regression
|
HW #2
Out: 10 Jan | |
Week 3
|
Dimensionality reduction
|
HW #3
Out: 17 Jan | |
Week 4 22 Jan |
Expectation maximization (EM)
|
HW #4
Out: 24 Jan
| |
Week 5 29 Jan |
Read alignment and modeling counts
|
HW #5
Out: 31 Jan | |
Week 6 5 Feb |
Variance stabilization
|
HW #6
Out: 7 Feb | |
Week 7 12 Feb |
Differential analysis
|
HW #7
Out: 14 Feb | |
Week 8
|
Hidden Markov models
|
HW #8
Out: 21 Feb | |
Week 9 26 Feb |
Markov processes
|
HW #9
Out: 28 Feb | |
Week 10 4 Mar |
Machine learning
|
Final
Out: 8 Mar |
Grading
The final grade will be based on homework sets and a final exam:
- Homework (70%): Homework sets will be handed out weekly and are due on Wednesdays by 11 am via Gradescope. Each student is allowed up to two extensions of no more than 2 days each over the course of the term. Homework turned in after Friday at 11 am, or after both extensions have been used, will not be accepted without a note from the health center or the Dean. Python code is considered part of your solution and should be printed and turned in with the problem set (whether the problem asks for it or not).
- The lowest homework set grade will be dropped when computing your final grade.
- Final exam (30%): The final exam will be handed out on the last day of class (8 Mar) and is due at the end of finals week. It will be an open-book exam and computers will be allowed.
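As a rough illustration of how the weighting above combines, the following Python sketch computes a final grade from a list of homework scores and a final exam score. It rests on assumptions the syllabus does not state: every score is expressed as a fraction between 0 and 1, and all homework sets carry equal weight. The function name final_grade and the example scores are hypothetical.

```python
def final_grade(homework_scores, final_exam_score):
    """Combine scores under the 70% homework / 30% final exam weighting.

    Assumptions (not stated in the syllabus): every score is a fraction
    in [0, 1], and all homework sets are weighted equally.
    """
    if len(homework_scores) < 2:
        raise ValueError("need at least two homework scores to drop one")
    # Drop the single lowest homework score, then average the rest.
    kept = sorted(homework_scores)[1:]
    homework_average = sum(kept) / len(kept)
    return 0.7 * homework_average + 0.3 * final_exam_score


# Hypothetical scores for nine homework sets and the final exam.
scores = [0.9, 0.8, 1.0, 0.5, 0.95, 0.85, 0.9, 1.0, 0.75]
print(round(final_grade(scores, 0.8), 4))
```

In this sketch the lowest score (0.5) is dropped before averaging, so one weak homework set does not affect the result.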
Collaboration Policy
Collaboration on homework assignments is encouraged. You may consult outside reference materials, other students, the TA, or the instructor, but you cannot consult homework solutions from prior years and you must cite any use of material from outside references. All solutions that are handed in should be written up individually and should reflect your own understanding of the subject matter at the time of writing. Any computer code that is used to solve homework problems is considered part of your writeup and should be done individually (you can share ideas, but not code).
No collaboration is allowed on the final exam.
Course Text and References
There is no course textbook, but the slides from the prior year's course serve as a reference for much of the material in the course:
- [Pac23] L. Pachter, Caltech Bi/BE/CS 183: Introduction to Computational Biology and Bioinformatics, Winter 2023.
The following additional references may also be useful:
- TBD
- TBD
Note: the only sources listed here are those that allow free access to online versions. Additional textbooks that are not freely available can be obtained from the library.