Bi/BE/CS 183, Winter 2024: Difference between revisions
Line 14: | Line 14: | ||
=== Catalog Description === | === Catalog Description === | ||
'''Bi/BE/CS 183. Introduction to Computational Biology and Bioinformatics.''' 9 units (3-0-6): second term. Prerequisites: Bi 8, CS 2, Ma 3; or BE/Bi 103 a; or instructor's permission. | '''Bi/BE/CS 183. Introduction to Computational Biology and Bioinformatics.''' 9 units (3-0-6): second term. Prerequisites: Bi 8, CS 2, Ma 3; or BE/Bi 103 a; or instructor's permission. Biology is becoming an increasingly data-intensive science. Many of the data challenges in the biological sciences are distinct from other scientific disciplines because of the complexity involved. This course will introduce key computational, probabilistic, and statistical methods that are common in computational biology and bioinformatics. We will integrate these theoretical aspects to discuss solutions to common challenges that reoccur throughout bioinformatics including algorithms and heuristics for tackling DNA sequence alignments, phylogenetic reconstructions, evolutionary analysis, and population and human genetics. We will discuss these topics in conjunction with common applications including the analysis of high throughput DNA sequencing data sets and analysis of gene expression from RNA-Seq data sets. | ||
Biology is becoming an increasingly data-intensive science. Many of the data challenges in the biological sciences are distinct from other scientific disciplines because of the complexity involved. This course will introduce key computational, probabilistic, and statistical methods that are common in computational biology and bioinformatics. We will integrate these theoretical aspects to discuss solutions to common challenges that reoccur throughout bioinformatics including algorithms and heuristics for tackling DNA sequence alignments, phylogenetic reconstructions, evolutionary analysis, and population and human genetics. We will discuss these topics in conjunction with common applications including the analysis of high throughput DNA sequencing data sets and analysis of gene expression from RNA-Seq data sets. | |||
=== Lecture Schedule === | === Lecture Schedule === | ||
Line 39: | Line 37: | ||
** [[http:docs.google.com/presentation/d/1-Bo7yaaaxbf8ul_2gacZIII_pggd6JmSVXEF4Zwqekg|Lecture 1]]: Introduction to computational biology of single-cell RNA-seq | ** [[http:docs.google.com/presentation/d/1-Bo7yaaaxbf8ul_2gacZIII_pggd6JmSVXEF4Zwqekg|Lecture 1]]: Introduction to computational biology of single-cell RNA-seq | ||
** [[http:docs.google.com/presentation/d/1tpeNHSONBunT7TwZSlwm42VknPVNeufdY7A4rzqMhkI|Lecture 2]]: Single-cell RNA-seq technology | ** [[http:docs.google.com/presentation/d/1tpeNHSONBunT7TwZSlwm42VknPVNeufdY7A4rzqMhkI|Lecture 2]]: Single-cell RNA-seq technology | ||
<!-- | |||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 54: | Line 54: | ||
* Exploratory data analysis | * Exploratory data analysis | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 70: | Line 72: | ||
* Clustering and data visualizations (PCA, t-SNE, UMAP) | * Clustering and data visualizations (PCA, t-SNE, UMAP) | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 86: | Line 90: | ||
* Clustering via EM | * Clustering via EM | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 104: | Line 110: | ||
* Modeling counts, zero-inflated negative binomial distribution | * Modeling counts, zero-inflated negative binomial distribution | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 120: | Line 128: | ||
* Normalization, log1p transformations | * Normalization, log1p transformations | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 136: | Line 146: | ||
* Hypothesis testing | * Hypothesis testing | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 152: | Line 164: | ||
* Dynamic programming | * Dynamic programming | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 170: | Line 184: | ||
* Bursty gene expression | * Bursty gene expression | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> | ||
Line 185: | Line 201: | ||
* Overview of AlphaFold, ESM2 | * Overview of AlphaFold, ESM2 | ||
| | | | ||
<!-- | |||
* Wi 2024 lecture slides: Mon, Wed, Fri | * Wi 2024 lecture slides: Mon, Wed, Fri | ||
* Wi 2023 lecture slides: | * Wi 2023 lecture slides: | ||
* Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | * Jupyter notebooks: {{Bi/BE/CS 183 public|example.ipynb}} | ||
* Optional reading: | * Optional reading: | ||
--> | |||
| {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | | {{Bi/BE/CS 183 public|hw1-wi2024.pdf|HW #1}} | ||
Out: 3 Jan <br> | Out: 3 Jan <br> |
Revision as of 00:21, 4 December 2023
Introduction to Computational Biology and Bioinformatics | |
Instructors
|
Teaching Assistants
|
This is the course homepage for Bi/BE/CS 183, Winter 2024. This course closely follows the Winter 2023 course.
Catalog Description
Bi/BE/CS 183. Introduction to Computational Biology and Bioinformatics. 9 units (3-0-6): second term. Prerequisites: Bi 8, CS 2, Ma 3; or BE/Bi 103 a; or instructor's permission. Biology is becoming an increasingly data-intensive science. Many of the data challenges in the biological sciences are distinct from those in other scientific disciplines because of the complexity involved. This course will introduce key computational, probabilistic, and statistical methods that are common in computational biology and bioinformatics. We will integrate these theoretical aspects to discuss solutions to common challenges that recur throughout bioinformatics, including algorithms and heuristics for tackling DNA sequence alignments, phylogenetic reconstructions, evolutionary analysis, and population and human genetics. We will discuss these topics in conjunction with common applications, including the analysis of high-throughput DNA sequencing data sets and the analysis of gene expression from RNA-Seq data sets.
Lecture Schedule
Date | Topic | Reading | Homework |
Week 1, 3 Jan | Course Introduction | | HW #1, Out: 3 Jan |
Week 2, 8 Jan | Correlation and regression | | HW #1, Out: 3 Jan |
Week 3 | Dimensionality reduction | | HW #1, Out: 3 Jan |
Week 4, 22 Jan | Expectation maximization (EM) | | HW #1, Out: 3 Jan |
Week 5, 29 Jan | Read alignment | | HW #1, Out: 3 Jan |
Week 6, 5 Feb | Variance stabilization | | HW #1, Out: 3 Jan |
Week 7, 12 Feb | Differential analysis | | HW #1, Out: 3 Jan |
Week 8 | Hidden Markov models | | HW #1, Out: 3 Jan |
Week 9, 26 Feb | Dynamic modeling (?) | | HW #1, Out: 3 Jan |
Week 10, 4 Mar | Machine learning | | HW #1, Out: 3 Jan |
Grading
The final grade will be based on homework sets and a final exam:
- Homework (70%): Homework sets will be handed out weekly and are due on Wednesdays by 11 am via Gradescope. Each student is allowed up to two extensions of no more than 2 days each over the course of the term. Homework turned in after Friday at 11 am, or after the two extensions are exhausted, will not be accepted without a note from the health center or the Dean. Python code is considered part of your solution and should be printed and turned in with the problem set (whether the problem asks for it or not).
- The lowest homework set grade will be dropped when computing your final grade.
- Final exam (30%): The final exam will be handed out on the last day of class (8 Mar) and is due at the end of finals week. It will be an open-book exam, and computers will be allowed.
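As a rough illustration of how the weights above combine, the following Python sketch computes a final grade. It is not an official grading script: the function name final_grade is hypothetical, and it assumes that every score is a fraction between 0 and 1 and that all homework sets carry equal weight, neither of which is stated on this page.

```python
def final_grade(homework_scores, final_exam_score):
    """Return the course grade as a fraction in [0, 1].

    Assumptions (not stated on the course page): every score is a
    fraction in [0, 1], and all homework sets carry equal weight.
    """
    if len(homework_scores) < 2:
        raise ValueError("need at least two homework scores to drop one")
    # Drop the single lowest homework score, then average the rest.
    kept = sorted(homework_scores)[1:]
    homework_average = sum(kept) / len(kept)
    # Homework counts for 70% of the grade and the final exam for 30%.
    return 0.7 * homework_average + 0.3 * final_exam_score


if __name__ == "__main__":
    # Three homework sets; the 0.6 is dropped before averaging.
    print(final_grade([0.8, 0.6, 1.0], 0.5))
```

Under these assumptions, a weak homework set affects the grade only if it is not the single lowest one.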
Collaboration Policy
Collaboration on homework assignments is encouraged. You may consult outside reference materials, other students, the TA, or the instructor, but you cannot consult homework solutions from prior years and you must cite any use of material from outside references. All solutions that are handed in should be written up individually and should reflect your own understanding of the subject matter at the time of writing. Any computer code that is used to solve homework problems is considered part of your writeup and should be done individually (you can share ideas, but not code).
No collaboration is allowed on the final exam.
Course Text and References
There is no course textbook, but the slides from the prior year's course serve as a reference for much of the material in the course:
- [Pac23] L. Pachter, Caltech Bi/BE/CS 183: Introduction to Computational Biology and Bioinformatics, Winter 2023.
The following additional references may also be useful:
- TBD
- TBD
Note: the only sources listed here are those that allow free access to online versions. Additional textbooks that are not freely available can be obtained from the library.