Machine Learning Algorithms for Bioinformatics
Table of Contents
I. Introduction
Biological data in digital symbol sequences
Genomes--diversity, size, & structure
Proteins & proteomes
On the information content of biological sequences
Prediction of molecular function & structure
II. Machine Learning Foundations: The Probabilistic Framework
Introduction: Bayesian modeling
The Cox-Jaynes axioms
Bayesian inference & induction
Model structures: graphical models & other tricks
Summary
III. Probabilistic Modeling and Inference: Examples
The simplest sequence models
Statistical mechanics
IV. Machine Learning Algorithms
Introduction
Dynamic programming
Gradient descent
EM/GEM algorithms
Markov-chain Monte-Carlo methods
Simulated annealing
Evolutionary & genetic algorithms
Learning algorithms: miscellaneous aspects
V. Neural Networks: The Theory
Introduction
Universal approximation properties
Priors & likelihoods
Learning algorithms: backpropagation
VI. Neural Networks: Applications
Sequence encoding & output interpretation
Sequence correlations & neural networks
Prediction of protein secondary structure
Prediction of signal peptides & their cleavage sites
Applications for DNA & RNA nucleotide sequences
Prediction performance evaluation
Different performance measures
VII. Hidden Markov Models: The Theory
Introduction
Prior information & initialization
Likelihood & basic algorithms
Learning algorithms
Applications of HMMs: general aspects
VIII. Hidden Markov Models: Applications
Protein applications
DNA & RNA applications
Advantages & limitations of HMMs
IX. Hybrid Systems: Hidden Markov Models and Neural Networks
The zoo of graphical models in bioinformatics
Markov models & DNA symmetries
Markov models & gene finders
Hybrid models & neural network parameterization of graphical models
The single-model case
Bidirectional recurrent neural networks for protein secondary structure prediction
X. Probabilistic Models of Evolution: Phylogenetic Trees
Introduction to probabilistic models of evolution
Substitution probabilities & evolutionary rates
Rates of evolution
Data likelihood
Optimal trees & learning
Parsimony
Extensions
XI. Stochastic Grammars and Linguistics
Introduction to formal grammars
Formal grammars & the Chomsky hierarchy
Applications of grammars to biological sequences
Prior information & initialization
Likelihood
Learning algorithms
Applications of SCFGs
Experiments
Future directions
XII. Microarrays & Gene Expression
Introduction to microarray data
Probabilistic modeling of array data
Clustering
Gene Regulation
XIII. Internet Resources & Public Databases
A rapidly changing set of resources
Databases over databases and tools
Databases over databases in molecular biology
Sequence & structure databases
Sequence similarity searches
Alignment
Selected prediction servers
Molecular biology software links
Ph.D. courses over the Internet
Bioinformatics societies
HMM/NN simulator
Textbook's Appendix
A. Statistics
Decision theory & loss functions
Quadratic loss functions
The bias/variance trade-off
Combining estimators
Error bars
Sufficient statistics
Exponential family
Additional useful distributions
Variational methods
B. Information Theory, Entropy, & Relative Entropy
Entropy
Relative Entropy
Mutual Information
Jensen's Inequality
Maximum Entropy
Minimum Relative Entropy
C. Probabilistic Graphical Models
Notation & preliminaries
The undirected case: Markov random fields
The directed case: Bayesian networks
D. HMM Technicalities, Scaling, Periodic Architectures, State Functions, and Dirichlet Mixtures
Scaling
Periodic architectures
State functions: bendability
Dirichlet mixtures
E. Gaussian Processes, Kernel Methods, and Support Vector Machines
Gaussian process models
Kernel methods & support vector machines
Theorems for Gaussian processes & SVMs