This course exposes students to advanced statistical and computational techniques in bioinformatics. It prepares them to understand bioinformatics principles and their applications.
- Genomic databases and analysis of high-throughput data sets. Analysis of DNA sequences, Sequence annotation, ESTs, SNPs. BLAST and related sequence comparison methods. EM algorithm and other statistical methods to discover common motifs in biosequences. Multiple alignment and database search using motif models, ClustalW and others. Concepts in phylogeny. Gene prediction based on codons, Decision trees, Classificatory analysis, Neural Networks, Genetic algorithms, Pattern recognition, Hidden Markov models.
- Computational analysis of protein sequence, structure and function. Modeling protein families. Expression profiling by microarray/gene chip, proteomics, etc. Multiple alignment of protein sequences, Modeling and prediction of protein structure, Designer proteins, Drug design.
- Markov chains (MC with no absorbing states; Higher-order Markov dependence; Patterns in sequences; Markov chain Monte Carlo: Hastings-Metropolis algorithm, Simulated Annealing; MC with absorbing states), Bayesian techniques and use of Gibbs Sampling, Advanced topics in design and analysis of DNA microarray experiments.
- Computationally intensive methods (Classical estimation methods, Bootstrap estimation and Confidence Intervals, Hypothesis testing, Multiple Hypothesis testing), Evolutionary models (Models of Nucleotide substitution), Phylogenetic tree estimation (Distances; Tree reconstruction: Ultrametric and Neighbor-Joining cases; Surrogate distances; Tree reconstruction: Parsimony and Maximum Likelihood; Modeling, Estimation and Hypothesis Testing), Neural Networks (Universal Approximation Properties, Priors and Likelihoods, Learning Algorithms: Backpropagation, Sequence encoding and output interpretation, Prediction of Protein Secondary Structure, Prediction of Signal Peptides and their cleavage sites, Application to DNA and RNA Nucleotide Sequences), Analysis of SNPs and Haplotypes.
Genomic databases and analysis of high-throughput data sets, BLAST and related sequence comparison methods, Statistical methods to discover common motifs in biosequences, Multiple alignment and database search using motif models, ClustalW, Classificatory analysis, Neural Networks, Genetic algorithms, Pattern recognition, Hidden Markov models, Computational analysis of protein sequence, Expression profiling by microarray/gene chip, Proteomics, Modeling and prediction of protein structure, Bayesian techniques and use of Gibbs Sampling, Analysis of DNA microarray experiments, Analysis of one DNA sequence, Analysis of multiple DNA or protein sequences, Computationally intensive methods, Multiple Hypothesis testing, Phylogenetic tree estimation, Analysis of SNPs and Haplotypes.