Bioinformatics and Statistical Genetics

Credits

6

Types

Elective

Requirements

This subject has no requirements, but it does assume previous capacities.

Department

EIO;CS

Teachers

Person in charge

Gabriel Valiente Feruglio

Weekly hours

Theory

1

Problems

0

Laboratory

2

Guided learning

0

Autonomous learning

105

Objectives

Introduce the student to the algorithmic, computational, and statistical problems that arise in the analysis of biological data.
Related competences: CB6, CB7, CB9, CTR6, CEC1, CEC2, CEC3, CG3
Reinforce the knowledge of discrete structures, algorithmic techniques, and statistical techniques that the student may have from previous courses.
Related competences: CB6, CB7, CB9, CTR6, CEC1, CEC2, CEC3, CG3

Contents

Introduction to bioinformatics
Computational biology and bioinformatics. Algorithms in bioinformatics. Strings, sequences, trees, and graphs. Algorithms on strings and sequences. Representation of trees and graphs. Algorithms on trees and graphs.
Phylogenetic reconstruction I
Character-based phylogenetic reconstruction. Compatibility. Perfect phylogenies. Distance-based phylogenetic reconstruction. Additive trees. Ultrametric trees.
Agreement of phylogenetic trees
Partition distance. Triplets distance. Quartets distance. Transposition distance. Edit distance and alignment of phylogenetic trees.
Phylogenetic reconstruction II
Phylogenetic networks. Galled trees. Tree-child networks. Tree-sibling networks. Time consistency of phylogenetic networks.
Agreement of phylogenetic networks
Path multiplicity distance. Tripartition distance. Nodal distance. Triplets distance. Edit distance and alignment of phylogenetic networks.
Phylogenetic reconstruction III
Mutation trees. Clonal trees. Clonal deconvolution.
Phylogenetic and taxonomic reconstruction
Phylogenies and taxonomies. Classification of metagenomic samples. Agreement of classifications.
Introduction to statistical genetics
Basic genetic terminology. Population-based and family-based studies. Traits, markers and polymorphisms. Single nucleotide polymorphisms and microsatellites. R-package genetics.
Hardy-Weinberg equilibrium
Hardy-Weinberg law. Hardy-Weinberg assumptions. Multiple alleles. Statistical tests for Hardy-Weinberg equilibrium: chi-square, exact and likelihood-ratio tests. Graphical representations. Disequilibrium coefficients: the inbreeding coefficient, Weir's D. R-package HardyWeinberg.
Linkage disequilibrium
Definition of linkage disequilibrium (LD). Measures for LD. Estimation of LD by maximum likelihood. Haplotypes. The HapMap project. Graphics for LD. The LD heatmap.
Phase estimation
Phase ambiguity for double heterozygotes. Phase estimation with the EM algorithm. Estimation of haplotype frequencies. R-package haplo.stats.
Population substructure
Definition of population substructure. Population substructure and Hardy-Weinberg equilibrium. Population substructure and LD. Statistical methods for detecting substructure. Multidimensional scaling. Metric and non-metric multidimensional scaling. Euclidean distance matrices. Stress. Graphical representations.
Genetic association analysis
Disease-marker association studies. Genetic models: dominant, co-dominant and recessive models. Testing models with chi-square tests. The alleles test and the Cochran-Armitage trend test. Genome-wide association tests.
Family relationships and allele sharing
Identity by state (IBS) and identity by descent (IBD). Kinship coefficients. Allele sharing. Detection of family relationships. Graphical representations.
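
As a flavour of the statistical genetics topics listed above, the chi-square test for Hardy-Weinberg equilibrium at a biallelic marker can be sketched in a few lines of Python. This is only an illustrative sketch, not course material: the function name hwe_chi_square and the genotype counts are hypothetical, no continuity correction is applied, and the p-value uses the closed form for a chi-square distribution with one degree of freedom. In the course itself this kind of analysis is associated with the R-package HardyWeinberg.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts (AA, AB, BB) at a biallelic marker and
    returns (frequency of allele A, expected counts, chi-square, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n <= 0:
        raise ValueError("at least one genotype must be observed")

    # Allele frequency of A estimated by gene counting.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Expected genotype counts under the Hardy-Weinberg law.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)

    # Pearson statistic; classes with zero expectation are skipped
    # (this only happens for a monomorphic marker).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, expected, chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts, made up for illustration only.
    freq_a, expected, chi2, p_value = hwe_chi_square(30, 40, 30)
    print(f"p(A) = {freq_a:.3f}, chi2 = {chi2:.3f}, p-value = {p_value:.4f}")
```

With these made-up counts the sample has fewer heterozygotes than the 50 expected under equilibrium, which is what drives the statistic away from zero. For small samples or rare alleles, an exact test is usually preferred over this chi-square approximation.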

Activities

Activity Evaluation act

Development of syllabus topics

Objectives: 1, 2
Contents:

1. Introduction to bioinformatics
2. Phylogenetic reconstruction I
3. Agreement of phylogenetic trees
4. Phylogenetic reconstruction II
5. Agreement of phylogenetic networks
6. Phylogenetic reconstruction III
7. Phylogenetic and taxonomic reconstruction
8. Introduction to statistical genetics
9. Hardy-Weinberg equilibrium
10. Linkage disequilibrium
11. Phase estimation
12. Population substructure
13. Genetic association analysis
14. Family relationships and allele sharing

Theory

15h

Problems

0h

Laboratory

30h

Guided learning

0h

Autonomous learning

72h

Final exam

Theory

3h

Problems

0h

Laboratory

0h

Guided learning

0h

Autonomous learning

30h

Teaching methodology

All classes consist of a theoretical session (a lecture in which the professor introduces new concepts or techniques, with detailed examples illustrating them) followed by a practical session (in which the students work on the examples and exercises proposed in the lecture). On average, two hours a week are dedicated to theory and one hour a week to practice, and the professor allocates them according to the subject matter. Students are required to take an active part in class and to submit the exercises at the end of each class.

Evaluation methodology

Students are evaluated both during class and in a final exam. Every student is required to submit one exercise each week, graded from 0 to 10. The final grade is 50% exercises and 50% final exam, which is also graded from 0 to 10.

Bibliography

Basic:

Algorithms on trees and graphs - Valiente, Gabriel, Springer Nature, 2021. ISBN: 9783030818845
http://cataleg.upc.edu/record=99100491644060671~S1*cat
Combinatorial pattern matching algorithms in computational biology using Perl and R - Valiente, Gabriel, Chapman and Hall/CRC, 2009. ISBN: 9781420069730
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991003632209706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Applied statistical genetics with R: for population-based association studies - Foulkes, Andrea S., Springer, 2009. ISBN: 9780387895536
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991003963689706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
The Fundamentals of Modern Statistical Genetics - Laird, Nan M.; Lange, Christoph, Springer, 2011. ISBN: 9781441973375
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991003963669706711&context=L&vid=34CSUC_UPC:VU1&lang=ca

Complementary:

Algorithms on strings, trees, and sequences: computer science and computational biology - Gusfield, Dan, Cambridge University Press, 1997. ISBN: 0521585198
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991001989459706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Analysis of phylogenetics and evolution with R - Paradis, Emmanuel, Springer, 2012. ISBN: 9781461417439
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991001344299706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Genetic data analysis II: methods for discrete population genetic data - Weir, B. S., Sinauer Associates, 1996. ISBN: 0878939024
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004009379706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Statistical Approach to Genetic Epidemiology - Ziegler, Andreas; König, Inke R., Wiley-VCH, 2011. ISBN: 9783527633654
http://cataleg.upc.edu/record=99100491924750671~S1*cat

Web links

Rosalind http://rosalind.info/
The R Project for Statistical Computing http://www.r-project.org/

Previous capacities

Basic knowledge of algorithms and data structures.
Basic knowledge of statistics.
Basic knowledge of the Python programming language.
Basic knowledge of the R programming language.
