Algorithms in Biology


Credits

Types

Compulsory

Requirements

This subject has no formal requirements, but prior knowledge is expected (see Previous capacities).

Department

UPF;UAB

This course presents the fundamentals of biological sequence analysis, from the basic algorithms to their main applications.

The subject consists of three main blocks:
- Dynamic programming and Sequence alignment: Dynamic programming. Pairwise alignment (Needleman-Wunsch and Smith-Waterman algorithms). BLAST. Multiple sequence alignment. Other applications.
- Genomic data analysis: Sequencing Technologies. Computational genomics. Main file formats for sequence data. Approximate string matching aligners for sequencing reads. Genome assembly algorithms and strategies.
- Clustering Methods and Algorithms in Genomics: Hidden-Markov Models (HMM). Principal Component Analysis (PCA), Parsimony. Maximum Likelihood Methods. Genetic Algorithms.

The programming language used in this course is Python, with special emphasis on solving applied genomics and clustering problems. Following a problem-based learning approach, students will write their own scripts and/or use existing bioinformatics tools to tackle different challenges. We encourage the use of Python libraries (for statistics and plots) and classes.
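As an illustration of the kind of script written in the first block, here is a minimal sketch of the Needleman-Wunsch recurrence for the global alignment score. The function name and the scoring values (match, mismatch, linear gap penalty) are assumptions chosen for the example, not the values used in the course.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions; a linear gap penalty is used.
    Only two rows of the dynamic programming matrix are kept in memory.
    """
    # Row 0: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
```

Keeping only two rows is enough to obtain the score; recovering the alignment itself requires the full matrix (or a traceback structure), which is left out of this sketch.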

Teachers

Person in charge

Arnau Cordomí Montoya

Others

Donate Weghorn
Emanuele Raineri
Oscar Lao Grueso

Weekly hours

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Learning Outcomes

Knowledge

K1 - Recognize the basic principles of biology, from cellular to organism scale, and how these are related to current knowledge in the fields of bioinformatics, data analysis, and machine learning; thus achieving an interdisciplinary vision with special emphasis on biomedical applications.
K2 - Identify mathematical models and statistical and computational methods that allow for solving problems in the fields of molecular biology, genomics, medical research, and population genetics.
K4 - Integrate the concepts offered by the most widely used programming languages in the field of Life Sciences to model and optimize data structures and build efficient algorithms, relating them to each other and to their application cases.
K7 - Analyze the sources of scientific information, valid and reliable, to justify the state of the art of a bioinformatics problem and to be able to address its resolution.

Skills

S1 - Integrate omics and clinical data to gain a greater understanding and a better analysis of biological phenomena.
S2 - Computationally analyze DNA, RNA and protein sequences, including comparative genome analyses, using computation, mathematics and statistics as basic tools of bioinformatics.
S3 - Solve problems in the fields of molecular biology, genomics, medical research and population genetics by applying statistical and computational methods and mathematical models.
S4 - Develop specific tools that enable solving problems on the interpretation of biological and biomedical data, including complex visualizations.
S5 - Disseminate information, ideas, problems and solutions from bioinformatics and computational biology to a general audience.
S7 - Implement programming methods and data analysis based on the development of working hypotheses within the area of study.
S8 - Make decisions, and defend them with arguments, in the resolution of problems in the areas of biology, as well as, within the appropriate fields, health sciences, computer sciences and experimental sciences.

Competences

C2 - Identify the complexity of the economic and social phenomena typical of the welfare society and relate welfare to globalization, sustainability and climate change in order to use technique, technology, economy and sustainability in a balanced and compatible way.
C3 - Communicate orally and in writing with others in the English language about learning, thinking and decision making outcomes.
C4 - Work as a member of an interdisciplinary team, either as an additional member or performing managerial tasks, in order to contribute to the development of projects (including business or research) with pragmatism and a sense of responsibility and ethical principles, assuming commitments taking into account the available resources.

Objectives

Present their work in front of their colleagues
Related competences: C3
Collaborate with other students to carry out a project assignment
Related competences: C4
Develop mathematical models for working with biological sequences during the practical assignments, using the Python programming language. Different tools will be provided for visualizing the results.
Related competences: K2, K4, K7, S1, S2, S3, S4, S5, S7, S8
Develop efficient programming skills that minimize computation time and its climate-change footprint
Related competences: C2
Understand how sequence alignment and phylogenetics can be applied to medicine
Related competences: K1

Theoretical Contents
T1. Introduction to sequence alignment
T2. Scoring functions
T3. Global and Local Pairwise Sequence Alignment (Dynamic Programming)
T4. Basic Local Alignment Search Tool (BLAST)
T5. Advanced dynamic programming
T6. Multiple Sequence Alignment
T7. Sequencing Technologies and Computational Genomics Foundations
T8. Short Read Alignment and Compressed Indexing
T9. Genome Assembly Algorithms
T10. Introduction to Phylogenetic Trees and Algorithms
T11. Distance-Based Methods
T12. Character-Based Methods

Activities

Activity Evaluation act

Introduction to sequence alignment

Objectives: 3
Contents:

1. Theoretical Contents

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

BLAST

Problems: 2 Groups of Students

Contents:

1. Theoretical Contents

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Approximate string matching aligners for short reads. Fundamentals of the Burrows-Wheeler Transform. Introduction to long-read alignment.

Objectives: 3
Contents:

1. Theoretical Contents

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

De novo genome assembly. Short-read assembly: de Bruijn graph and overlap-layout-consensus. Long-read and hybrid assembly. Scaffolding.

Objectives: 3
Contents:

1. Theoretical Contents

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Sequencing Technologies and Computational Genomics Foundations

Objectives: 3 4
Contents:

1. Theoretical Contents

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Basics of Phylogenetics. Basic Algorithms in Phylogenetics.

Objectives: 5

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Phylogenetics: distance-based methods.

Objectives: 3 5
Contents:

1. Theoretical Contents

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Character-based methods. Parsimony, maximum likelihood and Bayesian phylogenetics.

Objectives: 3 4 5
Contents:

1. Theoretical Contents

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Group project on algorithms and bioinformatics applications.

Objectives: 1 2 4
Contents:

1. Theoretical Contents

Theory

2.4h

Problems

2.4h

Laboratory

Guided learning

Autonomous learning

18h

Scoring functions

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Global and Local Pairwise Sequence Alignment

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Advanced dynamic programming

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Multiple Sequence Alignment

Theory

2.3h

Problems

2.3h

Laboratory

Guided learning

Autonomous learning

Teaching methodology

Problem-based learning approach:

- Theoretical lectures.
- Practical programming exercises directly related to theory.
- Group project on algorithms and bioinformatics applications.

Evaluation methodology

- Continuous Assessment (CA) 20%: Quizzes and submission of exercises.

- Group Project (GP) 20%: Assessed using a rubric that will be published on the course Moodle page.

- Exams 60%: Mid-term Exam (ME) 30%, Final Exam (FE) 30%. Evaluation rubrics for the exams will be published on the course Moodle page.

- Retake: Consists of two exams (E1 and E2), corresponding to each subject block. The final grade after the retake will be calculated as: 20% CA + 20% GP + 30% max(ME, E1) + 30% max(FE, E2).
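The weighting above can be checked with a short sketch. The function name, the argument names and the sample marks are hypothetical; the code assumes all marks are on the same scale and that a retake exam that was not taken is passed as None.

```python
def final_grade(ca, gp, me, fe, e1=None, e2=None):
    """Combine the marks with the published weights.

    ca, gp, me, fe: continuous assessment, group project, mid-term, final exam.
    e1, e2: optional retake exams; each replaces its exam only if it is higher.
    """
    mid_term = me if e1 is None else max(me, e1)
    final_exam = fe if e2 is None else max(fe, e2)
    return 0.20 * ca + 0.20 * gp + 0.30 * mid_term + 0.30 * final_exam


if __name__ == "__main__":
    # Hypothetical marks: without a retake, then with a retake of the mid-term.
    print(round(final_grade(8, 7, 4, 6), 2))
    print(round(final_grade(8, 7, 4, 6, e1=6), 2))
```

Because each retake mark enters through max, sitting a retake exam cannot lower the final grade under this formula.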

Bibliography

Basic:

Biological sequence analysis : probabilistic models of proteins and nucleic acids - Durbin, Richard... [et al.], Cambridge University Press, 1998. ISBN: 0521629713
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=320915
Bioinformatics algorithms: an active learning approach - Compeau, Phillip P; Pevzner, Pavel, Active Learning Publishers, 2015. ISBN: 9780990374619
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004091329706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Problems and Solutions in Biological Sequence Analysis - Borodovsky, Mark; Ekisheva, Svetlana, Cambridge University Press, 2006. ISBN: 9780521612302
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004123449706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
The Phylogenetic Handbook: A Practical Approach to Phylogenetic Analysis and Hypothesis Testing - Lemey, P; Salemi, M; Vandamme, A, Cambridge University Press, 2009. ISBN: 9786612539510
https://www-cambridge-org.recursos.biblioteca.upc.edu/core/books/phylogenetic-handbook/A9D63A454E76A5EBCCF1119B3C56D766

Previous capacities

Applied Programming I, II and III
