Bioinformatics and Statistical Genetics

Credits

6

Types

MIRI: Specialization complementary (Data Science)
MDS: Elective

Requirements

This subject has no formal requirements, but it assumes the previous capacities listed below.

Department

CS;EIO

Statistical Genetics and Epidemiology

Teachers

Person in charge

Marta Janira Castellano Palomino

Others

Cristian Tebe Cordomi

Weekly hours

Theory

1

Problems

0

Laboratory

2

Guided learning

0

Autonomous learning

7

Competences

Transversal Competences

Information literacy

CT4 - Capacity for managing the acquisition, the structuring, analysis and visualization of data and information in the field of specialisation, and for critically assessing the results of this management.

Third language

CT5 - Achieving a level of spoken and written proficiency in a foreign language, preferably English, that meets the needs of the profession and the labour market.

Basic

CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.
CB7 - Ability to integrate knowledge and handle the complexity of making judgments based on information which, being incomplete or limited, includes considerations on social and ethical responsibilities linked to the application of their knowledge and judgments.
CB10 - Possess and understand knowledge that provides a basis or opportunity to be original in the development and/or application of ideas, often in a research context.

Generic Technical Competences

Generic

CG4 - Design and implement data science projects in specific domains and in an innovative way

Technical Competences

Specific

CE1 - Develop efficient algorithms based on the knowledge and understanding of the computational complexity theory and considering the main data structures within the scope of data science
CE2 - Apply the fundamentals of data management and processing to a data science problem
CE5 - Model, design, and implement complex data systems, including data visualization
CE6 - Design the Data Science process and apply scientific methodologies to obtain conclusions about populations and make decisions accordingly, from both structured and unstructured data and potentially stored in heterogeneous formats.
CE9 - Apply appropriate methods for the analysis of non-traditional data formats, such as processes and graphs, within the scope of data science

Objectives

1. Introduce the student to the algorithmic, computational, and statistical problems that arise in the analysis of biological data.
Related competences: CT4, CT5, CG4, CE5, CE6, CE9, CB6, CB7, CB10
2. Reinforce the knowledge of discrete structures, algorithmic techniques, and statistical techniques that the student may have from previous courses.
Related competences: CT5, CE1, CE2, CE9

Contents

1. Introduction to statistical genetics
Basic terminology, haplotype definition, SNP, STR, descriptive statistics.
2. Hardy-Weinberg equilibrium
Hardy-Weinberg law. Hardy-Weinberg assumptions. Multiple alleles. Statistical tests for Hardy-Weinberg equilibrium: chi-square, exact and likelihood-ratio tests. Graphical representations. Disequilibrium coefficients: the inbreeding coefficient, Weir's D. R-package HardyWeinberg.
3. Linkage disequilibrium and Phase estimation
Definition of linkage disequilibrium (LD). Measures for LD. Estimation of LD by maximum likelihood. Haplotypes. The HapMap project. Graphics for LD. The LD heatmap. Phase ambiguity for double heterozygotes. Phase estimation with the EM algorithm. Estimation of haplotype frequencies. R-package haplo.stats.
4. Population substructure
Definition of population substructure. Population substructure and Hardy-Weinberg equilibrium. Population substructure and LD. Statistical methods for detecting substructure. Multidimensional scaling. Metric and non-metric multidimensional scaling. Euclidean distance matrices. Stress. Graphical representations.
5. Family relationships and allele sharing
Identity by state (IBS) and Identity by descent (IBD). Kinship coefficients. Allele sharing. Detection of family relationships. Graphical representations.
6. Genetic association analysis
Disease-marker association studies. Genetic models: dominant, co-dominant and recessive models. Testing models with chi-square tests. The alleles test and the Cochran-Armitage trend test. Genome-wide association tests.
7. Introduction to Epidemiology
To define epidemiology, understand its core principles, and appreciate its relevance in public health.
8. Measures of Disease Frequency
To understand and calculate various measures used to quantify disease occurrence in populations.
9. Analytical Study Designs and Their Core Measures I
To understand the major analytical study designs and the primary measures of association and effect derived from them.
10. Analytical Study Designs and Their Core Measures II
To understand the major analytical study designs and the primary measures of association and effect derived from them.
11. Bias, Confounding, and Causality
To understand potential threats to validity in epidemiological studies and the criteria for establishing causality.
12. Introduction to Risk Assessment
To define risk assessment, understand its framework, and appreciate its role in public health decision-making.
13. Applications and Future Directions
To review practical applications of epidemiology and risk assessment and discuss emerging challenges.

Activities

Development of syllabus topics

Objectives: 1 2
Contents:

2. Hardy-Weinberg equilibrium
3. Linkage disequilibrium and Phase estimation
4. Population substructure
5. Family relationships and allele sharing
6. Genetic association analysis

Theory

15h

Problems

0h

Laboratory

24h

Guided learning

0h

Autonomous learning

75h

Final exam Epidemiology

Objectives: 1 2
Week: 18 (Outside class hours)

Theory

0h

Problems

0h

Laboratory

3h

Guided learning

0h

Autonomous learning

15h

Final exam Statistical Genetics

Objectives: 1 2
Week: 9 (Outside class hours)

Theory

0h

Problems

0h

Laboratory

3h

Guided learning

0h

Autonomous learning

15h

Teaching methodology

Each class consists of a theoretical session, a lecture in which the professor introduces new concepts or techniques with detailed examples, followed by a practical session in which students work on the examples and exercises proposed in the lecture. On average, two hours a week are dedicated to theory and one hour a week to practice; the professor allocates them according to the subject matter. Students are required to take an active part in class and to submit the exercises at the end of each class.

Evaluation methodology

For the first half (Statistical Genetics), students are evaluated through weekly exercises and a mid-term exam. Every student is required to submit one exercise each week, graded from 0 to 10. The grade for the first half is 30% exercises and 70% mid-term exam, which is also graded from 0 to 10. In the second half (Epidemiology), students are evaluated during class and in a final exam. The final course grade is 50% the Statistical Genetics grade and 50% the Epidemiology grade.
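
As an illustration of the weighting described above, the following minimal sketch computes the two grades. It rests on assumptions the syllabus does not state: the weekly exercises are combined with a plain arithmetic mean, and the Epidemiology grade is passed in as a single number because the split between in-class evaluation and the final exam is not given. The function names are hypothetical.

```python
def statistical_genetics_grade(exercise_grades, midterm):
    """Grade for the first half: 30% exercises, 70% mid-term (0-10 scale).

    Assumption: the weekly exercises are averaged with an arithmetic mean.
    """
    if not exercise_grades:
        raise ValueError("at least one exercise grade is required")
    exercise_mean = sum(exercise_grades) / len(exercise_grades)
    return 0.3 * exercise_mean + 0.7 * midterm


def final_grade(stat_gen, epidemiology):
    """Final course grade: equal weight for the two halves."""
    return 0.5 * stat_gen + 0.5 * epidemiology


if __name__ == "__main__":
    # Hypothetical student: three weekly exercises and a mid-term of 7.
    sg = statistical_genetics_grade([8, 6, 10], midterm=7)
    # Epidemiology grade taken as given (its internal weighting is unspecified).
    total = final_grade(sg, epidemiology=8.0)
    print(f"Statistical Genetics: {sg:.2f}")
    print(f"Final grade: {total:.2f}")
```

Under these assumptions, a strong mid-term matters more than any single weekly exercise, since the mid-term alone carries 35% of the final course grade.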

Bibliography

Basic:

Integer linear programming in computational and systems biology: an entry-level text and course - Gusfield, Dan, Cambridge University Press, 2019. ISBN: 9781108421768
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004172889706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Applied Statistical Genetics with R: For Population-based Association Studies - Foulkes, Andrea S., Springer, 2009. ISBN: 9780387895536
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991003963689706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
The Fundamentals of Modern Statistical Genetics - Laird, Nan M.; Lange, Christoph, Springer, 2011. ISBN: 9781461427759
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991003963669706711&context=L&vid=34CSUC_UPC:VU1&lang=ca

Complementary:

Optimization Approaches for Solving String Selection Problems [Electronic resource] - Pappalardo, Elisa; Pardalos, P. M.; Stracquadanio, Giovanni, Springer, 2013. ISBN: 9781461490531
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=1538891
Genetic Data Analysis II: Methods for Discrete Population Genetic Data - Weir, B. S., Sinauer Associates, 1996. ISBN: 0878939024
http://cataleg.upc.edu/record=b1433568~S1*cat
Statistical Approach to Genetic Epidemiology - Ziegler, Andreas; König, Inke R., Wiley, 2011. ISBN: 9783527633654

Previous capacities

Basic knowledge of algorithms and data structures.
Basic knowledge of statistics.
Basic knowledge of the Python programming language.
Basic knowledge of the R programming language.
