
Applied Programming III

Credits
6
Types
Compulsory
Requirements
This subject has no formal requirements, but it assumes the previous capacities listed below.
Department
CS;UAB
The course presents basic algorithmic techniques that are applied in Bioinformatics problems, with a view on the strengths and limitations of these techniques. It also describes common data elements and formats used to represent biological data. During the course students will acquire the knowledge to deal with programming problems of a biological nature of small and medium complexity, making sensible choices of standard packages or pragmatic algorithmic implementations for specific problems.
At the end of the course, students:
1. will know the basic concepts of programming, algorithmics and information management needed to solve problems of a biological nature through computer programs.
2. will be able to use the main algorithmic schemes and some of their variants that frequently appear in common Bioinformatics problems.
3. will recognize the cases of application of the main methods used in Bioinformatics to access data stored in computers, with special attention to efficient mechanisms for sequence treatment.
4. will know how to integrate access to large biological databases with access to other local information structures and combine them appropriately with the necessary algorithmic concepts.
5. will know how to interface to external tools and use common libraries that extend functionality and improve performance of Python programs.
The programming language used in this course is Python, which will be complemented with the occasional use of tools from the Operating System or external applications.

Teachers

Person in charge

Others

Weekly hours

Theory
2
Problems
2
Laboratory
0
Guided learning
0
Autonomous learning
6

Competences

Knowledge

  • K3 - Identify the mathematical foundations, computational theories, algorithmic schemes and information organization principles applicable to the modeling of biological systems and to the efficient solution of bioinformatics problems through the design of computational tools.
  • K4 - Integrate the concepts offered by the most widely used programming languages in the field of Life Sciences to model and optimize data structures and build efficient algorithms, relating them to each other and to their application cases.
  • K5 - Identify the nature of the biological variables that need to be analyzed, as well as the mathematical models, algorithms, and statistical tests appropriate to develop and evaluate statistical analyses and computational tools.
Skills

  • S2 - Computationally analyze DNA, RNA and protein sequences, including comparative genome analyses, using computation, mathematics and statistics as basic tools of bioinformatics.
  • S7 - Implement programming methods and data analysis based on the development of working hypotheses within the area of study.
  • S8 - Make decisions, and defend them with arguments, in the resolution of problems in the areas of biology, as well as, within the appropriate fields, health sciences, computer sciences and experimental sciences.
Competences

  • C6 - Detect deficiencies in one's own knowledge and overcome them through critical reflection and the choice of the best action to expand this knowledge.
Objectives

    1. Understand how to build a program and use additional tools to solve problems that use bioinformatics data.
      Related competences: K3, K4, S7, C6
    2. Understand the format and semantics of basic data structures used to represent biological data: sequences, genomes, etc.
      Related competences: K3, K5, S2, C6
    3. Understand the most common operations that apply to bioinformatics data files and develop programs to perform them.
      Related competences: K3, K4, K5, S2, S7, C6
    4. Understand basic algorithm principles that are used to solve sequence alignment and pattern matching problems.
      Related competences: K3, K4, K5, S2, S7, C6
    5. Analyze solutions regarding time and memory cost and use programming components to improve performance.
      Related competences: K3, S7, S8, C6

    Contents

    1. Introduction
      Python basics, flow control, functions, lists, dictionaries and structured data.
    2. Advanced data structures and string manipulation.
      Sequences, Strings, and the Genomic Data. Basic manipulation of genomic sequences, kmers and motifs.
    3. Iterators, Comprehensions and Generators
      Iterators and Comprehensions over collections. Principles and operation of Generators.
    4. Pattern searching and Regular Expressions
      Finding patterns. Finding patterns with Regular Expressions. Creating and matching regex objects.
    5. Files and exception management. Common bioinformatics file formats
      Basic operations with files. Exception management. Genomic data and common file formats in Bioinformatics.
    6. Decorators and functional programming basics. map, filter, reduce, lambdas, zip, unzip, logging
      Definition and usage of decorators. Use of special functions: map, filter, reduce, lambdas, zip, unzip, logging
    7. Sequence Alignment: basic definitions and methods
      Alignment algorithms and dynamic programming. Alignment software and alignment statistics.
    8. Python extensions.
      Python modules for OS interaction and folder manipulation, program interfaces (command line arguments), Biopython
    9. Advanced data analysis
      Numpy, Pandas, Matplotlib

    Activities



    Introduction

    Solving problems with Python
    Objectives: 1 2
    Contents:
    Theory
    2h
    Problems
    2h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    6h

    Advanced data structures and string manipulation

    Representation of sequences and genomic data. Common string manipulation actions: indexing, joining, slicing, searching, inserting
    Objectives: 1 2 3
    Contents:
    Theory
    2h
    Problems
    2h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    6h
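As an illustration of the string operations named in this activity, the following is a minimal sketch, not taken from the course material. The function names `count_kmers` and `reverse_complement` and the sample sequence are assumptions chosen for the example.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k (k-mer) in sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Slicing yields one k-mer per start position; Counter tallies them.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase DNA string."""
    complement = str.maketrans("ACGT", "TGCA")
    return sequence.translate(complement)[::-1]


dna = "ATGCGATGA"  # hypothetical sample sequence
kmers = count_kmers(dna, 3)
print(kmers["ATG"])             # 2
print(reverse_complement(dna))  # TCATCGCAT
```

Slicing with a moving start index is the simplest way to enumerate k-mers; for very long sequences a generator-based variant avoids building intermediate lists.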

    Pattern searching and Regular Expressions

    Finding patterns of text without regular expressions. Finding patterns with Regular Expressions. Using regex objects in Python.
    Objectives: 2 3 4
    Contents:
    Theory
    4h
    Problems
    4h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    12h
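To make the contrast between plain string search and regular expressions concrete, here is a small sketch under stated assumptions: the pattern is a simplified open-reading-frame motif (a start codon, any number of codons, then a stop codon), and `ORF_PATTERN` and `find_orfs` are illustrative names rather than course code.

```python
import re

# Compiled regex object: ATG, then as few codons as possible, then a stop codon.
ORF_PATTERN = re.compile(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)")


def find_orfs(sequence):
    """Return (start, end, text) for each non-overlapping match in sequence."""
    return [(m.start(), m.end(), m.group()) for m in ORF_PATTERN.finditer(sequence)]


dna = "CCATGAAATAGGG"  # hypothetical sample sequence

# Without regular expressions: str.find locates one fixed substring.
first_start_codon = dna.find("ATG")

# With regular expressions: one pattern describes a whole family of strings.
hits = find_orfs(dna)
print(first_start_codon)  # 2
print(hits)               # [(2, 11, 'ATGAAATAG')]
```

Note that `finditer` reports non-overlapping matches only, so nested or overlapping candidates would need a different search strategy.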

    Files and exception management.

    Working with files and using different formats that contain common bioinformatics information.
    Objectives: 1 2 3
    Contents:
    Theory
    2h
    Problems
    2h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    6h
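The sketch below shows one way file handling and exception management could fit together when reading FASTA, one of the common sequence formats. It is an assumption-laden illustration: `read_fasta` is a hypothetical helper, the demo file is created on the fly, and real projects would normally rely on an established parser such as the one in Biopython.

```python
import os
import tempfile


def read_fasta(path):
    """Return a dict mapping each record header to its full sequence."""
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                header = line[1:]
                records[header] = []
            elif header is None:
                raise ValueError("sequence data found before any '>' header")
            else:
                records[header].append(line)
    # Join the wrapped sequence lines of every record.
    return {name: "".join(parts) for name, parts in records.items()}


# Write a tiny demo file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1 demo\nATGC\nGGTA\n>seq2\nTTAA\n")
    demo_path = tmp.name

try:
    sequences = read_fasta(demo_path)
except (OSError, ValueError) as error:
    sequences = {}
    print(f"Could not read FASTA file: {error}")
finally:
    os.remove(demo_path)  # clean up whether or not parsing succeeded

print(sequences)  # {'seq1 demo': 'ATGCGGTA', 'seq2': 'TTAA'}
```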

    Decorators and functional programming basics

    Working with some advanced programming concepts that are related to functional programming methods
    Objectives: 1 2 3
    Contents:
    Theory
    2h
    Problems
    2h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    6h
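A compact sketch of the tools listed for this topic follows. The decorator `log_calls` and the function `gc_content` are hypothetical names chosen for the example. Python has no built-in `unzip`; the usual idiom, assumed here to be what the topic refers to, is `zip(*pairs)`.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def log_calls(func):
    """Decorator that logs each call to func together with its result."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        logger.info("%s%r -> %r", func.__name__, args, result)
        return result
    return wrapper


@log_calls
def gc_content(sequence):
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(map(lambda base: base in "GC", sequence)) / len(sequence)


sequences = ["ATGC", "GGCC", "ATAT"]  # hypothetical sample data

gc_rich = list(filter(lambda s: gc_content(s) >= 0.5, sequences))
total_length = functools.reduce(lambda acc, s: acc + len(s), sequences, 0)
pairs = list(zip(sequences, map(gc_content, sequences)))
names, values = zip(*pairs)  # "unzip": split the pairs back into two tuples

print(gc_rich)       # ['ATGC', 'GGCC']
print(total_length)  # 12
print(values)        # (0.5, 1.0, 0.0)
```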

    Sequence Alignment: basic definitions and methods.

    Working with basic algorithms to solve the sequence alignment problem.
    Objectives: 1 2 4 5
    Contents:
    Theory
    4h
    Problems
    4h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    12h
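As a rough illustration of dynamic programming applied to alignment, the sketch below computes a global alignment score in the style of the Needleman-Wunsch algorithm. The scoring values (match 1, mismatch -1, linear gap -1) and the function name are assumptions for the example, not the scheme used in the course.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two characters, or insert a gap.
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    return score[-1][-1]


print(global_alignment_score("ACGT", "ACGT"))  # 4
print(global_alignment_score("ACGT", "AGT"))   # 2
```

The table has `(len(a) + 1) * (len(b) + 1)` cells, so both time and memory grow with the product of the sequence lengths; recovering the alignment itself would additionally require a traceback over the table.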

    Iterators, Comprehensions and Generators

    Using iterators, comprehensions and generators
    Objectives: 1 2 3 5
    Theory
    2h
    Problems
    2h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    6h
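The following sketch illustrates how a generator, a comprehension and a generator expression might be used on sequence data. The generator `codons` and the sample sequence are assumptions made for the example.

```python
def codons(sequence):
    """Yield successive complete codons; a trailing partial codon is dropped."""
    for start in range(0, len(sequence) - 2, 3):
        yield sequence[start:start + 3]


dna = "ATGGCCTAAGG"  # hypothetical sample sequence, 11 bases

# A generator is itself an iterator: values are produced one at a time.
stream = codons(dna)
first = next(stream)

# List comprehension: materialises every codon in memory.
all_codons = [codon for codon in codons(dna)]

# Generator expression: lazy, so any() can stop at the first stop codon.
has_stop = any(codon in {"TAA", "TAG", "TGA"} for codon in codons(dna))

print(first)       # ATG
print(all_codons)  # ['ATG', 'GCC', 'TAA']
print(has_stop)    # True
```

Because generators produce items on demand, the same function works on sequences too long to hold as a list of codons.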

    Advanced data analysis

    Working with common modules that are used for the analysis of large datasets
    Objectives: 1 3 5
    Theory
    6h
    Problems
    6h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    18h

    Python extensions

    Working with common modules that extend the functionality of Python to interact with the outside world or to work with bioinformatics data.
    Objectives: 1 2 3 5
    Theory
    6h
    Problems
    6h
    Laboratory
    0h
    Guided learning
    0h
    Autonomous learning
    18h
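For the command line and folder manipulation part of this topic, a minimal sketch using only the standard library could look as follows. The script purpose, the option name `--suffix` and the helpers `build_parser` and `find_files` are hypothetical. Biopython is left out so that the example runs without third-party packages.

```python
import argparse
from pathlib import Path


def build_parser():
    """Describe the command line interface of a hypothetical listing script."""
    parser = argparse.ArgumentParser(description="List sequence files in a folder.")
    parser.add_argument("folder", type=Path, help="folder to scan")
    parser.add_argument("--suffix", default=".fasta", help="file extension to match")
    return parser


def find_files(folder, suffix):
    """Return the sorted names of regular files in folder that end in suffix."""
    return sorted(
        entry.name
        for entry in folder.iterdir()
        if entry.is_file() and entry.suffix == suffix
    )


# An explicit list stands in for the arguments a user would type in a shell.
args = build_parser().parse_args([".", "--suffix", ".fa"])
print(args.folder, args.suffix)
print(find_files(args.folder, args.suffix))
```

In a real script, calling `parse_args()` with no arguments reads the values from `sys.argv` instead.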

    Teaching methodology

    During theoretical sessions, the professor will present programming concepts, combined with examples and problem solving.
    During problem-solving sessions, students will work on their own, solving problems on a computer system, with supervision and assistance from the professor when needed.

    Evaluation methodology

    There will be two exams: a mid-term exam and a final exam.
    In addition, there will be some graded problem tests taken during problem sessions, announced in advance.
    FinalScore = 0.20*NP + 0.80*max(EF, 0.35*EP+0.65*EF)
    where:
    NP: Problem score, from the short problem tests taken during problem sessions
    EP: Mid-term exam score
    EF: Final exam score
    Students who fail the subject can take the reevaluation exam ER. In this case, the grade of ER replaces the final exam grade EF in the final score formula above.
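The formula can be checked with a short worked example. This is only a sketch: the function name `final_score` and the sample grades are assumptions, and scores are taken to be on the same scale for all components.

```python
def final_score(problems, midterm, final):
    """FinalScore = 0.20*NP + 0.80*max(EF, 0.35*EP + 0.65*EF)."""
    exam_part = max(final, 0.35 * midterm + 0.65 * final)
    return 0.20 * problems + 0.80 * exam_part


# The mid-term counts only when it raises the exam part of the grade.
helped = final_score(problems=7, midterm=8, final=6)   # exam part 6.7
ignored = final_score(problems=7, midterm=4, final=6)  # exam part 6.0

print(round(helped, 2))   # 6.76
print(round(ignored, 2))  # 6.2
```

For a student taking the reevaluation exam, the ER grade would be passed in place of the final exam grade.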

    Bibliography

    Basic

    Complementary

    Previous capacities

    Applied Programming I
    Applied Programming II
    Introduction to Bioinformatics