Supercomputing for Challenging Applications

Credits
6
Types
Specialization complementary (High Performance Computing)
Requirements
This subject has no requirements

Department
AC
This course covers the exploitation of supercomputing for a variety of scientific and engineering applications. The contents span several application domains both in the numerical and the non-numerical areas.

Teachers

Person in charge

  • Daniel Jimenez Gonzalez

Others

  • Carlos Alvarez Martinez
  • Josep Larriba Pey

Weekly hours

Theory
4
Problems
0
Laboratory
0
Guided learning
0.15
Autonomous learning
7.4

Competences

Technical Competences of each Specialization

Advanced computing

  • CEE3.3 - Capability to understand the computational requirements of problems from non-informatics disciplines and to make significant contributions in multidisciplinary teams that use computing.

High performance computing

  • CEE4.2 - Capability to analyze, evaluate, design and optimize software considering the architecture and to propose new optimization techniques.

Specific

  • CEC3 - Ability to apply innovative solutions and make progress in the knowledge that exploit the new paradigms of Informatics, particularly in distributed environments.

Generic Technical Competences

Generic

  • CG1 - Capability to apply the scientific method to the study and analysis of phenomena and systems in any area of Computer Science, and to the conception, design and implementation of innovative and original solutions.
  • CG3 - Capacity for mathematical modeling, calculation and experimental design in technology centers and company engineering centers, particularly in research and innovation in all areas of Computer Science.
  • CG5 - Capability to apply innovative solutions and make progress in the knowledge to exploit the new paradigms of computing, particularly in distributed environments.

Objectives

  1. The student should be able to understand the complexity of different algorithms, identify the computationally intensive parts of a simulation or data processing, and decide which parts need to be optimized and parallelized.
    Related competences: CG1, CG3, CEE3.3
  2. The student must be able to design and implement efficient parallel simulation and data processing algorithms using a parallel programming model.
    Related competences: CG1, CEE4.2, CG5, CEC3
  3. The student must be able to evaluate the different tradeoffs (robustness, computational cost, scalability) in order to select a specific algorithm for a simulation or data processing problem.
    Related competences: CG1, CEE3.3

Contents

  1. Introduction
    Introduction: Overview
    • Challenges in Science and Engineering
    • HPC = Algorithms + Architecture + Programming Model
    • Numerical and Non-Numerical Applications
    • Parallel Computers
    • Parallel Programming Models
  2. Introduction to Numerical Simulations
    • From Models to Algorithms
    • Discretization and PDEs
    • Finite differences and Finite Elements
    • Types of PDEs: Elliptic, Parabolic, Hyperbolic
    • Initial-value and Boundary-value Problems
    • Numerical schemes: Explicit vs. Implicit
    • Stencils: Common Patterns
    • Sparse Matrices and their Applications
    • Numerical Software: From BLAS and LAPACK to Trilinos and PETSc
  3. Solving Large-Scale Linear Systems of Equations
    • Direct vs Iterative Methods
    • Fundamental Inner kernels and Matrix Formats
    • Preconditioning
    • Multigrid Methods
    • Hierarchy of discretizations
    • V-cycle iterations
    • Partitioning and Reordering
    • Local and global approaches: From RCM and MMD to ParMetis and PT-SCOTCH
  4. Practical Cases
    • Computational Fluid Dynamics (CFD)
    • Wave Propagation
  5. Introduction to the kernel of DBMSs and the execution of queries in such systems
    This topic aims to explain the different software layers of a DBMS, how they interact, the complexities they embody, and how their interaction determines the performance of the DBMS. The sessions will include discussions and the preparation of papers by the students.
  6. Big data kernels and graph databases
    This session introduces students to the design of Big-data and graph database systems: how they are built and how their performance is influenced by the structure of their different software layers. These sessions will be the basis for those students who select this part of the course for the practical assignment.
  7. Benchmarking for Databases
    Benchmarking is one of the most important issues in database design and evolution. This part of the course covers the benchmarking efforts being carried out in the USA and Europe for relational and graph databases. These sessions will be the basis for those students who select this part of the course for the practical assignment.
  8. Sequence Alignment
    This is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
    Search, scoring and parallel strategies are important to overcome this challenge.
  9. Molecular dynamics
    This consists of the computer simulation of the physical movements of atoms and molecules. Cutoff techniques may be important to reduce the computational complexity of this challenge.
  10. Protein-Protein Docking
    This is a method that predicts the preferred orientation of one molecule relative to a second when they are bound to each other to form a stable complex. There are several approaches that may increase the accuracy, but also the complexity, of these methods.
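
As an illustration of the cutoff idea mentioned under Molecular dynamics, the following is a minimal sketch and not course material. It assumes a Lennard-Jones pair potential in reduced units; the function name `lj_energy` and the parameters `cutoff`, `epsilon` and `sigma` are hypothetical choices made for this example.

```python
import itertools


def lj_energy(positions, cutoff, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy, ignoring pairs farther apart than `cutoff`.

    Returns (energy, number_of_pairs_actually_evaluated).
    """
    cutoff_sq = cutoff * cutoff
    energy = 0.0
    evaluated = 0
    for (xi, yi, zi), (xj, yj, zj) in itertools.combinations(positions, 2):
        r_sq = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
        if r_sq > cutoff_sq:
            continue  # distant pair: its contribution is treated as zero
        s6 = (sigma * sigma / r_sq) ** 3  # (sigma / r)^6 without a square root
        energy += 4.0 * epsilon * (s6 * s6 - s6)
        evaluated += 1
    return energy, evaluated


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
    print(lj_energy(atoms, cutoff=2.5))
    print(lj_energy(atoms, cutoff=10.0))
```

This sketch still checks the distance of every pair, so only the potential evaluation is saved. Production codes usually combine the cutoff with cell lists or neighbour lists so that distant pairs are never visited at all.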

Activities

Introduction

Follow the lectures, study the materials and practise.
Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
0
  • Theory: Overview; Challenges in Science and Engineering; HPC = Algorithms + Architecture + Programming Model; Numerical and Non-Numerical Applications; Parallel Computers; Parallel Programming Models.
  • Autonomous learning: Study the associated contents and work on the assignments.
Objectives: 1
Contents:

Laboratory Environment and Brief explanation of the tools to be used

Theory
4
Problems
0
Laboratory
0
Guided learning
2
Autonomous learning
8

Part I: Numerical Applications

Follow the lectures, study the materials and practise.
Theory
16
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
32
  • Theory:
    1. Introduction to Numerical Simulations: From Models to Algorithms; Discretization and PDEs; Finite differences and Finite Elements; Types of PDEs: Elliptic, Parabolic, Hyperbolic; Initial-value and Boundary-value Problems; Numerical schemes: Explicit vs. Implicit; Stencils: Common Patterns; Sparse Matrices and their Applications; Numerical Software: From BLAS and LAPACK to Trilinos and PETSc.
    2. Solving Large-Scale Linear Systems of Equations: Direct vs Iterative Methods; Fundamental Inner kernels and Matrix Formats; Preconditioning; Multigrid Methods; Hierarchy of discretizations; V-cycle iterations; Partitioning and Reordering; Local and global approaches: From RCM and MMD to ParMetis and PT-SCOTCH.
    3. Practical Cases: Computational Fluid Dynamics (CFD); Wave Propagation.
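
To make the topics "Finite differences", "Explicit vs. Implicit" and "Stencils" more concrete, here is a minimal sketch of an explicit scheme. It assumes the 1D heat equation u_t = alpha * u_xx with fixed (Dirichlet) boundary values; the names `heat_step` and `simulate` and the parameter `r` are hypothetical and chosen only for this illustration.

```python
def heat_step(u, r):
    """One explicit (forward Euler) time step on a uniform 1D grid.

    r = alpha * dt / dx**2. The scheme is stable only for r <= 0.5.
    The first and last entries are boundary values and stay fixed.
    """
    new_u = list(u)
    for i in range(1, len(u) - 1):
        # Three-point stencil: each new value depends on its two neighbours.
        new_u[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new_u


def simulate(u0, r, steps):
    """Apply `steps` explicit time steps starting from the profile u0."""
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, r)
    return u


if __name__ == "__main__":
    print(heat_step([0.0, 0.0, 1.0, 0.0, 0.0], 0.25))
```

Every interior point is updated from the previous time level only, so the loop iterations are independent of each other. That independence is what makes explicit stencil codes a natural fit for domain decomposition, whereas an implicit scheme would require solving a sparse linear system at each step.
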
Objectives: 1 2 3
Contents:

Part II-a) Non-Numerical Applications: Big-Data Management

Follow the lectures, study the materials and practise.
Theory
18
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
24
  • Theory:
    1. Introduction to the kernel of DBMSs and the execution of queries in such systems (1 week). This topic aims to explain the different software layers of a DBMS, how they interact, the complexities they embody, and how their interaction determines the performance of the DBMS. The sessions will include discussions and the preparation of papers by the students.
    2. Big data kernels and graph databases (1 week). This session introduces students to the design of Big-data and graph database systems: how they are built and how their performance is influenced by the structure of their different software layers. These sessions will be the basis for those students who select this part of the course for the practical assignment.
    3. Benchmarking for Databases (1 week). Benchmarking is one of the most important issues in database design and evolution. This part of the course covers the benchmarking efforts being carried out in the USA and Europe for relational and graph databases. These sessions will be the basis for those students who select this part of the course for the practical assignment.
Objectives: 1 2 3
Contents:

Part II-b) Non-Numerical Applications: Bio-Informatics

Follow the lectures, study the materials and practise.
Theory
12
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
32
  • Theory: In the Bioinformatics part we will work on three important challenges: Sequence Alignment, Molecular Dynamics and Protein-Protein Docking.
    1) Sequence Alignment (1 week): a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Search, scoring and parallel strategies are important to overcome this challenge.
    2) Molecular dynamics (1 week): a computer simulation of the physical movements of atoms and molecules. Cutoff techniques may be important to reduce the computational complexity of this challenge.
    3) Protein-Protein Docking (1 week): a method that predicts the preferred orientation of one molecule relative to a second when they are bound to each other to form a stable complex. There are several approaches that may increase the accuracy, but also the complexity, of these methods.
  • Autonomous learning: Study the associated contents and work on the assignments.
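
The scoring step of sequence alignment can be sketched with the classic Needleman-Wunsch dynamic programme for global alignment. This is only an illustration under assumed parameters: the function name `global_alignment_score` and the `match`, `mismatch` and `gap` values are hypothetical and not taken from the course.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the DP table.
    """
    # Row 0: aligning the empty prefix of a with prefixes of b costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

Each cell depends only on its left, upper and upper-left neighbours, so all cells on the same anti-diagonal can be computed independently. That wavefront pattern is one common starting point for the parallel strategies mentioned above.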
Objectives: 1 2 3
Contents:

Teaching methodology

During the course there will be two types of activities:

a) Activities focused on the acquisition of theoretical knowledge.
b) Activities focused on the acquisition of knowledge through experimentation by implementing and evaluating empirically in the laboratory the mechanisms explained at a theoretical level.

The theoretical activities include participatory lectures, which explain the basic contents of the course. The practical activities include laboratory seminars in class, using the students' own laptops, where students implement the mechanisms described in the lectures. The seminars require preparation beforehand, by reading the statement and supporting documentation, and a write-up of the conclusions in a report afterwards.

Evaluation methodology

The course will be evaluated with a set of assignments and a final project.

Grade = A1/3 + A2/3 + A3/3

Where

Ai := Assignment i (i from 1 to 3)

Bibliography

Basic:

  • Finite Element Analysis: From concepts to applications - David S. Burnett, Addison-Wesley, 1987. ISBN: 0-201-10806-2
  • Numerical methods for engineers - Chapra, Steven C; Canale, Raymond P, McGraw-Hill, 2006. ISBN: 0071244298
    http://cataleg.upc.edu/record=b1274310~S1*cat
  • Iterative methods for sparse linear systems - Saad, Yousef, SIAM, 2003. ISBN: 0898715342
    http://cataleg.upc.edu/record=b1235160~S1*cat
  • Parallel programming in C with MPI and OpenMP - Quinn, Michael J, McGraw-Hill, 2003. ISBN: 0071232656
    http://cataleg.upc.edu/record=b1236051~S1*cat
  • A Multigrid tutorial - Briggs, William L; Henson, Van Emden; McCormick, S. F, Society for Industrial and Applied Mathematics, cop. 2000. ISBN: 0898714621
    http://cataleg.upc.edu/record=b1200088~S1*cat
  • The Ten Most Wanted Solutions in Protein Bioinformatics - Anna Tramontano, Chapman and Hall/CRC, 2005. ISBN: 978-1-58488-491-0
  • Graph Data Management: Techniques and Applications - Sherif Sakr, Eric Pardede, IGI Global, 2011. ISBN: 9781613500538
  • Transaction processing: concepts and techniques - Gray, Jim; Reuter, Andreas, Morgan Kaufmann, cop. 1992. ISBN: 1558601902
    http://cataleg.upc.edu/record=b1075600~S1*cat

Complementary:

Web links

Previous capacities

Basic understanding of parallel architectures, including shared- and distributed-memory multiprocessor systems.

Working programming skills in at least one parallel programming model.