Supercomputing for Challenging Applications


Credits

Types

Specialization complementary (High Performance Computing)

Requirements

This subject has no formal requirements, but it does assume some previous capacities.

Department

This course covers the exploitation of supercomputing for a variety of scientific and engineering applications. The contents span several application domains both in the numerical and the non-numerical areas.

Teachers

Person in charge

Carlos Alvarez Martinez

Others

Daniel Jimenez Gonzalez
Josep Larriba Pey
Josep Ramon Herrero Zaragoza

Weekly hours

Theory

Problems

Laboratory

Guided learning

0.15

Autonomous learning

7.4

Competences

Technical Competences of each Specialization

Advanced computing

CEE3.3 - Capability to understand the computational requirements of problems from non-informatics disciplines and to make significant contributions in multidisciplinary teams that use computing.

High performance computing

CEE4.2 - Capability to analyze, evaluate, design and optimize software considering the architecture and to propose new optimization techniques.

Specific

CEC3 - Ability to apply innovative solutions and make progress in knowledge that exploits the new paradigms of Informatics, particularly in distributed environments.

Generic Technical Competences

Generic

CG1 - Capability to apply the scientific method to the study and analysis of phenomena and systems in any area of Computer Science, and to the conception, design and implementation of innovative and original solutions.
CG3 - Capacity for mathematical modeling, calculation and experimental design in technology centers and company engineering centers, particularly in research and innovation in all areas of Computer Science.
CG5 - Capability to apply innovative solutions and make progress in the knowledge to exploit the new paradigms of computing, particularly in distributed environments.

Objectives

The student should be able to understand the complexity of different algorithms, identify the computationally intensive parts of a simulation or data processing application, and decide which parts need to be optimized and parallelized.
Related competences: CG1, CG3, CEE3.3
The student must be able to design and implement efficient parallel simulation and data processing algorithms using a parallel programming model.
Related competences: CG1, CEE4.2, CEC3, CG5
The student must be able to evaluate the different tradeoffs (robustness, computational cost, scalability) in order to select a specific algorithm for a simulation or data processing problem.
Related competences: CG1, CEE3.3

Introduction
- Introduction: overview
- The modern scientific method
- Simulation and optimization
- HPC vs. HTC
- Numerical simulations
- Limits of parallelization
- Evolution and limits of HPC systems
Introduction to Numerical Simulations
- From Models to Algorithms
- Discretization and PDEs
- Types of PDEs: Elliptic, Parabolic, Hyperbolic
- From problem to math and solution
- Numerical schemes: Explicit vs. Implicit
- Finite differences and Finite Elements
Direct Solving of Large-Scale Linear Systems of Equations
- Triangular systems and parallelization
- Gauss elimination
- LU factorization
- Partitioning methods
- HPL
Iterative Solving of Large-Scale Linear Systems of Equations
- Direct vs iterative methods
- Jacobi
- Parallelization of iterative methods
- Gauss-Seidel and SOR
- Krylov methods and preconditioning: HPCG
- Software for numerical methods: BLAS, LAPACK, etc.
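As an illustration of why Jacobi is usually the first iterative method considered for parallelization, the following is a minimal sketch in plain Python (not course material). The function name `jacobi`, the dense list-of-lists storage, and the fixed iteration count are assumptions made for the example; a real code would use a convergence test and a library such as BLAS.

```python
def jacobi(A, b, iterations=50):
    """Run a fixed number of Jacobi sweeps for the linear system A x = b.

    A is a dense square matrix stored as a list of rows, b a list.
    Convergence is only guaranteed for suitable matrices, for example
    strictly diagonally dominant ones.
    """
    n = len(b)
    x = [0.0] * n  # initial guess
    for _ in range(iterations):
        x_new = [0.0] * n
        for i in range(n):
            # Each component uses only values from the previous iterate,
            # so the n updates of a sweep are mutually independent and
            # could be distributed across threads or processes.
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x_new[i] = (b[i] - off_diagonal) / A[i][i]
        x = x_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [6.0, 7.0]
    print(jacobi(A, b))  # approaches the exact solution x = [1, 2]
```

Gauss-Seidel differs in that it reuses already updated components within the same sweep, which typically speeds up convergence but introduces dependencies between the updates.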
Numerical systems practical cases
- Parallel programming models
- Parallelism and granularity
- Block parallelism
- Sparse systems
ML problems introduction
- Introduction to the common operations of DNNs
- Most common DNN applications
- Architectures for DNNs
- Optimizations for DNNs
Sequence Alignment
Sequence alignment arranges DNA, RNA, or protein sequences so as to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
Search, scoring and parallelization strategies are key to making this problem tractable at scale.
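To make the scoring idea concrete, here is a minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence). It is an illustration only, not necessarily the algorithm used in the course; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Cell (i, j) depends only on its left, upper and upper-left
            # neighbours, so all cells on one anti-diagonal are independent
            # and can be computed in parallel (wavefront parallelism).
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

The table has (len(a)+1) x (len(b)+1) cells, so both time and memory grow with the product of the sequence lengths, which is what motivates the search heuristics and parallel strategies mentioned above.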
Practical Cases
- DNNs and precision
- Parallelism in DNNs
- Sequence Alignment algorithms
Introduction to the kernel of DBMSs and the execution of queries in such systems
This topic aims to explain the different software layers of a DBMS, how they interact, the complexity each layer embodies, and how their interaction determines the performance of the DBMS.
Big data kernels and graph databases
This session introduces students to the design of Big Data and graph database systems: how they are built and how their performance is influenced by the structure of their software layers.
Benchmarking for Databases
Benchmarking is one of the most important issues in database design and evolution. This part of the course covers the benchmarking efforts being carried out in the USA and Europe for relational and graph databases.
Big Data practical cases
Big Data practical cases

Activities

Activity Evaluation act

Introduction

Follow the lectures, study the materials and practices.

Theory:
- Introduction: overview
- The modern scientific method
- Simulation and optimization
- HPC vs. HTC
- Numerical simulations
- HPC = Algorithms + Architecture + Programming Model
- Limits of parallelization
- Evolution and limits of HPC systems
Autonomous learning: Study the associated contents and work on the assignments.

Objectives: 1
Contents:

1 . Introduction

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Part I: Numerical Applications

Follow the lectures, study the materials and practices.

Theory:
1. Introduction to Numerical Simulations
- From Models to Algorithms
- Discretization and PDEs
- Types of PDEs: Elliptic, Parabolic, Hyperbolic
- From problem to math and solution
- Initial-value and Boundary-value Problems
- Numerical schemes: Explicit vs. Implicit
- Finite differences and Finite Elements
2. Direct Solving of Large-Scale Linear Systems of Equations
- Triangular systems and parallelization
- Gauss elimination
- LU factorization
- Partitioning methods
- HPL
3. Iterative Solving of Large-Scale Linear Systems of Equations
- Direct vs iterative methods
- Jacobi
- Parallelization of iterative methods
- Gauss-Seidel and SOR
- Krylov methods and preconditioning: HPCG
- Software for numerical methods: BLAS, LAPACK, etc.

Objectives: 1 2 3
Contents:

2 . Introduction to Numerical Simulations
3 . Direct Solving of Large-Scale Linear Systems of Equations
4 . Iterative Solving of Large-Scale Linear Systems of Equations

Theory

Problems

Laboratory

Guided learning

Autonomous learning

10h

Numerical Laboratory

Contents:

5 . Numerical systems practical cases

Theory

Problems

Laboratory

Guided learning

Autonomous learning

12h

Deliverable: Assignment on Numerical Applications

Assignment for the Numerical Applications part. To be delivered at Racó.
Objectives: 1 2 3
Week: 6

Theory

Problems

Laboratory

Guided learning

Autonomous learning

12h

Part II: Non-Numerical Applications: ML & Bio-Informatics

Follow the lectures, study the materials and practices.

Theory:
Principles of AI problems
- Introduction to the common operations of DNNs
- Most common DNN applications
- Architectures for DNNs
- Optimizations for DNNs
Sequence Alignment
Autonomous learning: Study the associated contents and work on the assignments.

Objectives: 1 2 3
Contents:

6 . ML problems introduction
7 . Sequence Alignment

Theory

10h

Problems

Laboratory

Guided learning

Autonomous learning

10h

Non numerical applications laboratory

Laboratory: Profiling analysis, dependency analysis, and parallelization of an application using a parallel programming model.

Contents:

8 . Practical Cases

Theory

Problems

Laboratory

Guided learning

Autonomous learning

10h

Deliverable: Non-numerical assignment

Objectives: 1 2 3
Week: 10

Theory

Problems

Laboratory

Guided learning

Autonomous learning

12h

Part III: Big Data problems

Follow the lectures, study the materials and practices.

Theory:
- Introduction to the kernel of DBMSs and the execution of queries in such systems
- Big data kernels and graph databases
- Benchmarking for Databases
Autonomous learning: Study the associated contents and work on the assignments.

Contents:

9 . Introduction to the kernel of DBMSs and the execution of queries in such systems
10 . Big data kernels and graph databases
11 . Benchmarking for Databases

Theory

10h

Problems

Laboratory

Guided learning

Autonomous learning

10h

Big Data Laboratory

Contents:

12 . Big Data practical cases

Theory

Problems

Laboratory

Guided learning

Autonomous learning

10h

Big Data assignment

Objectives: 1 2 3
Week: 15

Theory

Problems

Laboratory

Guided learning

Autonomous learning

12h

Teaching methodology

During the course there will be two types of activities:

a) Activities focused on the acquisition of theoretical knowledge.
b) Activities focused on the acquisition of knowledge through experimentation by implementing and evaluating empirically in the laboratory the mechanisms explained at a theoretical level.

The theoretical activities include participatory lecture classes, which explain the basic contents of the course. The practical activities include laboratory seminars held in class on the students' own laptops, where students implement the mechanisms described in the lectures. The seminars require preparation beforehand (reading the statement and supporting documentation) and, afterwards, writing up the conclusions in a report.

Evaluation methodology

The course will be evaluated with a partial grade for each content part (Numerical, Non-numerical and Big Data). Each part has the same weight in the final grade:

Grade = A1/3 + A2/3 + A3/3

where Ai is the grade from part i (i from 1 to 3).

Each part of the course has a few short partial deliverables (PD) and a final project deliverable (FPD).
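As a small worked example of the grading formula, the sketch below computes the final grade from the three part grades. The function name and the sample grades are hypothetical; how PD and FPD combine into each part grade is not specified here and is therefore not modeled.

```python
def final_grade(a1, a2, a3):
    """Final grade as the equally weighted mean of the three part grades."""
    return (a1 + a2 + a3) / 3


if __name__ == "__main__":
    # Hypothetical part grades: Numerical 6, Non-numerical 7.5, Big Data 9.
    print(final_grade(6.0, 7.5, 9.0))
```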

Bibliography

Basic:

Finite element analysis: from concepts to applications - Burnett, D.S, Addison-Wesley, 1987. ISBN: 0201108062
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991000069979706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Numerical methods for engineers - Chapra, S.C.; Canale, R.P, McGraw-Hill, 2021. ISBN: 9781260571387
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004208719706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Iterative methods for sparse linear systems - Saad, Y, SIAM, 2003. ISBN: 0898715342
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991002605179706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Parallel programming in C with MPI and OpenMP - Quinn, M.J, McGraw-Hill, 2003. ISBN: 0071232656
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991002614899706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
A multigrid tutorial - Briggs, W.L.; Henson, V.E.; McCormick, S.F, Society for Industrial and Applied Mathematics, 2000. ISBN: 0898714621
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991002338889706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
The ten most wanted solutions in protein bioinformatics - Tramontano, A, Chapman and Hall/CRC, 2005. ISBN: 1584884916
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004001559706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Transaction processing: concepts and techniques - Gray, J.; Reuter, A, Morgan Kaufmann, 1993. ISBN: 1558601902
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991000768969706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Analyzing network data in biology and medicine: an interdisciplinary textbook for biological, medical and computational scientists - Pržulj, N, Cambridge University Press, 2019. ISBN: 9781108432238
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991005261349406711&context=L&vid=34CSUC_UPC:VU1&lang=ca

Complementary:

Higher-order numerical methods for transient wave equations - Cohen, G.C, Springer, 2001. ISBN: 354041598X
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991002428189706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Full seismic waveform modelling and inversion - Fichtner, A, Springer, 2011. ISBN: 9783642158063
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991003948259706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
An introduction to multigrid methods - Wesseling, P, Edwards, 2004. ISBN: 1930217080
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004000329706711&context=L&vid=34CSUC_UPC:VU1&lang=ca

Web links

Mesh partitioner software http://glaros.dtc.umn.edu/gkhome/views/metis/
OpenMP http://openmp.org/
MGNet http://www.mgnet.org/
Freely Available Software for Linear Algebra http://www.netlib.org/utk/people/JackDongarra/la-sw.html
MPI library http://www.open-mpi.org/

Previous capacities

Basic understanding of parallel architectures, including shared- and distributed-memory multiprocessor systems.

Working programming skills in at least one parallel programming model.
