Supercomputers Architecture

Credits

Type

Specialization elective (High Performance Computing)

Requirements

This course has no requirements, but it does expect prior skills

Department

This course introduces the fundamentals of high-performance and parallel computing, designed for scientists and engineers aiming to develop skills in working with supercomputers, the forefront of high-performance computing technology.

In the first part of the course, we will explore the basic building blocks of supercomputers and their system software stacks. We will then turn to traditional parallel and distributed programming models, which are essential for exploiting parallelism and scaling applications on conventional high-performance infrastructures.

In the second part of the course, we will review the hardware and software stack used to manage distributed GPU applications, which have become ubiquitous in high-performance computing installations worldwide over the past decade. These GPU-based systems deliver most of the performance of the largest pre-exascale supercomputers, such as MareNostrum 5.

The third part of the course will focus on understanding how contemporary supercomputing systems have been the true drivers of recent advances in artificial intelligence, with particular emphasis on the scalability of deep learning algorithms using these advanced high-performance computing installations based on GPUs.

Adopting a "learn by doing" approach, the course combines lectures, reading assignments, and hands-on exercises using one of Europe's fastest supercomputers, MareNostrum 5 at the Barcelona Supercomputing Center (BSC-CNS). Assessment will be continuous, ensuring consistent and steady progress, with the aim of equipping students with practical skills to adapt to and anticipate new technologies in the evolving landscape of high-performance computing.

Teachers

Person in charge

Jordi Torres Viñals

Weekly hours

Theory

Problems

Laboratory

Guided learning

Autonomous learning

7.5384

Competences

Technical Competences of each Specialization

High performance computing

CEE4.1 - Ability to analyze, evaluate, and design computers and to propose new techniques for improving their architecture.
CEE4.2 - Ability to analyze, evaluate, design, and optimize software considering the architecture, and to propose new optimization techniques.
CEE4.3 - Ability to analyze, evaluate, design, and manage system software in supercomputing environments.

Generic Technical Competences

Generic

CG1 - Ability to apply the scientific method in the study and analysis of phenomena and systems in any area of Computer Science, as well as in the conception, design, and implementation of innovative and original computing solutions.

Transversal Competences

Teamwork

CTR3 - Ability to work as a member of a team, either as an ordinary member or in a leadership role, in order to help develop projects pragmatically and with a sense of responsibility; to make commitments taking into account the available resources.

Basic

CB6 - That students are able to apply the knowledge acquired and their problem-solving ability in new or unfamiliar environments within broader (or multidisciplinary) contexts related to their area of study.
CB8 - That students are able to communicate their conclusions, and the knowledge and ultimate reasons that support them, to specialized and non-specialized audiences in a clear and unambiguous way.
CB9 - That students possess the learning skills that will allow them to continue studying in a way that will be largely self-directed or autonomous.

Objectives

To enable students to follow on their own the continuous development of supercomputing systems that enable the convergence of advanced analytic algorithms and artificial intelligence.
Related competences: CB6, CB8, CB9, CTR3, CEE4.1, CEE4.2, CEE4.3, CG1

Contents

00. Welcome: Course content and motivation
01. Supercomputing basics
02. Heterogeneous supercomputers
03. Supercomputer management and storage systems
04. Benchmarking supercomputers
05. Data center infrastructures
06. Parallel programming models
07. Parallel performance models
08. Parallel programming languages for heterogeneous platforms
09. Artificial Intelligence is a computing problem
10. Deep Learning essential concepts
11. Using Supercomputers for DL training
12. Accelerate the learning with parallel training on multi-GPUs
13. Accelerate the learning with distributed training on multiple parallel servers
14. How to speed up the training of Transformers-based models

Activities

Activity / Evaluation act

Hours per activity are broken down into Theory, Problems, Laboratory, Guided learning, and Autonomous learning.

00. Welcome
Objectives: 1
Theory: 0.5h

01. Supercomputing basics
Autonomous learning: 3.5h

Exercise 01: Supercomputing impact

02. Heterogeneous supercomputers

Exercise 02: Getting started with storage and management systems

03. Supercomputer management and storage systems

Exercise 03: Exascale computers challenge

04. Benchmarking supercomputers

Exercise 04: Getting started with parallel programming models

05. Data center infrastructures

Exercise 05: Getting started with parallel performance metrics

06. Parallel programming models

Exercise 06: Getting started with parallel performance models

07. Parallel performance models

Exercise 07: Emerging trends in supercomputing

08. Parallel programming languages for heterogeneous platforms

Exercise 08: Getting started with CUDA
Theory: 0.5h

Midterm
Autonomous learning: 10.5h

09. Artificial Intelligence is a Supercomputing problem

Exercise 09: First contact with Deep Learning and Supercomputing

10. Deep Learning essential concepts

Exercise 10: The new edition of the TOP500

11. Using Supercomputers for DL training
Theory: 1.5h

Exercise 11: Using a supercomputer for Deep Learning training

12. Accelerate the learning with parallel training using a multi-GPU parallel server

Exercise 12: Accelerate the learning with parallel training using a multi-GPU parallel server

13. Accelerate the learning with distributed training using multiple parallel servers

Exercise 13: Accelerate the learning with distributed training using multiple parallel servers

14. How to speed up the training of Transformers-based models

Exercise 14: How to speed up the training of Transformers-based models

Final remarks
Theory: 0.5h

Teaching methodology

Class attendance and participation: Regular attendance is expected, and is required to be able to discuss concepts that will be covered during class.

Lab activities: Some exercises will be conducted as hands-on sessions during the course using supercomputing facilities. Students will need their own laptop to access these resources during class. Each hands-on session involves writing a lab report with all the results. There are no separate days for theory and laboratory classes: theoretical and practical activities are interleaved within the same session to facilitate the learning process.

Reading/presentation assignments: Some exercise assignments will consist of reading documentation/papers that expand the concepts introduced during lectures. Some exercises will involve student presentations (randomly chosen).

Assessment: There will be one midterm exam in the middle of the course. Students may use any type of documentation during this exam, including digital documentation on their own laptop.

Evaluation method

The grade for this course can be obtained through continuous assessment, which takes into account the following:

20% Attendance + participation
10% Midterm exam
70% Exercises (+ exercise presentations) and Lab exercises (+ Lab reports)
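
As a rough sketch of how the weights above combine, assuming all three components are marked on the same scale (the symbols A, M, and E are illustrative names and do not appear in the course guide):

```latex
\text{Grade} = 0.20\,A + 0.10\,M + 0.70\,E
% A: attendance and participation mark (assumed symbol)
% M: midterm exam mark (assumed symbol)
% E: exercises and lab mark, including presentations and reports (assumed symbol)
%
% Hypothetical example on an assumed 0-10 scale, with A = 8, M = 6, E = 9:
% 0.20 \cdot 8 + 0.10 \cdot 6 + 0.70 \cdot 9 = 1.6 + 0.6 + 6.3 = 8.5
```

With these weights, the exercises and lab work dominate the result, so a weak midterm shifts the grade far less than weak exercise marks would.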

Students who have not benefited from continuous assessment may take a final course exam. This exam covers the knowledge of the entire course (practical part, theoretical part, and self-learning part). During this exam, students are not allowed to use any documentation, either on paper or digital.

Bibliography

Basic:

Supercomputing for Artificial Intelligence: Foundations, Architectures, and Scaling Deep Learning - Torres, Jordi, WATCH THIS SPACE Book Series - Barcelona. Amazon KDP, 2025. ISBN: 9798319328359
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991005476510706711&context=L&vid=34CSUC_UPC:VU1&lang=ca

Prior skills

Programming in C and basic Linux skills are expected in the course. In addition, prior exposure to parallel programming constructs, the Python language, linear algebra and matrices, or machine learning will be helpful.
