High Performance Computing for Artificial Intelligence


Credits:
Type: Elective
Requirements: This course has no prerequisites, but it does assume prior skills.
Department:
Web: https://torres.ai/HPC4AI-MEI
Mail:

High Performance Computing for Artificial Intelligence (HPC4AI) is a master-level, practice-oriented course focused on understanding how modern AI training workloads actually run on real supercomputing infrastructures.

Rather than treating deep learning frameworks and tools as black boxes, the course adopts a system-oriented perspective. It guides students through the complete execution workflow of AI training, from hardware architecture and system software to job scheduling, parallel execution, performance measurement, and scalability analysis. The emphasis is on execution behavior: how computation, memory, communication, and coordination interact, and how these interactions determine performance, efficiency, and cost.

A central premise of the course is that the nature of engineering work in AI is changing. Modern AI tools can generate training scripts, pipelines, and even distributed execution logic with minimal effort. As a result, writing code is no longer the primary challenge. The real difficulty, and the real value, lies in understanding whether that code scales, where bottlenecks appear, when efficiency is lost, and what trade-offs are being made when more resources are used.

For this reason, the course explicitly allows and acknowledges the use of modern AI tools (such as code assistants, agentic systems, or automated code generators). However, the course is not about code authorship or syntax. It is about developing the ability to reason about performance, scalability, efficiency, and cost when training deep learning models on real HPC systems. Students are expected to understand what is being executed, how it behaves at scale, and why performance changes as observed.

Hands-on experimentation is a core component of the course. Through a sequence of laboratory activities, students train deep learning models using single and multiple GPUs, explore parallel and distributed training strategies, and analyze scalability and performance behavior under realistic conditions. All laboratory work and assessments are evaluated based on the quality of experimental setup, the relevance of performance measurements, the interpretation of results, and the soundness of scalability and cost-benefit reasoning.

The course material is self-contained and based on the official course textbook, which serves as the main reference for both theoretical concepts and practical activities. No prior experience with supercomputers is required, and deep learning concepts are introduced progressively as needed.

Ultimately, HPC4AI is not a course about recipes or fixed solutions. It is a course about developing engineering judgment. As code generation becomes cheaper and more accessible, the ability to measure, reason, and decide becomes essential. This course is designed to develop precisely that ability.

Details specific to the 2026 edition of the course can be found on the course web page:
https://torres.ai/HPC4AI-MEI

Teaching staff

Coordinator: Jordi Torres Viñals

Weekly hours

Theory:
Problems:
Laboratory:
Guided learning:
Autonomous learning: 7.1

Competencies

Technical Competencies of each Specialization

Direction and management

CDG1 - Ability to integrate technologies, applications, services and systems specific to Informatics Engineering, in a generalist capacity, and in broader and multidisciplinary contexts.

Specific

CTE6 - Ability to design and evaluate operating systems and servers, and applications and systems based on distributed computing.
CTE9 - Ability to apply mathematical, statistical and artificial intelligence methods to model, design and develop applications, services, intelligent systems and knowledge-based systems.

Generic Technical Competencies

Generic

CG1 - Ability to plan, calculate and design products, processes and facilities in all areas of informatics engineering.
CG4 - Ability for mathematical modeling, calculation and simulation in technology centers and company engineering centers, particularly in research, development and innovation tasks in all areas related to Informatics Engineering.
CG6 - Ability for general management, technical management and management of research, development and innovation projects in companies and technology centers, in the field of Informatics Engineering.
CG7 - Ability to launch, direct and manage manufacturing processes for computing equipment, guaranteeing the safety of people and property, the final quality of the products and their certification.
CG8 - Ability to apply acquired knowledge and solve problems in new or unfamiliar environments within broader and multidisciplinary contexts, being able to integrate this knowledge.

Transversal Competencies

Appropriate attitude towards work

CTR5 - To be motivated for professional achievement and to face new challenges, and to have a broad view of career possibilities in the field of Informatics Engineering. To be motivated by quality and continuous improvement, and to act with rigor in professional development. Ability to adapt to organizational or technological changes. Ability to work in situations with a lack of information and/or with time and/or resource constraints.

Basic

CB6 - Students should be able to apply acquired knowledge and their problem-solving ability in new or unfamiliar environments within broader (or multidisciplinary) contexts related to their area of study.
CB8 - Students should be able to communicate their conclusions, and the knowledge and ultimate reasons that support them, to specialized and non-specialized audiences clearly and unambiguously.
CB9 - Students should have the learning skills that allow them to continue studying in a way that will be largely self-directed or autonomous.

Objectives

OE1: Foundations of HPC platforms for AI: understand the architecture, main components, and software environment of a modern supercomputing platform geared toward artificial intelligence workloads.
Related competencies: CTE6, CG1, CG6, CG7, CG8
OE2: Practical use of a supercomputer for AI workloads: gain basic autonomy in using a real supercomputer, including access, resource management, and job execution for artificial intelligence applications.
Related competencies: CB6, CTE6, CG1, CG8
OE3: Fundamentals of Deep Learning for HPC users: understand the fundamental principles of Deep Learning needed to train models in supercomputing environments, without requiring advanced prior knowledge.
Related competencies: CTE9, CG4, CG8
OE4: Parallel training of Deep Learning models: understand and apply techniques for the parallel training of Deep Learning models using multiple GPUs on one or several compute nodes (servers).
Related competencies: CB6, CB9, CTE6, CTE9, CG1
OE5: Performance analysis and optimization of AI training: analyze the performance of artificial intelligence model training using metrics such as throughput, speedup, and efficiency, and apply basic optimization techniques.
Related competencies: CTE6, CTE9, CG1, CG4
OE6: Experimental evaluation and communication of results: experimentally evaluate the results obtained in a supercomputing environment and communicate technical conclusions in a clear, structured, and well-argued way.
Related competencies: CB8, CB9, CTR5, CDG1
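
Objective OE4 covers parallel training across multiple GPUs. As a rough illustration, and assuming data parallelism is among the strategies studied (the syllabus does not name specific strategies), the sketch below mimics one synchronous data-parallel step in plain Python. The one-parameter model, the data, and the function names (`gradient`, `split`, `data_parallel_gradient`) are all made up for this example; real training would rely on a deep learning framework and a collective operation such as all-reduce rather than a Python list.

```python
# Toy data-parallel step for a one-parameter linear model y = w * x
# with mean squared error loss. No GPUs involved: each "worker"
# stands in for one GPU. All names and data are hypothetical.

def gradient(w, batch):
    """Mean gradient of (w*x - y)^2 with respect to w over (x, y) pairs."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def split(batch, num_workers):
    """Split a batch into equal-sized shards, one per worker."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    size = len(batch) // num_workers
    return [batch[i * size:(i + 1) * size] for i in range(num_workers)]

def data_parallel_gradient(w, batch, num_workers):
    """Each worker computes a local gradient on its shard; averaging the
    local gradients plays the role of the all-reduce step."""
    local_grads = [gradient(w, shard) for shard in split(batch, num_workers)]
    return sum(local_grads) / num_workers

# Made-up data following y = 2x.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.5

g_single = gradient(w, batch)
g_parallel = data_parallel_gradient(w, batch, num_workers=2)
print(g_single, g_parallel)
```

With equal-sized shards, the average of the local gradients matches the full-batch gradient, so adding workers changes how long a step takes but not the update itself.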

Contents

C1: HPC platforms and software ecosystem for AI
Architecture of modern supercomputers, hardware components, operating system, and software stack for artificial intelligence workloads.
C2: Accessing and using a supercomputer for AI workloads
Access to a supercomputer, account management, queuing systems, SLURM, and job execution for Deep Learning applications.
C3: Deep Learning fundamentals for HPC environments
Basic Deep Learning concepts needed to train models in HPC environments, including neural networks, training, and datasets (no prior knowledge is assumed).
C4: Parallel training of Deep Learning models
Parallel training of Deep Learning models using multiple GPUs, including parallelism strategies and programming frameworks.
C5: Performance metrics and optimization of AI training
Performance analysis of AI model training using metrics such as throughput, speedup, and efficiency, and basic optimization techniques.
C6: Experimental evaluation and presentation of results
Experimental evaluation of results obtained in an HPC environment and clear communication of conclusions through technical reports and presentations.
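
The metrics named in C5 can be made concrete with a small calculation. The sketch below uses the common definitions (throughput as samples per second, speedup as baseline time over parallel time, efficiency as speedup divided by the number of GPUs). The dataset size and epoch times are invented for illustration and are not measurements from the course or from any real system.

```python
# Scalability metrics for a training run on N GPUs.
# All numbers below are hypothetical, chosen only to show the arithmetic.

def throughput(samples, seconds):
    """Samples processed per second."""
    return samples / seconds

def speedup(t_base, t_n):
    """How many times faster the N-GPU run is than the baseline run."""
    return t_base / t_n

def efficiency(t_base, t_n, n_gpus):
    """Fraction of ideal linear speedup achieved (baseline = 1 GPU)."""
    return speedup(t_base, t_n) / n_gpus

def gpu_hours(seconds, n_gpus):
    """Resource cost of the run, in GPU-hours."""
    return n_gpus * seconds / 3600.0

SAMPLES_PER_EPOCH = 50_000                                # assumed dataset size
epoch_seconds = {1: 1200.0, 2: 640.0, 4: 360.0, 8: 240.0}  # assumed timings

t1 = epoch_seconds[1]
for n, t in epoch_seconds.items():
    print(
        f"{n} GPU(s): "
        f"throughput={throughput(SAMPLES_PER_EPOCH, t):.1f} samples/s, "
        f"speedup={speedup(t1, t):.2f}, "
        f"efficiency={efficiency(t1, t, n):.2f}, "
        f"cost={gpu_hours(t, n):.3f} GPU-hours"
    )
```

In this made-up series the 8-GPU run finishes five times faster than the 1-GPU run but consumes more GPU-hours per epoch, which is the kind of time versus cost trade-off the course asks students to reason about.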

Activities


A1: Course introduction

Objectives: 1
Contents:

1. C1: HPC platforms and software ecosystem for AI

Theory:
Problems:
Laboratory:
Guided learning:
Autonomous learning:

A2: HPC platforms and software stack for AI

Objectives: 1
Contents:

1. C1: HPC platforms and software ecosystem for AI

Theory: 2.5h
Problems:
Laboratory:
Guided learning:
Autonomous learning:

A3: Using a supercomputer for AI workloads

Objectives: 2
Contents:

2. C2: Accessing and using a supercomputer for AI workloads

Theory:
Problems:
Laboratory: 1.5h
Guided learning:
Autonomous learning:

A4: Deep Learning fundamentals for HPC environments

Objectives: 3
Contents:

3. C3: Deep Learning fundamentals for HPC environments

Theory:
Problems:
Laboratory:
Guided learning:
Autonomous learning: 4.5h

A5: Parallel training of Deep Learning models

Objectives: 4
Contents:

4. C4: Parallel training of Deep Learning models

Theory: 2.5h
Problems:
Laboratory:
Guided learning:
Autonomous learning:

A6: Performance analysis and optimization of AI training

Objectives: 4 5
Contents:

4. C4: Parallel training of Deep Learning models
5. C5: Performance metrics and optimization of AI training

Theory: 1.5h
Problems:
Laboratory:
Guided learning:
Autonomous learning:

A7: Distributed training and scalability

Objectives: 4 5
Contents:

4. C4: Parallel training of Deep Learning models
5. C5: Performance metrics and optimization of AI training

Theory:
Problems:
Laboratory:
Guided learning:
Autonomous learning:

A8: Hands-on laboratory work

Objectives: 2 4 5
Contents:

2. C2: Accessing and using a supercomputer for AI workloads
4. C4: Parallel training of Deep Learning models
5. C5: Performance metrics and optimization of AI training

Theory:
Problems:
Laboratory:
Guided learning:
Autonomous learning: 10h

A9: Practical project development

Objectives: 6
Contents:

6. C6: Experimental evaluation and presentation of results

Theory:
Problems:
Laboratory:
Guided learning:
Autonomous learning:

A10: Technical presentations and peer evaluation

Objectives: 6
Contents:

6. C6: Experimental evaluation and presentation of results

Theory:
Problems:
Laboratory:
Guided learning:
Autonomous learning:

A11: Attendance and participation

Objectives: 6
Contents:

6. C6: Experimental evaluation and presentation of results

Theory:
Problems:
Laboratory:
Guided learning:
Autonomous learning: 0.4h

Teaching methodology

The course follows an active learning and continuous assessment approach, combining theoretical lectures, hands-on laboratory work, autonomous learning, and student presentations.

Theoretical sessions are delivered through participatory lectures, where the instructor introduces the fundamental concepts related to high-performance computing platforms, deep learning fundamentals, parallel training strategies, and performance analysis for artificial intelligence workloads. Students are expected to actively participate in discussions during these sessions.

Hands-on activities constitute a central component of the course and are based on a learn-by-doing methodology. These activities focus on practical experimentation using a real supercomputing environment (MareNostrum 5). Part of the hands-on work is carried out during regular class sessions, while the remaining work is completed outside the classroom as autonomous learning. All hands-on activities require the submission of corresponding reports and, in some cases, technical presentations through the institutional learning platform (Racó).

Autonomous learning is mainly based on the detailed study of the course textbook, which constitutes the main reference material for the subject. Students are also required to prepare presentations and technical material related to their practical work.

Student presentations play an important role in the course. Individual students or groups are randomly selected to present their work and results in class. Peer evaluation is incorporated as part of the learning process, encouraging critical analysis and constructive feedback.

Regular attendance and active participation are expected. Students are responsible for all material covered in class, including announcements, assignments, and project guidelines, regardless of attendance. It is the student's responsibility to obtain any missed material.

Evaluation method

The evaluation of this course is based on a continuous assessment system, strongly focused on practical work and active participation.

The final grade is composed of the following elements:

- Attendance and participation: 20%
Regular attendance and active participation in lectures, discussions, and hands-on sessions.
Attendance is mandatory. To qualify for continuous assessment, students must attend at least 80% of the class sessions.

- Hands-on activities (laboratory work): 60%
Evaluation of the practical laboratory activities carried out throughout the course (LAB 0 to LAB 4).
The instructor will assess the submitted work using a rubric that considers correctness, completeness, experimental results, and technical understanding.
Some students or groups will be randomly selected during the course to present and explain their laboratory work (LAB 0 to LAB 2). This mechanism is intended to ensure that all students prepare and understand their work thoroughly.

- Technical presentations and peer evaluation: 20%
During the final session of the course, all students will present either LAB 3 or LAB 4 (assigned randomly).
Presentations will be evaluated by the instructor and through peer evaluation, which will contribute to the final presentation grade.

Attendance on the presentation day is mandatory. Students who do not attend this session will not receive the presentation grade.

Requirements for continuous assessment: To qualify for continuous assessment, students must meet all the following requirements:
- Attendance: at least 80% of the class sessions.
- Hands-on activities: completion of at least 50% of the laboratory work.

Final exam option
- Students who do not meet the requirements for continuous assessment will have the option to take a final exam.
- This exam will evaluate the entire course content, including theoretical concepts, practical knowledge, and autonomous learning material based on the course book and laboratory activities.
- The final exam will be announced during the course. No documentation (printed or digital) will be allowed during the exam.
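
The continuous assessment rules in this section can be summarized as a short calculation. This is only a sketch of the stated weights and thresholds, not an official grading tool: the 0 to 10 marking scale, the function names, and the choice to count a missed presentation day as a zero for that component are assumptions made for illustration.

```python
# Weights taken from the evaluation method described in this section.
WEIGHTS = {"attendance": 0.20, "labs": 0.60, "presentation": 0.20}

def qualifies_for_continuous_assessment(attendance_rate, labs_completed_rate):
    """Thresholds from the text: at least 80% attendance and at least
    50% of the laboratory work completed. Rates are fractions in [0, 1]."""
    return attendance_rate >= 0.80 and labs_completed_rate >= 0.50

def continuous_assessment_grade(attendance, labs, presentation,
                                attended_presentation_day=True):
    """Weighted grade on an assumed 0 to 10 scale.

    Assumption: a student absent on presentation day gets 0 for the
    presentation component (the text only says the grade is not received).
    """
    if not attended_presentation_day:
        presentation = 0.0
    marks = {"attendance": attendance, "labs": labs, "presentation": presentation}
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)

# Hypothetical student.
if qualifies_for_continuous_assessment(attendance_rate=0.9, labs_completed_rate=1.0):
    grade = continuous_assessment_grade(attendance=9.0, labs=8.0, presentation=7.0)
    print(f"Continuous assessment grade: {grade:.2f}")
else:
    print("Continuous assessment requirements not met: final exam option applies")
```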

Bibliography

Basic:

Supercomputing for Artificial Intelligence: Foundations, Architectures, and Scaling Deep Learning - Torres, Jordi. WATCH THIS SPACE Book Series. Barcelona: Amazon KDP, 2025. ISBN: 979-831932835-9
Slides of the course - Torres, J.

Web links

https://torres.ai/HPC4AI-MEI

Prior skills

Python is the programming language used in the lab sessions of this course. Students are assumed to have a basic knowledge of Python before classes start. Some experience with Linux basics is also necessary.
