This course delves deeper into the topics studied in the introductory course on parallelism (PAR). In particular, the course works in three directions: 1) the implementation of a shared-memory programming model such as OpenMP, using the mechanisms provided by a low-level threads library (Pthreads) and the code generated by a compiler (gcc); 2) the design of a distributed-memory cluster from its basic components: multiprocessor/multicore nodes, accelerators, network interfaces and the other components that make up the interconnection network; and 3) the study of a programming model for cluster architectures (MPI).
Teachers
Person in charge
Eduard Ayguadé Parra
Weekly hours
Theory
1
Problems
1
Laboratory
2
Guided learning
0
Autonomous learning
6
Competences
Transversal Competences
Appropriate attitude towards work
G8 [Assessable] - To be motivated to work as a professional and to face new challenges, and to have a broad vision of the career possibilities in the field of informatics engineering. To be motivated for quality and continuous improvement, and to behave rigorously in professional practice. Capacity to adapt to organisational or technological changes. Capacity to work in situations with a shortage of information and/or under time and/or resource restrictions.
G8.3
- To be motivated for professional development, to face new challenges and to pursue continuous improvement. To be able to work in situations with a lack of information.
Technical Competences of each Specialization
Computer engineering specialization
CEC2 - To analyse and evaluate computer architectures including parallel and distributed platforms, and develop and optimize software for these platforms.
CEC2.1
- To analyse, evaluate, select and configure hardware platforms for the development and execution of computer applications and services.
CEC2.2
- To program taking into account the hardware architecture, using assembly language as well as high-level programming languages.
Technical Competences
Common technical competencies
CT8 - To plan, conceive, deploy and manage computer projects, services and systems in every field, to lead the start-up, the continuous improvement and to value the economical and social impact.
CT8.7
- To control project versions and configurations.
Objectives
Ability to write and understand parallel programs that make use of the low-level Pthreads interface.
Related competences:
CEC2.2,
Ability to implement the basic functionalities in a library supporting the execution of parallel applications on a shared-memory architecture.
Related competences:
CT8.7,
CEC2.2,
G8.3,
Ability to understand the main components used to build a multiprocessor architecture, and to design on paper a system that fulfills certain design restrictions.
Related competences:
CEC2.1,
G8.3,
Ability to write simple applications using the MPI programming model, evaluate their performance and identify the critical parts that limit scalability.
Related competences:
CEC2.2,
G8.3,
Ability to assess the quality of a proposed solution to a specific problem.
Related competences:
G8.3,
Ability to autonomously complete or expand knowledge and to perform a specific job even though the statement is incomplete or information relevant to the implementation is missing.
Related competences:
G8.3,
Contents
MPI: parallel programming for distributed-memory architectures
This topic will introduce how to program parallel applications using MPI, a programming model based on message passing for distributed-memory cluster architectures.
Parallel programming using Pthreads
Introduction to the basic functionalities offered by the Pthreads low-level support library.
Implementation of a shared-memory programming model: threads and synchronization, work sharing and tasking model
This topic presents how to design and implement a library supporting the execution of parallel programs in OpenMP, in particular the aspects related to thread management and synchronization, work sharing in the OpenMP worksharing constructs, and the tasking model.
Components and design of a cluster architecture
This topic will introduce the main components in a cluster architecture, with the objective of producing a design under certain performance/power trade-offs and budget constraints.
The theory lessons introduce the knowledge, techniques, and concepts using examples of real code or pseudo-code. These lessons are complemented with problem-solving in the problem sessions. The laboratory sessions put the theoretical contents into practice, and evaluate the behavior and performance of the proposed solutions.
The course assumes that part of the theoretical contents, or laboratory statements, will have to be developed by the student independently.
The course is mainly focused on cluster architectures, using the C programming language, the Pthreads library and the OpenMP and MPI programming models.
Evaluation methodology
The grade for the course is calculated from 3 grades:
- theoretical contents grade;
- laboratory grade;
- autonomy and motivation grade.
The theoretical contents grade (T) is obtained from the marks of the midterm exam (50%) and the final exam (50%). These two exams can be skipped if the student delivers and defends at least 70% of the exercises requested by the professor during the theory classes.
The laboratory grade (L) is obtained from the marks of the laboratory deliverables and monitoring of the laboratory sessions by the professor.
The grade of autonomy and motivation (A) evaluates the ability of students to face situations of lack of information and their motivation to explore additional topics or go beyond what is initially assigned. It is obtained from: 1) the results of those laboratory experiments that require the exploration of extra material and/or performing optional/free parts; and 2) the design on paper of a cluster for HPC.
The final grade is calculated as F = T * 0.4 + L * 0.4 + A * 0.2.