Parallel Programming and Architectures

Credits: 6
Types: Specialization complementary (Computer Engineering)
Requirements:
  • Prerequisite: PAR
Department: AC
This course delves deeper into the topics introduced in the introductory parallelism course (PAR). It works in three directions: 1) implementing a shared-memory programming model such as OpenMP, using the mechanisms provided by a low-level threads library (Pthreads) and the code generated by a compiler (gcc); 2) designing a distributed-memory cluster from its basic components: multiprocessor/multicore nodes, accelerators, network interfaces and the other components that make up the interconnection network; and 3) studying a programming model for cluster architectures (MPI).

Teachers

Person in charge

  • Eduard Ayguadé Parra

Weekly hours

Theory: 1
Problems: 1
Laboratory: 2
Guided learning: 0
Autonomous learning: 6

Competences

Transversal Competences

Appropriate attitude towards work

  • G8 [Assessable] - To be motivated to be a professional and to face new challenges, and to have a broad vision of the career possibilities in the field of informatics engineering. To be motivated by quality and continuous improvement, and to act rigorously in professional development. To be able to adapt to organizational or technological changes. To be able to work in situations with a shortage of information and/or restrictions on time and/or resources.
    • G8.3 - To be motivated by professional development, new challenges and continuous improvement. To be able to work in situations with a lack of information.

Technical Competences of each Specialization

Computer engineering specialization

  • CEC2 - To analyse and evaluate computer architectures including parallel and distributed platforms, and develop and optimize software for these platforms.
    • CEC2.1 - To analyse, evaluate, select and configure hardware platforms for the development and execution of computer applications and services.
    • CEC2.2 - To program taking into account the hardware architecture, using assembly language as well as high-level programming languages.

Technical Competences

Common technical competencies

  • CT8 - To plan, conceive, deploy and manage computer projects, services and systems in every field, to lead their start-up and continuous improvement, and to assess their economic and social impact.
    • CT8.7 - To control project versions and configurations.

Objectives

  1. Ability to write and understand parallel programs that make use of the low-level Pthreads interface.
    Related competences: CEC2.2
  2. Ability to implement the basic functionalities of a library supporting the execution of parallel applications on a shared-memory architecture.
    Related competences: CT8.7, CEC2.2, G8.3
  3. Ability to understand the main components used to build a multiprocessor architecture, and to design on paper a system that fulfills certain design restrictions.
    Related competences: CEC2.1, G8.3
  4. Ability to write simple applications using the MPI programming model, evaluate their performance and identify the critical parts that limit scalability.
    Related competences: CEC2.2, G8.3
  5. Ability to assess the quality of a proposed solution to a specific problem.
    Related competences: G8.3
  6. Ability to autonomously complete or expand knowledge and to perform a specific job even though the statement is incomplete or information relevant to the implementation is missing.
    Related competences: G8.3

Contents

  1. MPI: parallel programming for distributed-memory architectures
    This topic will introduce how to program parallel applications using MPI, a programming model based on message passing for distributed-memory cluster architectures.
  2. Parallel programming using Pthreads
    Introduction to the basic functionalities offered by the low-level Pthreads support library.
  3. Implementation of a shared-memory programming model: threads and synchronization, work sharing and tasking model
    This topic presents how to design and implement a library supporting the execution of parallel programs in OpenMP, in particular the aspects related to thread management and synchronization, work sharing in the OpenMP worksharing constructs and the tasking model.
  4. Components and design of a cluster architecture
    This topic will introduce the main components of a cluster architecture, with the objective of producing a design that meets certain performance/power trade-offs and a budget.
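
As a rough illustration of the kind of low-level thread management that topics 2 and 3 refer to, the sketch below uses `pthread_create` and `pthread_join` to split an array sum across several threads, much as a shared-memory runtime might do underneath a worksharing construct. It is not taken from the course material: the names `slice_t`, `sum_slice`, `parallel_sum` and `MAX_THREADS` are hypothetical, and the static block partitioning is only one possible choice.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* hypothetical upper bound for this sketch */

/* Work descriptor handed to each thread (hypothetical type). */
typedef struct {
    const int *data;  /* shared, read-only input array */
    size_t begin;     /* first index of this thread's slice */
    size_t end;       /* one past the last index of the slice */
    long partial;     /* partial sum written by the thread */
} slice_t;

/* Thread body: sum the slice described by arg. */
static void *sum_slice(void *arg) {
    slice_t *s = arg;
    long acc = 0;
    for (size_t i = s->begin; i < s->end; i++)
        acc += s->data[i];
    s->partial = acc;  /* private slot, so no lock is needed */
    return NULL;
}

/* Fork nthreads threads, give each a contiguous block of the array,
 * then join them and combine the partial sums.
 * Returns 0 on success and -1 on failure. */
int parallel_sum(const int *data, size_t n, int nthreads, long *result) {
    assert(data != NULL && result != NULL);
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    pthread_t tid[MAX_THREADS];
    slice_t slice[MAX_THREADS];
    int created = 0;
    int status = 0;

    for (int t = 0; t < nthreads; t++) {
        slice[t].data = data;
        slice[t].begin = n * (size_t)t / (size_t)nthreads;
        slice[t].end = n * (size_t)(t + 1) / (size_t)nthreads;
        slice[t].partial = 0;
        if (pthread_create(&tid[t], NULL, sum_slice, &slice[t]) != 0) {
            status = -1;  /* stop forking, but still join what exists */
            break;
        }
        created++;
    }

    long total = 0;
    for (int t = 0; t < created; t++) {
        pthread_join(tid[t], NULL);  /* join acts as the barrier */
        total += slice[t].partial;
    }

    if (status == 0)
        *result = total;
    return status;
}
```

Built with something like `gcc -pthread`, a call such as `parallel_sum(data, 1000, 4, &total)` on an array holding the values 1 to 1000 leaves 500500 in `total`. Each thread writes only to its own `partial` field, which is why this sketch needs no mutex; a reduction on a shared variable would.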

Activities

Parallel programming with message passing: MPI

Objectives: 4 5 6
Contents:
Theory: 5h
Problems: 5h
Laboratory: 12h
Guided learning: 0h
Autonomous learning: 36h

POSIX threads (Pthreads)


Objectives: 1
Contents:
Theory: 3h
Problems: 3h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 12h

Implementation of a shared-memory programming model

Objectives: 1 2 5 6
Contents:
Theory: 2h
Problems: 2h
Laboratory: 16h
Guided learning: 0h
Autonomous learning: 30h

Components and design of a cluster architecture

Objectives: 3 6
Contents:
Theory: 3h
Problems: 5h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 12h

Final Exam


Objectives: 1 2 3 4
Week: 15 (Outside class hours)
Theory: 2h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 0h

Teaching methodology

The theory lessons introduce the knowledge, techniques, and concepts using examples of real code or pseudo-code. These lessons are complemented by problem solving in the practical lessons. The laboratory sessions put the theoretical contents into practice and evaluate the behavior and performance of the proposed solutions.

The course assumes that students will work through part of the theoretical contents and of the laboratory statements independently.

The course is mainly focused on cluster architectures, using the C programming language, the Pthreads library and the OpenMP and MPI programming models.

Evaluation methodology

The grade for the course is calculated from 3 grades:
- theoretical contents grade;
- laboratory grade;
- autonomy and motivation grade.

The theoretical contents grade (T) is obtained from the marks in the midterm exam (50%) and the final exam (50%). Students may skip these two exams if they deliver and defend at least 70% of the exercises requested by the professor during the theory classes.

The laboratory grade (L) is obtained from the marks of the laboratory deliverables and from the professor's monitoring of the laboratory sessions.

The autonomy and motivation grade (A) evaluates the students' ability to cope with a lack of information and their motivation to explore additional topics or go beyond what is initially assigned. It is obtained from: 1) the results of those laboratory experiments that require exploring extra material and/or completing optional/free parts; and 2) the design on paper of a cluster for HPC.

The final grade is calculated as F = T * 0.4 + L * 0.4 + A * 0.2.
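
As a worked example of the weighting, assume purely hypothetical marks of T = 7, L = 8 and A = 9 (these values are invented for illustration and do not come from the course):

```latex
\begin{align*}
F &= T \cdot 0.4 + L \cdot 0.4 + A \cdot 0.2 \\
  &= 7 \cdot 0.4 + 8 \cdot 0.4 + 9 \cdot 0.2 \\
  &= 2.8 + 3.2 + 1.8 = 7.8
\end{align*}
```

Theory and laboratory carry equal weight, so a one-point change in A moves the final grade by 0.2, half the effect of a one-point change in T or L.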

Bibliography

Basic:

Complementary:

  • Unit 1: POSIX Threads (Pthreads) programming - Ayguadé, Eduard, Departament d'Arquitectura de Computadors, 2021.
  • Unit 2: Build (on paper) your own cluster architecture - Ayguadé, Eduard, Departament d'Arquitectura de Computadors, 2021.
  • Unit 3: MPI (Message Passing Interface) - Ayguadé, Eduard, Departament d'Arquitectura de Computadors, 2021.
  • Laboratory assignments: Lab 1 - OpenMP parallelisation of Eratosthenes Sieve - Ayguadé, Eduard, Departament d'Arquitectura de Computadors, 2023.
  • Laboratory assignments: Lab 2 - Implementing a minimal OpenMP runtime - Ayguadé, Eduard, Departament d'Arquitectura de Computadors, 2021.
  • Laboratory assignments: Lab 3 - Performance characterization of HPC clusters - Ayguadé, Eduard; Àlvarez, Lluc, Departament d'Arquitectura de Computadors, 2023.
  • Laboratory assignments: Lab 4 - Heat equation using MPI - Ayguadé, Eduard; Àlvarez, Lluc, Departament d'Arquitectura de Computadors, 2023.

Web links

Previous capacities

Defined by the pre-requisites for the course