Parallel Programming and Architectures

Credits
6
Types
Specialization complementary (Computer Engineering)
Requirements
  • Prerequisite: PAR
Department
AC
This course builds on the topics covered in the introductory course on parallelism (PAR). It works in three directions: 1) implementing a shared-memory programming model such as OpenMP, using the mechanisms provided by a low-level threads library (Pthreads) and the code generated by a compiler (gcc); 2) designing a distributed-memory cluster from its basic components: multiprocessor/multicore nodes, accelerators, network interfaces and the other components that make up the interconnection network; and 3) studying a programming model for cluster architectures (MPI).

Teachers

Person in charge

  • Eduard Ayguadé Parra

Weekly hours

Theory
1
Problems
1
Laboratory
2
Guided learning
0.4
Autonomous learning
5.6

Competences

Transversal Competences

Appropriate attitude towards work

  • G8 [Assessable] - To be motivated to act professionally and to face new challenges, and to have a broad vision of the career possibilities in the field of informatics engineering. To be motivated by quality and continuous improvement, and to act rigorously in professional practice. Capacity to adapt to organizational or technological changes. Capacity to work with limited information and/or under time and/or resource constraints.
    • G8.3 - To be motivated by professional development, new challenges and continuous improvement. To be able to work in situations where information is lacking.

Technical Competences of each Specialization

Computer engineering specialization

  • CEC2 - To analyse and evaluate computer architectures including parallel and distributed platforms, and develop and optimize software for these platforms.
    • CEC2.1 - To analyse, evaluate, select and configure hardware platforms for the development and execution of computer applications and services.
    • CEC2.2 - To program taking into account the hardware architecture, using assembly language as well as high-level programming languages.

Technical Competences

Common technical competencies

  • CT8 - To plan, conceive, deploy and manage computer projects, services and systems in every field, to lead their start-up and continuous improvement, and to assess their economic and social impact.
    • CT8.7 - To control project versions and configurations.

Objectives

  1. Ability to write and understand parallel programs that make use of the low-level Pthreads interface.
    Related competences: CEC2.2
  2. Ability to implement the basic functionalities of a library supporting the execution of parallel applications on a shared-memory architecture.
    Related competences: G8.3, CT8.7, CEC2.2
  3. Ability to understand the main components used to build a multiprocessor architecture, and to design on paper a system that fulfills given design constraints.
    Related competences: G8.3, CEC2.1
  4. Ability to write simple applications using the MPI programming model, evaluate their performance and identify the critical parts that limit scalability.
    Related competences: G8.3, CEC2.2
  5. Ability to assess the quality of a proposed solution to a specific problem.
    Related competences: G8.3
  6. Ability to autonomously complete or expand knowledge and to carry out a specific task even when its statement is incomplete or information relevant to the implementation is missing.
    Related competences: G8.3

Contents

  1. Parallel programming using Pthreads
    Introduction to the basic functionalities offered by the Pthreads low-level support library.
  2. Implementation of a shared-memory programming model: threads and synchronization, work sharing and tasking model
    This topic presents how to design and implement a library supporting the execution of parallel programs in OpenMP, in particular the aspects related to thread management and synchronization, the distribution of work in the OpenMP worksharing constructs, and the tasking model.
  3. Components and design of a cluster architecture
    This topic introduces the main components of a cluster architecture, with the objective of producing a design that meets given performance/power trade-offs and a given budget.
  4. MPI: parallel programming for distributed-memory architectures
    This topic introduces how to program parallel applications using MPI, a programming model based on message passing for distributed-memory cluster architectures.
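
As an illustration of topics 1 and 2, the following is a minimal sketch of how a runtime library might distribute the iterations of a summation loop over Pthreads, in the spirit of an OpenMP worksharing loop with a static schedule and a sum reduction. It is not the course's runtime: the names `chunk_t`, `sum_chunk`, `parallel_sum` and the fixed `NUM_THREADS` are hypothetical and chosen only for this example.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4 /* hypothetical team size, fixed for this sketch */

/* Hypothetical work descriptor: one contiguous chunk per thread. */
typedef struct {
    const double *data;
    size_t begin;   /* first index owned by this thread */
    size_t end;     /* one past the last owned index */
    double partial; /* private partial sum, written only by its owner */
} chunk_t;

/* Thread body: sums the elements of the chunk described by arg. */
static void *sum_chunk(void *arg) {
    chunk_t *c = arg;
    double s = 0.0;
    for (size_t i = c->begin; i < c->end; i++)
        s += c->data[i];
    c->partial = s;
    return NULL;
}

/* Splits [0, n) into NUM_THREADS contiguous chunks (static distribution),
 * runs one thread per chunk, then combines the partial sums after joining. */
double parallel_sum(const double *data, size_t n) {
    pthread_t tid[NUM_THREADS];
    chunk_t chunk[NUM_THREADS];
    int started[NUM_THREADS];
    size_t base = n / NUM_THREADS;  /* minimum chunk length */
    size_t extra = n % NUM_THREADS; /* first 'extra' chunks get one more */
    size_t start = 0;

    for (int t = 0; t < NUM_THREADS; t++) {
        size_t len = base + ((size_t)t < extra ? 1 : 0);
        chunk[t].data = data;
        chunk[t].begin = start;
        chunk[t].end = start + len;
        chunk[t].partial = 0.0;
        start += len;
        started[t] = (pthread_create(&tid[t], NULL, sum_chunk, &chunk[t]) == 0);
        if (!started[t])
            sum_chunk(&chunk[t]); /* fallback: run the chunk in the caller */
    }

    double total = 0.0;
    for (int t = 0; t < NUM_THREADS; t++) {
        if (started[t])
            pthread_join(tid[t], NULL); /* join acts as the final barrier */
        total += chunk[t].partial;
    }
    return total;
}
```

Each thread writes only its own `partial` field, so the reduction needs no lock; the joins provide the synchronization before the partial sums are combined. With gcc, a program using this sketch would typically be built with the `-pthread` flag.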

Activities

POSIX threads (Pthreads)


Objectives: 1
Contents:
Theory
3h
Problems
3h
Laboratory
0h
Guided learning
0h
Autonomous learning
6h

Implementation of a shared-memory programming model

Objectives: 1 2 5 6
Contents:
Theory
4h
Problems
4h
Laboratory
16h
Guided learning
0h
Autonomous learning
30h

Components and design of a cluster architecture

Objectives: 3 6
Contents:
Theory
4h
Problems
4h
Laboratory
6h
Guided learning
4h
Autonomous learning
20h

Other parallel programming models: MPI

Objectives: 4 5 6
Contents:
Theory
4h
Problems
4h
Laboratory
8h
Guided learning
0h
Autonomous learning
20h

Final Exam


Objectives: 1 2 3 4
Week: 15 (Outside class hours)
Type: final exam
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
2h
Autonomous learning
8h

Teaching methodology

The theory lessons introduce the knowledge, techniques, and concepts using examples of real code or pseudo-code. These lessons are complemented by problem solving in the practical lessons. The laboratory sessions put the theoretical contents into practice and evaluate the behavior and performance of the proposed solutions.

The course assumes that students will have to work through part of the theoretical contents, or of the laboratory assignments, independently.

The course is mainly focused on cluster architectures, using the C programming language, the Pthreads library and the OpenMP and MPI programming models.

Evaluation methodology

The grade for the course is calculated from 3 grades:
- theoretical contents grade;
- laboratory grade;
- autonomy and motivation grade.

The theoretical contents grade (T) is obtained from the marks of the midterm exam (50%) and the final exam (50%). The laboratory grade (L) is obtained from the marks of the laboratory deliverables and from the professor's monitoring of the laboratory sessions.

The autonomy and motivation grade (A) evaluates the ability of students to cope with missing information and their motivation to explore additional topics or go beyond what is initially assigned. It is obtained from the results of those laboratory experiments that require exploring extra material and/or completing optional or open-ended parts.

The final grade (F) is calculated as F = T * 0.4 + L * 0.4 + A * 0.2.
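
As a worked illustration of this weighting, the sketch below computes T from the two exams and then F from T, L and A. The function names `theory_grade` and `final_grade` and the sample marks are hypothetical; only the weights come from the text above.

```c
/* Theory grade: midterm and final exam weigh 50% each. */
double theory_grade(double midterm, double final_exam) {
    return 0.5 * midterm + 0.5 * final_exam;
}

/* Final grade: F = T * 0.4 + L * 0.4 + A * 0.2. */
double final_grade(double t, double l, double a) {
    return t * 0.4 + l * 0.4 + a * 0.2;
}
```

For example, with hypothetical marks of 6 in the midterm and 8 in the final exam, T = 7; with L = 8 and A = 9, F = 2.8 + 3.2 + 1.8 = 7.8.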

Bibliography

Basic:

Complementary:

  • Unit 1: POSIX Threads (Pthreads) programming - Eduard Ayguadé, Departament d'Arquitectura de Computadors, 2021.
  • Unit 2: Build (on paper) your own cluster architecture - Eduard Ayguadé, Departament d'Arquitectura de Computadors, 2021.
  • Unit 3: MPI (Message Passing Interface) - Eduard Ayguadé, Departament d'Arquitectura de Computadors, 2021.
  • Laboratory assignments: Lab 1 - OpenMP parallelisation of Eratosthenes Sieve - Eduard Ayguadé, Departament d'Arquitectura de Computadors, 2023.
  • Laboratory assignments: Lab 2 - Implementing a minimal OpenMP runtime - Eduard Ayguadé, Departament d'Arquitectura de Computadors, 2021.
  • Laboratory assignments: Lab 3 - Performance characterization of HPC clusters - Eduard Ayguadé and Lluc Àlvarez, Departament d'Arquitectura de Computadors, 2023.
  • Laboratory assignments: Lab 4 - Heat equation using MPI - Eduard Ayguadé and Lluc Àlvarez, Departament d'Arquitectura de Computadors, 2023.

Previous capacities

Defined by the pre-requisites for the course