Parallelism

Credits: 6
Type: Compulsory
Requirements
  • Prerequisite: AC
Department: AC
The course Parallelism covers the fundamental aspects of parallel programming, an essential tool today for taking advantage of the multi-core architectures found in current computers. It describes the main strategies for task and data decomposition, as well as the mechanisms that ensure their correctness (synchronization, mutual exclusion, ...) and ways to improve their performance.

Teachers

Person in charge

  • Daniel Jimenez Gonzalez

Others

  • Eduard Ayguadé Parra
  • Gladys Miriam Utrera Iglesias
  • Jesus Jose Labarta Mancho
  • Jordi Tubella Murgadas
  • Josep Ramon Herrero Zaragoza
  • Mario Cesar Acosta Cobos
  • Pedro José Martínez Ferrer
  • Rosa Maria Badia Sala

Weekly hours

Theory: 2
Problems: 0
Laboratory: 2
Guided learning: 0.4
Autonomous learning: 5.6

Competences

Technical Competences

Common technical competencies

  • CT1 - To demonstrate knowledge and comprehension of essential facts, concepts, principles and theories related to informatics and its disciplines of reference.
    • CT1.1B - To demonstrate knowledge and comprehension of the fundamentals of computer usage and programming. Knowledge of the structure, operation and interconnection of computer systems, and of the fundamentals of their programming.
  • CT5 - To analyse, design, build and maintain applications in a robust, secure and efficient way, choosing the most adequate paradigm and programming languages.
    • CT5.1 - To choose, combine and exploit different programming paradigms, at the moment of building software, taking into account criteria like ease of development, efficiency, portability and maintainability.
    • CT5.3 - To design, write, test, refine, document and maintain code in a high-level programming language to solve programming problems, applying algorithmic schemas and using data structures.
    • CT5.6 - To demonstrate knowledge and capacity to apply the fundamental principles and basic techniques of parallel, concurrent, distributed and real-time programming.
  • CT6 - To demonstrate knowledge and comprehension about the internal operation of a computer and about the operation of communications between computers.
    • CT6.2 - To demonstrate knowledge, comprehension and capacity to evaluate the structure and architecture of computers, and the basic components that compose them.
  • CT7 - To evaluate and select hardware and software production platforms for executing applications and computer services.
    • CT7.2 - To evaluate hardware/software systems according to given quality criteria.
  • CT8 - To plan, conceive, deploy and manage computer projects, services and systems in every field, to lead their start-up and continuous improvement, and to assess their economic and social impact.
    • CT8.1 - To identify current and emerging technologies and evaluate whether they are applicable to satisfy users' needs.

Transversal Competences

Third language

  • G3 [Assessable] - To know the English language at an adequate oral and written level, according to the needs of graduates in Informatics Engineering. Capacity to work in a multidisciplinary group and in a multi-language environment and to communicate, orally and in writing, knowledge, procedures, results and ideas related to the technical informatics engineer profession.
    • G3.2 - To study using resources written in English. To write a report or a technical document in English. To participate in a technical meeting in English.

Objectives

  1. Ability to formulate, given a parallelization strategy for an application, simple performance models that allow estimating the influence of major architectural aspects: number of processing elements, data access cost and cost of interaction between processing elements, among others.
    Related competences: CT7.2
  2. Ability to measure, using instrumentation, visualization and analysis tools, the performance achieved by the implementation of a parallel application and to detect factors that limit this performance: task granularity, load balance and interaction between tasks, among others.
    Related competences: CT7.2
  3. Ability to compile and execute a parallel program, using the essential command-line tools to measure the execution time.
    Related competences: CT7.2, CT5.3
  4. Ability to apply simple optimizations to parallel kernels to improve their performance on parallel architectures, addressing the factors that limit performance.
    Related competences: CT7.2, CT6.2
  5. Ability to choose the most appropriate decomposition strategy to express parallelism in an application (tasks, data).
    Related competences: CT5.1
  6. Ability to apply the basic techniques to synchronize parallel execution, avoiding race conditions and deadlock and enabling the overlap between computation and interaction, among others.
    Related competences: CT5.1
  7. Ability to program in OpenMP the parallel version of a sequential application.
    Related competences: CT5.3, CT5.6
  8. Ability to identify the different types of parallelism that can be exploited in a computer architecture (ILP, TLP, and DLP within a processor, multiprocessor and multicomputer) and describe their principles of operation.
    Related competences: CT8.1, CT6.2, CT1.1B
  9. Ability to understand the basics of coherence and data sharing in shared-memory parallel architectures, both with uniform and non-uniform access to memory.
    Related competences: CT8.1, CT6.2, CT1.1B
  10. Ability to follow the course using the materials provided in English (slides, laboratory and practical sessions), as well as to take the midterm and final exams with the statements written in English.
    Related competences: G3.2
  11. If the foreign language competence is chosen, the ability to write the deliverables associated with laboratory assignments (partially or fully) in English.
    Related competences: G3.2

Contents

  1. Introduction and motivation
    Need for parallelism, parallelism vs. concurrency, possible problems using concurrency: deadlock, livelock, starvation, fairness, data races
  2. Analysis of parallel applications
    Basic metrics: parallelism, execution time, speedup and scalability. Analysis of the impact of the overheads associated with the creation of tasks and their synchronization and data sharing. Tools for the prediction and analysis of parallelism and behavior visualization: Paraver and Tareador
  3. Shared-memory programming: OpenMP
    Parallel regions, threads and tasks. Synchronization mechanisms between tasks and threads. Static/dynamic work distribution, granularity.
  4. Introduction to parallel architectures
    Parallelism within a processor (ILP, DLP and TLP) and across the processors of SMP and ccNUMA shared-memory multiprocessors (cache coherence, memory consistency, synchronization).
  5. Parallel programming principles: task decomposition
    Task decomposition vs. data decomposition. Decomposition into tasks, granularity and dependency analysis. Identification of parallelism patterns: iterative vs. divide and conquer task decompositions. Mechanisms to implement the decomposition into tasks: creation of parallel regions and tasks; mechanisms to guarantee task ordering and data sharing.
  6. Parallel programming principles: data decomposition
    Data decomposition (geometric decomposition vs. recursive structures) for architectures with shared memory. Locality in data access in parallel shared memory architectures. Code generation based on data decomposition. Brief introduction to distributed memory architectures and their programming (specific case: MPI).

Activities

Assimilation of fundamental concepts and tools for modeling and analyzing the behavior of parallel applications

Actively participate in sessions of theory/problems. Study the contents of Units 1 and 2 and perform the proposed exercises. Solve the assignments in the laboratory sessions and understand the results obtained.
Objectives: 1 3 2 10
Contents:
Theory: 6h
Problems: 0h
Laboratory: 6h
Guided learning: 0h
Autonomous learning: 10h

Using OpenMP to express parallelism in shared memory

Actively participate in laboratory sessions. Do the suggested previous work/reading, solve the exercises during the laboratory sessions, analyse the results obtained, draw conclusions from the experiments and prepare the corresponding deliverables.
Objectives: 4 7 10 11
Contents:
Theory: 1h
Problems: 0h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 4h

Assimilation of the fundamental aspects in parallel architectures

Actively participate in sessions of theory/problems. Study the contents of Unit 5 and perform the proposed exercises.
Objectives: 8 10
Contents:
Theory: 6h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 6h

Assimilation of the fundamentals for task decomposition

Actively participate in sessions of theory/problems. Study the contents of Unit 4 and perform the proposed exercises. Apply new knowledge when solving the associated laboratory assignments.
Objectives: 5 6 10
Contents:
Theory: 6h
Problems: 0h
Laboratory: 10h
Guided learning: 0h
Autonomous learning: 20h

Extra question session for the midterm exam

Students can request in advance the problems they want to review, and can also make requests during the session.
Objectives: 8 9 1 3 2 4 10
Contents:
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 2h
Autonomous learning: 0h

Midterm exam


Objectives: 9 1 5 6 7 10
Week: 7
Theory: 2h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 10h

Training session: review of the solutions to the midterm exam problems and general feedback on the errors found

With this training session, students will be able to finish assimilating the concepts from the first half of the semester.
  • Guided learning: A review session covering the solutions to the midterm exam questions. This extra problem-solving session is intended to give general feedback on the midterm exam to all students who wish to attend.
Objectives: 8 9 1 3 2 10
Contents:
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 2h
Autonomous learning: 0h

Assimilation of the fundamentals for data decomposition

Actively participate in sessions of theory/problems. Study the contents of Unit 6 and perform the proposed exercises. Apply new knowledge when solving the associated laboratory assignments.
Objectives: 5 6 10
Contents:
Theory: 6h
Problems: 0h
Laboratory: 10h
Guided learning: 0h
Autonomous learning: 14h

Extra question session for the final exam

Students can request in advance the problems they want to review, and can also make requests during the session.
Objectives: 8 9 1 3 2 4 5 6 7 10
Contents:
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 2h
Autonomous learning: 0h

Final exam (Theory and Laboratory)

The laboratory part is separate from the theory part and is a written exam on paper, related to what the students have worked on during the course.
Objectives: 8 9 1 3 2 4 5 6 7 10
Week: 15 (Outside class hours)
Theory: 3h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 20h

Teaching methodology

The theory classes introduce the knowledge, techniques and concepts that are then put into practice through problems solved in class and in the laboratory, as well as through personal work on a collection of problems. Two-hour laboratory sessions are also held weekly; active participation and performance during these sessions are valued (work during the session, advancing as far as possible to achieve the objectives of each session). The course uses the C programming language and the OpenMP parallel programming model.

Evaluation methodology

The grade for the course (NF) is calculated based on the following components (all assessed out of 10):
- P: the mark of the mid-term exam (includes Units 1 to 3)
- FT: the mark in the theory part of the final exam (Units 1 to 5)
- FL: the mark of the laboratory part of the final exam (25%)

Additionally, the following components are assessed continuously:

- SL: laboratory follow-up reports (10%), which are also used to evaluate the foreign language competence. IMPORTANT: Completing and submitting all laboratory follow-up reports is a necessary condition to pass the course. Only a report with a minimum of content counts as completed and submitted; empty reports, or reports containing only the questions, for example, do not.
- AA: the mark of the online activities on Atenea carried out within the established period


applying the weighting indicated below:
N = 0.65*max(FT, 0.35*P+0.65*FT) + 0.25*FL + 0.10*SL
If N >= 5.0 then NF = MIN(10, N * (1 + AA/100)); otherwise NF = N.

The final laboratory exam will be a written exam (on paper) that will be held on the same day as the final exam.


The foreign language competence will be evaluated from the reports delivered for the laboratory assignments. These reports should be written in English and they will require reading the laboratory assignment description (also in English) as well as the OpenMP specifications. Both the structure of the written document and the ability to transmit the results and conclusions of the work will be used to evaluate the competence. The grade for the competence will be A (excellent), B (good), C (satisfactory), D (fail) or NA (Not evaluated).

Bibliography

Basic:

Complementary:

  • Parallelism - Unit 1: Why Parallel Computing - Ayguadé, E.; Ramon Herrero, J.R.; Jimenez, D.; Utrera, G., Departament d'Arquitectura de Computadors, 2022.
  • Parallelism - Unit 2: Understanding Parallelism - Ayguadé, E.; Ramon Herrero, J.R.; Jimenez, D.; Utrera, G., Departament d'Arquitectura de Computadors, 2022.
  • Parallelism - Unit 3: Introduction to parallel architectures - Ayguadé, E.; Ramon Herrero, J.R.; Jimenez, D.; Utrera, G., Departament d'Arquitectura de Computadors, 2022.
  • Parallelism - Unit 4: Mastering your task decomposition strategies: going some steps further - Ayguadé, E.; Ramon Herrero, J.R.; Jimenez, D.; Utrera, G., Departament d'Arquitectura de Computadors, 2022.
  • Parallelism - Unit 5: Data-aware task decomposition strategies - Ayguadé, E.; Ramon Herrero, J.R.; Jimenez, D.; Utrera, G., Departament d'Arquitectura de Computadors, 2022.
  • Parallelism: Collection of Exercises - Ayguadé, E.; Ramon Herrero, J.R.; Jimenez, D.; Utrera, G., Departament d'Arquitectura de Computadors, 2022.
  • Parallelism: Selection of Exams (with Solutions) - Ayguadé, E.; Ramon Herrero, J.R.; Jimenez, D.; Utrera, G., Departament d'Arquitectura de Computadors, 2022.
  • Parallelism Laboratory Assignments - Ayguadé, E... [et al.], Departament d'Arquitectura de Computadors, 2022.
  • Computer architecture: a quantitative approach - Hennessy, J.L.; Patterson, D.A., Elsevier, Morgan Kaufmann, 2019. ISBN: 9780128119051
    https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004117509706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
  • Parallel computer architecture: a hardware/software approach - Culler, D.E.; Singh, J.P.; Gupta, A., Morgan Kaufmann Publishers, 1999. ISBN: 9781558603431
    https://discovery.upc.edu/discovery/fulldisplay?docid=alma991001862689706711&context=L&vid=34CSUC_UPC:VU1&lang=ca

Previous capacities

The required prior skills are those defined by the prerequisites for the course.