Supercomputers Architecture

Credits
6
Types
Complementary specialization (High Performance Computing)
Requirements
This subject has no requirements

Department
AC
Supercomputers represent the leading edge of high-performance computing technology. This course describes all the elements in the system architecture of a supercomputer, from the shared-memory multiprocessor in the compute node to the interconnection network and the distributed-memory cluster, including the infrastructures that host them. We will also discuss their building blocks and the system software stack, including parallel programming models, since exploiting parallelism is central to achieving greater computational power. We will introduce the continuous development of supercomputing systems, which enables their convergence with the advanced analytics algorithms required in today's world. Here we will pay special attention to Deep Learning algorithms and their execution on GPU platforms.
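
To illustrate the two levels of parallelism just described, here is a minimal sketch of a hybrid MPI+OpenMP program: MPI handles message passing across the distributed-memory nodes of the cluster, while OpenMP threads share memory within a compute node. It is an illustrative example only, not part of the course material:

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    /* Hybrid "hello world": one MPI process per node (distributed
       memory), several OpenMP threads per process (shared memory
       inside the compute node). */
    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        #pragma omp parallel
        printf("MPI rank %d of %d, OpenMP thread %d of %d\n",
               rank, size, omp_get_thread_num(), omp_get_num_threads());

        MPI_Finalize();
        return 0;
    }

Such a program would be compiled with an MPI wrapper (e.g. mpicc -fopenmp hello.c) and launched with mpirun.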

The practical component is the most important part of this subject. The course follows a “learn by doing” method, with a set of hands-on exercises, based on problems, that students must carry out throughout the course. The course will be marked by continuous assessment, which ensures constant and steady work. The method is also based on teamwork and a ‘learn to learn’ approach through reading and presenting papers. This enables students to adapt to and anticipate new technologies that will arise in the coming years. For the labs we will use supercomputing facilities from the Barcelona Supercomputing Center (BSC-CNS). Updated information can be found on this web page: http://www.JordiTorres.Barcelona/SA-MIRI-2017

Teachers

Person in charge

  • Jordi Torres Viñals

Weekly hours

Theory
2
Problems
0
Laboratory
2
Guided learning
0.15
Autonomous learning
4

Competences

Technical Competences of each Specialization

High performance computing

  • CEE4.1 - Capability to analyze, evaluate and design computers and to propose new techniques for improvement in their architecture.
  • CEE4.2 - Capability to analyze, evaluate, design and optimize software considering the architecture and to propose new optimization techniques.
  • CEE4.3 - Capability to analyze, evaluate, design and manage system software in supercomputing environments.

Generic Technical Competences

Generic

  • CG1 - Capability to apply the scientific method to the study and analysis of phenomena and systems in any area of Computer Science, and to the conception, design and implementation of innovative and original solutions.

Transversal Competences

Teamwork

  • CTR3 - Capacity to work as a team member, either as a regular member or performing directive activities, in order to help the development of projects in a pragmatic manner and with a sense of responsibility; capability to take into account the available resources.

Basic

  • CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.
  • CB8 - Capability to communicate their conclusions, and the knowledge and rationale underpinning these, to both skilled and unskilled public in a clear and unambiguous way.
  • CB9 - Possession of the learning skills that enable the students to continue studying in a way that will be mainly self-directed or autonomous.

Objectives

  1. To train students to follow on their own the continuous development of supercomputing systems, which enables the convergence of advanced analytics algorithms and big data technologies, driving new insights from the massive amounts of available data.
    Related competences: CB6, CB8, CB9, CTR3, CEE4.1, CEE4.2, CEE4.3, CG1

Contents

  1. Course content and motivation
  2. Supercomputing Basics
  3. HPC Building Blocks
  4. HPC Software Stack
  5. Parallel Computer Architecture
  6. Parallel Programming Models
  7. Parallel Performance Metrics and Measurements
  8. Benchmarking in Supercomputers
  9. Coprocessors and Programming Models
  10. Powering Artificial Intelligence, Machine Learning and Deep Learning with Supercomputing
  11. Case study: current DL platforms and their software stacks
  12. Towards Exascale Computing
  13. Conclusions and remarks

Activities

Course content and motivation

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2
Objectives: 1

Supercomputing Basics

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

HPC Building Blocks

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

HPC Software Stack

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Parallel Computer Architecture

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Parallel Programming Models

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Parallel Performance Metrics and Measurements

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Benchmarking in Supercomputers

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Coprocessors and Programming Models

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Powering Artificial Intelligence, Machine Learning and Deep Learning with Supercomputing

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Case study: current DL platforms and their software stacks

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Towards Exascale Computing

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

Conclusions and remarks

Theory
2
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
2

1- Supercomputing Building Blocks: MareNostrum visit

Theory
0
Problems
0
Laboratory
2
Guided learning
0.2
Autonomous learning
2

2- Getting Started with Supercomputing

Theory
0
Problems
0
Laboratory
2
Guided learning
0.2
Autonomous learning
2

3- Getting Started with Parallel Programming Models

Theory
0
Problems
0
Laboratory
2
Guided learning
0.1
Autonomous learning
2

4- Getting Started with Parallel Performance Metrics

Theory
0
Problems
0
Laboratory
2
Guided learning
0.2
Autonomous learning
2
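
For reference, the standard metrics behind this hands-on can be summarized as follows, where T_1 is the sequential execution time, T_p the execution time on p processors, and f the serial (non-parallelizable) fraction of the program; the numeric example is illustrative only:

    S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}, \qquad S_{\mathrm{Amdahl}}(p) = \frac{1}{f + (1 - f)/p}

    \text{Example: } f = 0.1,\; p = 16 \;\Rightarrow\; S = \frac{1}{0.1 + 0.9/16} = 6.4, \qquad E = \frac{6.4}{16} = 0.4

Amdahl's law is why even a small serial fraction (here 10%) caps the achievable speedup, no matter how many processors are added.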

5- Getting Started with Parallel Performance Model – I

Theory
0
Problems
0
Laboratory
2
Guided learning
0.2
Autonomous learning
2

6- Getting Started with Parallel Performance Model – II

Theory
0
Problems
0
Laboratory
2
Guided learning
0.1
Autonomous learning
2

7- Getting Started with GPU-based Supercomputing

Theory
0
Problems
0
Laboratory
2
Guided learning
0.1
Autonomous learning
2

8- Getting Started with the CUDA programming model

Theory
0
Problems
0
Laboratory
2
Guided learning
0.2
Autonomous learning
2
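
As a flavour of the kind of code written in this hands-on, the following is a minimal CUDA sketch: a vector-addition kernel and its launch, compiled with nvcc. Names such as vec_add are illustrative, and the use of unified memory (cudaMallocManaged) is one possible choice; explicit cudaMalloc/cudaMemcpy transfers would be the alternative:

    #include <cuda_runtime.h>
    #include <stdio.h>

    /* Each GPU thread adds one element of the two input vectors. */
    __global__ void vec_add(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            c[i] = a[i] + b[i];
    }

    int main(void)
    {
        const int n = 1 << 20;
        float *a, *b, *c;
        cudaMallocManaged(&a, n * sizeof(float));
        cudaMallocManaged(&b, n * sizeof(float));
        cudaMallocManaged(&c, n * sizeof(float));
        for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

        int threads = 256;
        int blocks = (n + threads - 1) / threads;  /* cover all n elements */
        vec_add<<<blocks, threads>>>(a, b, c, n);
        cudaDeviceSynchronize();

        printf("c[0] = %f\n", c[0]);  /* expected: 3.0 */
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }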

9- Getting Started with merging MPI and CUDA in a distributed GPU cluster

Theory
0
Problems
0
Laboratory
2
Guided learning
0.1
Autonomous learning
2
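
A common pattern for combining MPI and CUDA, and a likely starting point for this hands-on, is to map one MPI process to each GPU. A minimal sketch of the device-selection step (assuming ranks are placed consecutively on each node, so that rank % devices gives the local GPU) could look like this:

    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        int rank, devices;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Each MPI rank selects its own GPU on the node. */
        cudaGetDeviceCount(&devices);
        cudaSetDevice(rank % devices);

        /* ... allocate device buffers, launch kernels, and exchange
           partial results between ranks with MPI_Send/MPI_Recv ... */

        MPI_Finalize();
        return 0;
    }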

10- Getting Started with Deep Learning Frameworks

Theory
0
Problems
0
Laboratory
2
Guided learning
0.2
Autonomous learning
2

11- Getting Started with real Deep Learning problems and their solutions

Theory
0
Problems
0
Laboratory
2
Guided learning
0.1
Autonomous learning
2

12- Getting Started with scalability of Deep Learning problems – I

Theory
0
Problems
0
Laboratory
2
Guided learning
0.1
Autonomous learning
2

13- Getting Started with scalability of Deep Learning problems – II

Theory
0
Problems
0
Laboratory
2
Guided learning
0.1
Autonomous learning
2

Teaching methodology

The theoretical part of the course follows slides designed by the teacher and presented during the theory classes. The practical component is the most important part of this subject. The course follows a “learn by doing” method, with a set of hands-on exercises, based on problems, that students must carry out throughout the course. The course will be marked by continuous assessment, which ensures constant and steady work. The method is also based on teamwork and a ‘learn to learn’ approach through reading and presenting papers. This enables students to adapt to and anticipate new technologies that will arise in the coming years.
Course Activities:

Class attendance and participation: Regular and consistent attendance is expected, as is the ability to discuss the concepts covered during class.

Lab activities: Hands-on exercises will be carried out during lab sessions using supercomputing facilities. Each hands-on session involves writing a lab report with all the results, to be delivered one week later.

Homework Assignments: Homework will be assigned weekly. It includes reading documentation that expands on the concepts introduced during lectures and, periodically, reading research papers related to the week's lecture and preparing presentations (with slides).

Assessment: There will be short midterm exams (and possibly pop quizzes) throughout the course, as part of theory class time. Students may take an optional final exam to improve the score of their midterm exams.

Student presentations: Randomly chosen students/groups will present their homework (presentations/projects).

Evaluation methodology

The evaluation of this course will take into account different items:

A) Attendance (minimum 80% required) & participation in class will account for 15% of the grade.
B) Homework, papers reading, paper presentations, will account for 15% of the grade.
C) Exams will account for 15% of the grade.
D) Lab sessions (+ Lab reports) will account for 55% of the grade.

Bibliography

Basic:

  • Slides provided by the teacher, containing all the references (the documentation is updated each year).
  • Understanding Supercomputing, to speed up machine learning algorithms – Jordi Torres. Ed. UPC, Barcelona, 2017.

Previous capacities

Programming in C and Linux basics are expected for this course. Prior exposure to parallel programming constructs, experience with linear algebra/matrices, or knowledge of machine learning will be very helpful.